Engineering Statistics Handbook

2003

1. Exploratory Data Analysis

2. Measurement Process Characterization

3. Production Process Characterization

4. Process Modeling

5. Process Improvement

6. Process or Product Monitoring and Control

7. Product and Process Comparisons

8. Assessing Product Reliability


1. Exploratory Data Analysis

This chapter presents the assumptions, principles, and techniques necessary to gain insight into data via EDA--exploratory data analysis.

1. EDA Introduction
   1. What is EDA?
   2. EDA vs Classical & Bayesian
   3. EDA vs Summary
   4. EDA Goals
   5. The Role of Graphics
   6. An EDA/Graphics Example
   7. General Problem Categories
2. EDA Assumptions
   1. Underlying Assumptions
   2. Importance
   3. Techniques for Testing Assumptions
   4. Interpretation of 4-Plot
   5. Consequences
3. EDA Techniques
   1. Introduction
   2. Analysis Questions
   3. Graphical Techniques: Alphabetical
   4. Graphical Techniques: By Problem Category
   5. Quantitative Techniques
   6. Probability Distributions
4. EDA Case Studies
   1. Introduction
   2. By Problem Category

Detailed Chapter Table of Contents
References
Dataplot Commands for EDA Techniques

1. Exploratory Data Analysis - Detailed Table of Contents [1.]

1. EDA Introduction [1.1.]
   1. What is EDA? [1.1.1.]
   2. How Does Exploratory Data Analysis differ from Classical Data Analysis? [1.1.2.]
      1. Model [1.1.2.1.]
      2. Focus [1.1.2.2.]
      3. Techniques [1.1.2.3.]
      4. Rigor [1.1.2.4.]
      5. Data Treatment [1.1.2.5.]
      6. Assumptions [1.1.2.6.]
   3. How Does Exploratory Data Analysis Differ from Summary Analysis? [1.1.3.]
   4. What are the EDA Goals? [1.1.4.]
   5. The Role of Graphics [1.1.5.]
   6. An EDA/Graphics Example [1.1.6.]
   7. General Problem Categories [1.1.7.]
2. EDA Assumptions [1.2.]
   1. Underlying Assumptions [1.2.1.]
   2. Importance [1.2.2.]
   3. Techniques for Testing Assumptions [1.2.3.]
   4. Interpretation of 4-Plot [1.2.4.]
   5. Consequences [1.2.5.]
      1. Consequences of Non-Randomness [1.2.5.1.]
      2. Consequences of Non-Fixed Location Parameter [1.2.5.2.]


      3. Consequences of Non-Fixed Variation Parameter [1.2.5.3.]
      4. Consequences Related to Distributional Assumptions [1.2.5.4.]
3. EDA Techniques [1.3.]
   1. Introduction [1.3.1.]
   2. Analysis Questions [1.3.2.]
   3. Graphical Techniques: Alphabetic [1.3.3.]
      1. Autocorrelation Plot [1.3.3.1.]
         1. Autocorrelation Plot: Random Data [1.3.3.1.1.]
         2. Autocorrelation Plot: Moderate Autocorrelation [1.3.3.1.2.]
         3. Autocorrelation Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.1.3.]
         4. Autocorrelation Plot: Sinusoidal Model [1.3.3.1.4.]
      2. Bihistogram [1.3.3.2.]
      3. Block Plot [1.3.3.3.]
      4. Bootstrap Plot [1.3.3.4.]
      5. Box-Cox Linearity Plot [1.3.3.5.]
      6. Box-Cox Normality Plot [1.3.3.6.]
      7. Box Plot [1.3.3.7.]
      8. Complex Demodulation Amplitude Plot [1.3.3.8.]
      9. Complex Demodulation Phase Plot [1.3.3.9.]
      10. Contour Plot [1.3.3.10.]
         1. DEX Contour Plot [1.3.3.10.1.]
      11. DEX Scatter Plot [1.3.3.11.]
      12. DEX Mean Plot [1.3.3.12.]
      13. DEX Standard Deviation Plot [1.3.3.13.]
      14. Histogram [1.3.3.14.]
         1. Histogram Interpretation: Normal [1.3.3.14.1.]
         2. Histogram Interpretation: Symmetric, Non-Normal, Short-Tailed [1.3.3.14.2.]
         3. Histogram Interpretation: Symmetric, Non-Normal, Long-Tailed [1.3.3.14.3.]
         4. Histogram Interpretation: Symmetric and Bimodal [1.3.3.14.4.]
         5. Histogram Interpretation: Bimodal Mixture of 2 Normals [1.3.3.14.5.]
         6. Histogram Interpretation: Skewed (Non-Normal) Right [1.3.3.14.6.]
         7. Histogram Interpretation: Skewed (Non-Symmetric) Left [1.3.3.14.7.]
         8. Histogram Interpretation: Symmetric with Outlier [1.3.3.14.8.]
      15. Lag Plot [1.3.3.15.]
         1. Lag Plot: Random Data [1.3.3.15.1.]
         2. Lag Plot: Moderate Autocorrelation [1.3.3.15.2.]
         3. Lag Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.15.3.]
         4. Lag Plot: Sinusoidal Models and Outliers [1.3.3.15.4.]
      16. Linear Correlation Plot [1.3.3.16.]
      17. Linear Intercept Plot [1.3.3.17.]
      18. Linear Slope Plot [1.3.3.18.]
      19. Linear Residual Standard Deviation Plot [1.3.3.19.]
      20. Mean Plot [1.3.3.20.]
      21. Normal Probability Plot [1.3.3.21.]
         1. Normal Probability Plot: Normally Distributed Data [1.3.3.21.1.]
         2. Normal Probability Plot: Data Have Short Tails [1.3.3.21.2.]
         3. Normal Probability Plot: Data Have Long Tails [1.3.3.21.3.]
         4. Normal Probability Plot: Data are Skewed Right [1.3.3.21.4.]
      22. Probability Plot [1.3.3.22.]
      23. Probability Plot Correlation Coefficient Plot [1.3.3.23.]
      24. Quantile-Quantile Plot [1.3.3.24.]
      25. Run-Sequence Plot [1.3.3.25.]
      26. Scatter Plot [1.3.3.26.]
         1. Scatter Plot: No Relationship [1.3.3.26.1.]
         2. Scatter Plot: Strong Linear (positive correlation) Relationship [1.3.3.26.2.]
         3. Scatter Plot: Strong Linear (negative correlation) Relationship [1.3.3.26.3.]
         4. Scatter Plot: Exact Linear (positive correlation) Relationship [1.3.3.26.4.]
         5. Scatter Plot: Quadratic Relationship [1.3.3.26.5.]
         6. Scatter Plot: Exponential Relationship [1.3.3.26.6.]
         7. Scatter Plot: Sinusoidal Relationship (damped) [1.3.3.26.7.]


         8. Scatter Plot: Variation of Y Does Not Depend on X (homoscedastic) [1.3.3.26.8.]
         9. Scatter Plot: Variation of Y Does Depend on X (heteroscedastic) [1.3.3.26.9.]
         10. Scatter Plot: Outlier [1.3.3.26.10.]
         11. Scatterplot Matrix [1.3.3.26.11.]
         12. Conditioning Plot [1.3.3.26.12.]
      27. Spectral Plot [1.3.3.27.]
         1. Spectral Plot: Random Data [1.3.3.27.1.]
         2. Spectral Plot: Strong Autocorrelation and Autoregressive Model [1.3.3.27.2.]
         3. Spectral Plot: Sinusoidal Model [1.3.3.27.3.]
      28. Standard Deviation Plot [1.3.3.28.]
      29. Star Plot [1.3.3.29.]
      30. Weibull Plot [1.3.3.30.]
      31. Youden Plot [1.3.3.31.]
         1. DEX Youden Plot [1.3.3.31.1.]
      32. 4-Plot [1.3.3.32.]
      33. 6-Plot [1.3.3.33.]
   4. Graphical Techniques: By Problem Category [1.3.4.]
   5. Quantitative Techniques [1.3.5.]
      1. Measures of Location [1.3.5.1.]
      2. Confidence Limits for the Mean [1.3.5.2.]
      3. Two-Sample t-Test for Equal Means [1.3.5.3.]
         1. Data Used for Two-Sample t-Test [1.3.5.3.1.]
      4. One-Factor ANOVA [1.3.5.4.]
      5. Multi-factor Analysis of Variance [1.3.5.5.]
      6. Measures of Scale [1.3.5.6.]
      7. Bartlett's Test [1.3.5.7.]
      8. Chi-Square Test for the Standard Deviation [1.3.5.8.]
         1. Data Used for Chi-Square Test for the Standard Deviation [1.3.5.8.1.]
      9. F-Test for Equality of Two Standard Deviations [1.3.5.9.]
      10. Levene Test for Equality of Variances [1.3.5.10.]
      11. Measures of Skewness and Kurtosis [1.3.5.11.]
      12. Autocorrelation [1.3.5.12.]
      13. Runs Test for Detecting Non-randomness [1.3.5.13.]
      14. Anderson-Darling Test [1.3.5.14.]
      15. Chi-Square Goodness-of-Fit Test [1.3.5.15.]
      16. Kolmogorov-Smirnov Goodness-of-Fit Test [1.3.5.16.]
      17. Grubbs' Test for Outliers [1.3.5.17.]
      18. Yates Analysis [1.3.5.18.]
         1. Defining Models and Prediction Equations [1.3.5.18.1.]
         2. Important Factors [1.3.5.18.2.]
   6. Probability Distributions [1.3.6.]
      1. What is a Probability Distribution [1.3.6.1.]
      2. Related Distributions [1.3.6.2.]
      3. Families of Distributions [1.3.6.3.]
      4. Location and Scale Parameters [1.3.6.4.]
      5. Estimating the Parameters of a Distribution [1.3.6.5.]
         1. Method of Moments [1.3.6.5.1.]
         2. Maximum Likelihood [1.3.6.5.2.]
         3. Least Squares [1.3.6.5.3.]
         4. PPCC and Probability Plots [1.3.6.5.4.]
      6. Gallery of Distributions [1.3.6.6.]
         1. Normal Distribution [1.3.6.6.1.]
         2. Uniform Distribution [1.3.6.6.2.]
         3. Cauchy Distribution [1.3.6.6.3.]
         4. t Distribution [1.3.6.6.4.]
         5. F Distribution [1.3.6.6.5.]
         6. Chi-Square Distribution [1.3.6.6.6.]
         7. Exponential Distribution [1.3.6.6.7.]
         8. Weibull Distribution [1.3.6.6.8.]
         9. Lognormal Distribution [1.3.6.6.9.]
         10. Fatigue Life Distribution [1.3.6.6.10.]
         11. Gamma Distribution [1.3.6.6.11.]
         12. Double Exponential Distribution [1.3.6.6.12.]
         13. Power Normal Distribution [1.3.6.6.13.]


         14. Power Lognormal Distribution [1.3.6.6.14.]
         15. Tukey-Lambda Distribution [1.3.6.6.15.]
         16. Extreme Value Type I Distribution [1.3.6.6.16.]
         17. Beta Distribution [1.3.6.6.17.]
         18. Binomial Distribution [1.3.6.6.18.]
         19. Poisson Distribution [1.3.6.6.19.]
      7. Tables for Probability Distributions [1.3.6.7.]
         1. Cumulative Distribution Function of the Standard Normal Distribution [1.3.6.7.1.]
         2. Upper Critical Values of the Student's-t Distribution [1.3.6.7.2.]
         3. Upper Critical Values of the F Distribution [1.3.6.7.3.]
         4. Critical Values of the Chi-Square Distribution [1.3.6.7.4.]
         5. Critical Values of the t* Distribution [1.3.6.7.5.]
         6. Critical Values of the Normal PPCC Distribution [1.3.6.7.6.]
4. EDA Case Studies [1.4.]
   1. Case Studies Introduction [1.4.1.]
   2. Case Studies [1.4.2.]
      1. Normal Random Numbers [1.4.2.1.]
         1. Background and Data [1.4.2.1.1.]
         2. Graphical Output and Interpretation [1.4.2.1.2.]
         3. Quantitative Output and Interpretation [1.4.2.1.3.]
         4. Work This Example Yourself [1.4.2.1.4.]
      2. Uniform Random Numbers [1.4.2.2.]
         1. Background and Data [1.4.2.2.1.]
         2. Graphical Output and Interpretation [1.4.2.2.2.]
         3. Quantitative Output and Interpretation [1.4.2.2.3.]
         4. Work This Example Yourself [1.4.2.2.4.]
      3. Random Walk [1.4.2.3.]
         1. Background and Data [1.4.2.3.1.]
         2. Test Underlying Assumptions [1.4.2.3.2.]
         3. Develop A Better Model [1.4.2.3.3.]
         4. Validate New Model [1.4.2.3.4.]
         5. Work This Example Yourself [1.4.2.3.5.]
      4. Josephson Junction Cryothermometry [1.4.2.4.]
         1. Background and Data [1.4.2.4.1.]
         2. Graphical Output and Interpretation [1.4.2.4.2.]
         3. Quantitative Output and Interpretation [1.4.2.4.3.]
         4. Work This Example Yourself [1.4.2.4.4.]
      5. Beam Deflections [1.4.2.5.]
         1. Background and Data [1.4.2.5.1.]
         2. Test Underlying Assumptions [1.4.2.5.2.]
         3. Develop a Better Model [1.4.2.5.3.]
         4. Validate New Model [1.4.2.5.4.]
         5. Work This Example Yourself [1.4.2.5.5.]
      6. Filter Transmittance [1.4.2.6.]
         1. Background and Data [1.4.2.6.1.]
         2. Graphical Output and Interpretation [1.4.2.6.2.]
         3. Quantitative Output and Interpretation [1.4.2.6.3.]
         4. Work This Example Yourself [1.4.2.6.4.]
      7. Standard Resistor [1.4.2.7.]
         1. Background and Data [1.4.2.7.1.]
         2. Graphical Output and Interpretation [1.4.2.7.2.]
         3. Quantitative Output and Interpretation [1.4.2.7.3.]
         4. Work This Example Yourself [1.4.2.7.4.]
      8. Heat Flow Meter 1 [1.4.2.8.]
         1. Background and Data [1.4.2.8.1.]
         2. Graphical Output and Interpretation [1.4.2.8.2.]
         3. Quantitative Output and Interpretation [1.4.2.8.3.]
         4. Work This Example Yourself [1.4.2.8.4.]
      9. Airplane Glass Failure Time [1.4.2.9.]
         1. Background and Data [1.4.2.9.1.]
         2. Graphical Output and Interpretation [1.4.2.9.2.]
         3. Weibull Analysis [1.4.2.9.3.]
         4. Lognormal Analysis [1.4.2.9.4.]
         5. Gamma Analysis [1.4.2.9.5.]
         6. Power Normal Analysis [1.4.2.9.6.]


         7. Power Lognormal Analysis [1.4.2.9.7.]
         8. Work This Example Yourself [1.4.2.9.8.]
      10. Ceramic Strength [1.4.2.10.]
         1. Background and Data [1.4.2.10.1.]
         2. Analysis of the Response Variable [1.4.2.10.2.]
         3. Analysis of the Batch Effect [1.4.2.10.3.]
         4. Analysis of the Lab Effect [1.4.2.10.4.]
         5. Analysis of Primary Factors [1.4.2.10.5.]
         6. Work This Example Yourself [1.4.2.10.6.]
   3. References For Chapter 1: Exploratory Data Analysis [1.4.3.]


1.1. EDA Introduction

Summary: What is exploratory data analysis? How did it begin? How and where did it originate? How is it differentiated from other data analysis approaches, such as classical and Bayesian? Is EDA the same as statistical graphics? What role does statistical graphics play in EDA? Is statistical graphics identical to EDA?
These and related questions are dealt with in this section, which provides the necessary frame of reference for EDA assumptions, principles, and techniques.

Table of Contents for Section 1:
1. What is EDA?
2. EDA versus Classical and Bayesian
   1. Models
   2. Focus
   3. Techniques
   4. Rigor
   5. Data Treatment
   6. Assumptions
3. EDA vs Summary
4. EDA Goals
5. The Role of Graphics
6. An EDA/Graphics Example
7. General Problem Categories


1.1.1. What is EDA?

Approach: Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis that employs a variety of techniques (mostly graphical) to
1. maximize insight into a data set;
2. uncover underlying structure;
3. extract important variables;
4. detect outliers and anomalies;
5. test underlying assumptions;
6. develop parsimonious models; and
7. determine optimal factor settings.

Focus: The EDA approach is precisely that--an approach--not a set of techniques, but an attitude/philosophy about how a data analysis should be carried out.

Philosophy: EDA is not identical to statistical graphics, although the two terms are used almost interchangeably. Statistical graphics is a collection of techniques--all graphically based and all focusing on one data characterization aspect. EDA encompasses a larger venue; EDA is an approach to data analysis that postpones the usual assumptions about what kind of model the data follow in favor of the more direct approach of allowing the data itself to reveal its underlying structure and model. EDA is not a mere collection of techniques; EDA is a philosophy as to how we dissect a data set; what we look for; how we look; and how we interpret. It is true that EDA heavily uses the collection of techniques that we call "statistical graphics", but it is not identical to statistical graphics per se.

History: The seminal work in EDA is Exploratory Data Analysis, Tukey (1977). Over the years it has benefited from other noteworthy publications such as Data Analysis and Regression, Mosteller and Tukey (1977), Interactive Data Analysis, Hoaglin (1977), and The ABC's of EDA, Velleman and Hoaglin (1981), and has gained a large following as "the" way to analyze a data set.

Techniques: Most EDA techniques are graphical in nature, with a few quantitative techniques. The reason for the heavy reliance on graphics is that by its very nature the main role of EDA is to explore open-mindedly, and graphics gives the analyst unparalleled power to do so, enticing the data to reveal its structural secrets and standing ready to deliver some new, often unsuspected, insight into the data. In combination with the natural pattern-recognition capabilities that we all possess, graphics provides unparalleled power to carry this out.
The particular graphical techniques employed in EDA are often quite simple, consisting of:
1. Plotting the raw data (such as data traces, histograms, bihistograms, probability plots, lag plots, block plots, and Youden plots).
2. Plotting simple statistics such as mean plots, standard deviation plots, box plots, and main effects plots of the raw data.
3. Positioning such plots so as to maximize our natural pattern-recognition abilities, such as using multiple plots per page.


1.1.2. How Does Exploratory Data Analysis differ from Classical Data Analysis?

Data Analysis Approaches: EDA is a data analysis approach. What other data analysis approaches exist, and how does EDA differ from them? Three popular data analysis approaches are:
1. Classical
2. Exploratory (EDA)
3. Bayesian

Paradigms for Analysis Techniques: These three approaches are similar in that they all start with a general science/engineering problem and all yield science/engineering conclusions. The difference is the sequence and focus of the intermediate steps.
For classical analysis, the sequence is
   Problem => Data => Model => Analysis => Conclusions
For EDA, the sequence is
   Problem => Data => Analysis => Model => Conclusions
For Bayesian, the sequence is
   Problem => Data => Model => Prior Distribution => Analysis => Conclusions

Method of Dealing with the Underlying Model Distinguishes the Three Approaches: Thus for classical analysis, the data collection is followed by the imposition of a model (normality, linearity, etc.), and the analysis, estimation, and testing that follow are focused on the parameters of that model. For EDA, the data collection is not followed by a model imposition; rather, it is followed immediately by analysis with a goal of inferring what model would be appropriate. Finally, for a Bayesian analysis, the analyst attempts to incorporate scientific/engineering knowledge/expertise into the analysis by imposing a data-independent distribution on the parameters of the selected model; the analysis thus consists of formally combining both the prior distribution on the parameters and the collected data to jointly make inferences and/or test assumptions about the model parameters.
In the real world, data analysts freely mix elements of all of the above three approaches (and other approaches). The above distinctions were made to emphasize the major differences among the three approaches.

Further Discussion of the Distinction Between the Classical and EDA Approaches: Focusing on EDA versus classical, these two approaches differ as follows:
1. Models
2. Focus
3. Techniques
4. Rigor
5. Data Treatment
6. Assumptions


1.1.2.1. Model

Classical: The classical approach imposes models (both deterministic and probabilistic) on the data. Deterministic models include, for example, regression models and analysis of variance (ANOVA) models. The most common probabilistic model assumes that the errors about the deterministic model are normally distributed--this assumption affects the validity of the ANOVA F tests.

Exploratory: The Exploratory Data Analysis approach does not impose deterministic or probabilistic models on the data. On the contrary, the EDA approach allows the data to suggest admissible models that best fit the data.

1.1.2.2. Focus

Classical: The two approaches differ substantially in focus. For classical analysis, the focus is on the model--estimating parameters of the model and generating predicted values from the model.

Exploratory: For exploratory data analysis, the focus is on the data--its structure, outliers, and models suggested by the data.


1.1.2.3. Techniques

Classical: Classical techniques are generally quantitative in nature. They include ANOVA, t tests, chi-squared tests, and F tests.

Exploratory: EDA techniques are generally graphical. They include scatter plots, character plots, box plots, histograms, bihistograms, probability plots, residual plots, and mean plots.

1.1.2.4. Rigor

Classical: Classical techniques serve as the probabilistic foundation of science and engineering; the most important characteristic of classical techniques is that they are rigorous, formal, and "objective".

Exploratory: EDA techniques do not share in that rigor or formality. EDA techniques make up for that lack of rigor by being very suggestive, indicative, and insightful about what the appropriate model should be. EDA techniques are subjective and depend on interpretation, which may differ from analyst to analyst, although experienced analysts commonly arrive at identical conclusions.
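To make the technique contrast in 1.1.2.3 concrete, the following is a minimal Python sketch (this writer's own illustration, not a Handbook or Dataplot example) that runs the same two-batch comparison both ways: a classical two-sample t-test, and an EDA-style graphical look via box plots and histograms. The batch data are simulated purely for illustration.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(0)
batch1 = rng.normal(loc=10.0, scale=1.0, size=50)   # simulated batch 1 (illustrative only)
batch2 = rng.normal(loc=10.6, scale=1.0, size=50)   # simulated batch 2 (illustrative only)

# Classical, quantitative: one test statistic and one p-value.
t_stat, p_value = stats.ttest_ind(batch1, batch2)
print(f"two-sample t-test: t = {t_stat:.2f}, p = {p_value:.4f}")

# Exploratory, graphical: location, spread, shape, and outliers are all visible.
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.boxplot([batch1, batch2])
ax1.set_xticklabels(["batch 1", "batch 2"])
ax1.set_title("Box plots")
ax2.hist(batch1, bins=12, alpha=0.6, label="batch 1")
ax2.hist(batch2, bins=12, alpha=0.6, label="batch 2")
ax2.set_title("Histograms")
ax2.legend()
plt.tight_layout()
plt.show()

The t-test answers a single question with two numbers; the plots, by being less formal, also reveal spread, shape, and any outliers that the test does not report.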


1.1.2.5. Data Treatment

Classical: Classical estimation techniques have the characteristic of taking all of the data and mapping the data into a few numbers ("estimates"). This is both a virtue and a vice. The virtue is that these few numbers focus on important characteristics (location, variation, etc.) of the population. The vice is that concentrating on these few characteristics can filter out other characteristics (skewness, tail length, autocorrelation, etc.) of the same population. In this sense there is a loss of information due to this "filtering" process.

Exploratory: The EDA approach, on the other hand, often makes use of (and shows) all of the available data. In this sense there is no corresponding loss of information.

1.1.2.6. Assumptions

Classical: The "good news" of the classical approach is that tests based on classical techniques are usually very sensitive--that is, if a true shift in location, say, has occurred, such tests frequently have the power to detect such a shift and to conclude that such a shift is "statistically significant". The "bad news" is that classical tests depend on underlying assumptions (e.g., normality), and hence the validity of the test conclusions becomes dependent on the validity of the underlying assumptions. Worse yet, the exact underlying assumptions may be unknown to the analyst, or if known, untested. Thus the validity of the scientific conclusions becomes intrinsically linked to the validity of the underlying assumptions. In practice, if such assumptions are unknown or untested, the validity of the scientific conclusions becomes suspect.

Exploratory: Many EDA techniques make few or no assumptions--they present and show the data--all of the data--as is, with fewer encumbering assumptions.


1.1.3. How Does Exploratory Data Analysis Differ from Summary Analysis?

Summary: A summary analysis is simply a numeric reduction of a historical data set. It is quite passive. Its focus is in the past. Quite commonly, its purpose is to simply arrive at a few key statistics (for example, mean and standard deviation) which may then either replace the data set or be added to the data set in the form of a summary table.

Exploratory: In contrast, EDA has as its broadest goal the desire to gain insight into the engineering/scientific process behind the data. Whereas summary statistics are passive and historical, EDA is active and futuristic. In an attempt to "understand" the process and improve it in the future, EDA uses the data as a "window" to peer into the heart of the process that generated the data. There is an archival role in the research and manufacturing world for summary statistics, but there is an enormously larger role for the EDA approach.

1.1.4. What are the EDA Goals?

Primary and Secondary Goals: The primary goal of EDA is to maximize the analyst's insight into a data set and into the underlying structure of a data set, while providing all of the specific items that an analyst would want to extract from a data set, such as:
1. a good-fitting, parsimonious model
2. a list of outliers
3. a sense of robustness of conclusions
4. estimates for parameters
5. uncertainties for those estimates
6. a ranked list of important factors
7. conclusions as to whether individual factors are statistically significant
8. optimal settings

Insight into the Data: Insight implies detecting and uncovering underlying structure in the data. Such underlying structure may not be encapsulated in the list of items above; such items serve as the specific targets of an analysis, but the real insight and "feel" for a data set comes as the analyst judiciously probes and explores the various subtleties of the data. The "feel" for the data comes almost exclusively from the application of various graphical techniques, the collection of which serves as the window into the essence of the data. Graphics are irreplaceable--there are no quantitative analogues that will give the same insight as well-chosen graphics.
To get a "feel" for the data, it is not enough for the analyst to know what is in the data; the analyst also must know what is not in the data, and the only way to do that is to draw on our own human pattern-recognition and comparative abilities in the context of a series of judicious graphical techniques applied to the data.


1.1.5. The Role of Graphics

Quantitative/Graphical: Statistics and data analysis procedures can broadly be split into two parts:
● quantitative
● graphical

Quantitative: Quantitative techniques are the set of statistical procedures that yield numeric or tabular output. Examples of quantitative techniques include:
● hypothesis testing
● analysis of variance
● point estimates and confidence intervals
● least squares regression
These and similar techniques are all valuable and are mainstream in terms of classical analysis.

Graphical: On the other hand, there is a large collection of statistical tools that we generally refer to as graphical techniques. These include:
● scatter plots
● histograms
● probability plots
● residual plots
● box plots
● block plots

EDA Approach Relies Heavily on Graphical Techniques: The EDA approach relies heavily on these and similar graphical techniques. Graphical procedures are not just tools that we could use in an EDA context, they are tools that we must use. Such graphical tools are the shortest path to gaining insight into a data set in terms of
● testing assumptions
● model selection
● model validation
● estimator selection
● relationship identification
● factor effect determination
● outlier detection
If one is not using statistical graphics, then one is forfeiting insight into one or more aspects of the underlying structure of the data.
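As a concrete (and purely illustrative, non-Handbook) sketch of the two families of techniques listed above, the following Python fragment pairs one quantitative technique, a least squares straight-line fit, with four of the graphical techniques named in this section. The data are simulated.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = np.linspace(0, 10, 60)
y = 2.0 + 0.5 * x + rng.normal(scale=0.8, size=x.size)   # simulated straight-line data

# Quantitative: least squares regression, reported as two numbers.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)
print(f"fit: y = {intercept:.2f} + {slope:.2f} x")

# Graphical: four of the plot types listed in this section.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].plot(x, y, "o")
axes[0, 0].set_title("Scatter plot")
axes[0, 1].plot(x, residuals, "o")
axes[0, 1].axhline(0.0)
axes[0, 1].set_title("Residual plot")
axes[1, 0].hist(residuals, bins=12)
axes[1, 0].set_title("Histogram of residuals")
axes[1, 1].boxplot(residuals)
axes[1, 1].set_title("Box plot of residuals")
plt.tight_layout()
plt.show()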


1.1.6. An EDA/Graphics Example

Anscombe Example: A simple, classic (Anscombe) example of the central role that graphics play in terms of providing insight into a data set starts with the following data set:

Data:
       X       Y
   10.00    8.04
    8.00    6.95
   13.00    7.58
    9.00    8.81
   11.00    8.33
   14.00    9.96
    6.00    7.24
    4.00    4.26
   12.00   10.84
    7.00    4.82
    5.00    5.68

Summary Statistics: If the goal of the analysis is to compute summary statistics plus determine the best linear fit for Y as a function of X, the results might be given as:
   N = 11
   Mean of X = 9.0
   Mean of Y = 7.5
   Intercept = 3
   Slope = 0.5
   Residual standard deviation = 1.237
   Correlation = 0.816
The above quantitative analysis, although valuable, gives us only limited insight into the data.

Scatter Plot: In contrast, the simple scatter plot of the data
[scatter plot of Y versus X for data set 1]
suggests the following:
1. The data set "behaves like" a linear curve with some scatter;
2. there is no justification for a more complicated model (e.g., quadratic);
3. there are no outliers;
4. the vertical spread of the data appears to be of equal height irrespective of the X-value; this indicates that the data are equally-precise throughout and so a "regular" (that is, equi-weighted) fit is appropriate.

Three Additional Data Sets: This kind of characterization for the data serves as the core for getting insight/feel for the data. Such insight/feel does not come from the quantitative statistics; on the contrary, calculations of quantitative statistics such as intercept and slope should be subsequent to the characterization and will make sense only if the characterization is true. To illustrate the loss of information that results when the graphics insight step is skipped, consider the following three data sets [Anscombe data sets 2, 3, and 4]:

      X2      Y2      X3      Y3      X4      Y4
   10.00    9.14   10.00    7.46    8.00    6.58
    8.00    8.14    8.00    6.77    8.00    5.76
   13.00    8.74   13.00   12.74    8.00    7.71
    9.00    8.77    9.00    7.11    8.00    8.84
   11.00    9.26   11.00    7.81    8.00    8.47
   14.00    8.10   14.00    8.84    8.00    7.04
    6.00    6.13    6.00    6.08    8.00    5.25
    4.00    3.10    4.00    5.39   19.00   12.50
   12.00    9.13   12.00    8.15    8.00    5.56
    7.00    7.26    7.00    6.42    8.00    7.91
    5.00    4.74    5.00    5.73    8.00    6.89

Quantitative Statistics for Data Set 2: A quantitative analysis on data set 2 yields
   N = 11
   Mean of X = 9.0
   Mean of Y = 7.5
   Intercept = 3
   Slope = 0.5
   Residual standard deviation = 1.237
   Correlation = 0.816
which is identical to the analysis for data set 1. One might naively assume that the two data sets are "equivalent" since that is what the statistics tell us; but what do the statistics not tell us?

Quantitative Statistics for Data Sets 3 and 4: Remarkably, a quantitative analysis on data sets 3 and 4 also yields
   N = 11
   Mean of X = 9.0
   Mean of Y = 7.5
   Intercept = 3
   Slope = 0.5
   Residual standard deviation = 1.236
   Correlation = 0.816 (0.817 for data set 4)
which implies that in some quantitative sense, all four of the data sets are "equivalent". In fact, the four data sets are far from "equivalent", and a scatter plot of each data set, which would be step 1 of any EDA approach, would tell us that immediately.

Scatter Plots:
[scatter plots of the four Anscombe data sets]

Interpretation of Scatter Plots: Conclusions from the scatter plots are:
1. data set 1 is clearly linear with some scatter.
2. data set 2 is clearly quadratic.
3. data set 3 clearly has an outlier.
4. data set 4 is obviously the victim of a poor experimental design with a single point far removed from the bulk of the data "wagging the dog".

Importance of Exploratory Analysis: These points are exactly the substance that provides and defines "insight" and "feel" for a data set. They are the goals and the fruits of an open exploratory data analysis (EDA) approach to the data. Quantitative statistics are not wrong per se, but they are incomplete. They are incomplete because they are numeric summaries which in the summarization operation do a good job of focusing on a particular aspect of the data (e.g., location, intercept, slope, degree of relatedness, etc.) by judiciously reducing the data to a few numbers. Doing so also filters the data, necessarily omitting and screening out other, sometimes crucial, information in the focusing operation. Quantitative statistics focus but also filter; and filtering is exactly what makes the quantitative approach incomplete at best and misleading at worst.
The estimated intercepts (= 3) and slopes (= 0.5) for data sets 2, 3, and 4 are misleading because the estimation is done in the context of an assumed linear model and that linearity assumption is the fatal flaw in this analysis.
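The "identical statistics" point above is easy to verify directly. The following Python sketch (an illustration, not the Handbook's Dataplot analysis) recomputes the summary statistics for all four data sets listed in this section and prints essentially the same means, slope, intercept, residual standard deviation, and correlation for each; a scatter plot of each set, the EDA step, is what exposes the differences.

import numpy as np

# The four Anscombe data sets exactly as listed above.
x1 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], float)
y1 = np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68])
y2 = np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74])
y3 = np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73])
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], float)
y4 = np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89])

for label, x, y in [("1", x1, y1), ("2", x1, y2), ("3", x1, y3), ("4", x4, y4)]:
    slope, intercept = np.polyfit(x, y, 1)          # least squares straight-line fit
    r = np.corrcoef(x, y)[0, 1]                     # correlation
    resid = y - (intercept + slope * x)
    s_resid = np.sqrt(np.sum(resid ** 2) / (len(y) - 2))   # residual standard deviation
    print(f"set {label}: mean X = {x.mean():.1f}, mean Y = {y.mean():.2f}, "
          f"intercept = {intercept:.2f}, slope = {slope:.2f}, "
          f"resid sd = {s_resid:.3f}, corr = {r:.3f}")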


The EDA approach of deliberately postponing the model selection until further along in the analysis has many rewards, not the least of which is the ultimate convergence to a much-improved model and the formulation of valid and supportable scientific and engineering conclusions.

1.1.7. General Problem Categories

Problem Classification: The following table is a convenient way to classify EDA problems.

UNIVARIATE
   Data: A single column of numbers, Y.
   Model: y = constant + error
   Output:
      1. A number (the estimated constant in the model).
      2. An estimate of uncertainty for the constant.
      3. An estimate of the distribution for the error.
   Techniques:
      ● 4-Plot
      ● Probability Plot
      ● PPCC Plot

CONTROL
   Data: A single column of numbers, Y.
   Model: y = constant + error
   Output: A "yes" or "no" to the question "Is the system out of control?".
   Techniques:
      ● Control Charts


COMPARATIVE
   Data: A single response variable and k independent variables (Y, X1, X2, ... , Xk); primary focus is on one (the primary factor) of these independent variables.
   Model: y = f(x1, x2, ..., xk) + error
   Output: A "yes" or "no" to the question "Is the primary factor significant?".
   Techniques:
      ● Block Plot
      ● Scatter Plot
      ● Box Plot

SCREENING
   Data: A single response variable and k independent variables (Y, X1, X2, ... , Xk).
   Model: y = f(x1, x2, ..., xk) + error
   Output:
      1. A ranked list (from most important to least important) of factors.
      2. Best settings for the factors.
      3. A good model/prediction equation relating Y to the factors.
   Techniques:
      ● Block Plot
      ● Probability Plot
      ● Bihistogram

OPTIMIZATION
   Data: A single response variable and k independent variables (Y, X1, X2, ... , Xk).
   Model: y = f(x1, x2, ..., xk) + error
   Output: Best settings for the factor variables.
   Techniques:
      ● Block Plot
      ● Least Squares Fitting
      ● Contour Plot

REGRESSION
   Data: A single response variable and k independent variables (Y, X1, X2, ... , Xk). The independent variables can be continuous.
   Model: y = f(x1, x2, ..., xk) + error
   Output: A good model/prediction equation relating Y to the factors.
   Techniques:
      ● Least Squares Fitting
      ● Scatter Plot
      ● 6-Plot

TIME SERIES
   Data: A column of time dependent numbers, Y. In addition, time is an independent variable. The time variable can be either explicit or implied. If the data are not equi-spaced, the time variable should be explicitly provided.
   Model: y_t = f(t) + error. The model can be either time domain based or frequency domain based.
   Output: A good model/prediction equation relating Y to previous values of Y.
   Techniques:
      ● Autocorrelation Plot
      ● Spectrum
      ● Complex Demodulation Amplitude Plot
      ● Complex Demodulation Phase Plot
      ● ARIMA Models

MULTIVARIATE
   Data: k factor variables (X1, X2, ... , Xk).
   Model: The model is not explicit.
   Output: Identify underlying correlation structure in the data.
   Techniques:
      ● Star Plot
      ● Scatter Plot Matrix
      ● Conditioning Plot
      ● Profile Plot
      ● Principal Components
      ● Clustering
      ● Discrimination/Classification
   Note that multivariate analysis is only covered lightly in this Handbook.


1.2. EDA Assumptions

Summary: The gamut of scientific and engineering experimentation is virtually limitless. In this sea of diversity is there any common basis that allows the analyst to systematically and validly arrive at supportable, repeatable research conclusions?
Fortunately, there is such a basis and it is rooted in the fact that every measurement process, however complicated, has certain underlying assumptions. This section deals with what those assumptions are, why they are important, how to go about testing them, and what the consequences are if the assumptions do not hold.

Table of Contents for Section 2:
1. Underlying Assumptions
2. Importance
3. Testing Assumptions
4. Importance of Plots
5. Consequences


1.2.1. Underlying Assumptions

Assumptions Underlying a Measurement Process: There are four assumptions that typically underlie all measurement processes; namely, that the data from the process at hand "behave like":
1. random drawings;
2. from a fixed distribution;
3. with the distribution having fixed location; and
4. with the distribution having fixed variation.

Univariate or Single Response Variable: The "fixed location" referred to in item 3 above differs for different problem types. The simplest problem type is univariate; that is, a single variable. For the univariate problem, the general model
   response = deterministic component + random component
becomes
   response = constant + error

Assumptions for Univariate Model: For this case, the "fixed location" is simply the unknown constant. We can thus imagine the process at hand to be operating under constant conditions that produce a single column of data with the properties that
● the data are uncorrelated with one another;
● the random component has a fixed distribution;
● the deterministic component consists of only a constant; and
● the random component has fixed variation.

Extrapolation to a Function of Many Variables: The universal power and importance of the univariate model is that it can easily be extended to the more general case where the deterministic component is not just a constant, but is in fact a function of many variables, and the engineering objective is to characterize and model the function.

Residuals Will Behave According to Univariate Assumptions: The key point is that regardless of how many factors there are, and regardless of how complicated the function is, if the engineer succeeds in choosing a good model, then the differences (residuals) between the raw response data and the predicted values from the fitted model should themselves behave like a univariate process. Furthermore, the residuals from this univariate process fit will behave like:
● random drawings;
● from a fixed distribution;
● with fixed location (namely, 0 in this case); and
● with fixed variation.

Validation of Model: Thus if the residuals from the fitted model do in fact behave like the ideal, then testing of underlying assumptions becomes a tool for the validation and quality of fit of the chosen model. On the other hand, if the residuals from the chosen fitted model violate one or more of the above univariate assumptions, then the chosen fitted model is inadequate and an opportunity exists for arriving at an improved model.
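A minimal sketch of the key point above, assuming a simple simulated straight-line process (the example and numbers are illustrative, not from the Handbook): fit the chosen model, then examine whether the residuals behave like a univariate process with fixed location 0, fixed variation, and no structure.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x = np.linspace(0, 5, 80)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=x.size)   # simulated straight-line process

slope, intercept = np.polyfit(x, y, 1)        # chosen model: y = a + b*x
residuals = y - (intercept + slope * x)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(residuals, "o")                      # run sequence of residuals: flat, centered at 0?
ax1.axhline(0.0)
ax1.set_title("Residual run sequence")
ax2.plot(residuals[:-1], residuals[1:], "o")  # lag plot of residuals: structureless if random
ax2.set_title("Residual lag plot")
plt.tight_layout()
plt.show()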


1.2.2. Importance

Predictability and Statistical Control: Predictability is an all-important goal in science and engineering. If the four underlying assumptions hold, then we have achieved probabilistic predictability--the ability to make probability statements not only about the process in the past, but also about the process in the future. In short, such processes are said to be "in statistical control".

Validity of Engineering Conclusions: Moreover, if the four assumptions are valid, then the process is amenable to the generation of valid scientific and engineering conclusions. If the four assumptions are not valid, then the process is drifting (with respect to location, variation, or distribution), unpredictable, and out of control. A simple characterization of such processes by a location estimate, a variation estimate, or a distribution "estimate" inevitably leads to engineering conclusions that are not valid, are not supportable (scientifically or legally), and which are not repeatable in the laboratory.

1.2.3. Techniques for Testing Assumptions

Testing Underlying Assumptions Helps Assure the Validity of Scientific and Engineering Conclusions: Because the validity of the final scientific/engineering conclusions is inextricably linked to the validity of the underlying univariate assumptions, it naturally follows that there is a real necessity that each and every one of the above four assumptions be routinely tested.

Four Techniques to Test Underlying Assumptions: The following EDA techniques are simple, efficient, and powerful for the routine testing of underlying assumptions:
1. run sequence plot (Y_i versus i)
2. lag plot (Y_i versus Y_{i-1})
3. histogram (counts versus subgroups of Y)
4. normal probability plot (ordered Y versus theoretical ordered Y)

Plot on a Single Page for a Quick Characterization of the Data: The four EDA plots can be juxtaposed for a quick look at the characteristics of the data. The plots below are ordered as follows:
1. Run sequence plot - upper left
2. Lag plot - upper right
3. Histogram - lower left
4. Normal probability plot - lower right


Sample Plot: Assumptions Hold
[4-plot of a data set for which the assumptions hold]
This 4-plot reveals a process that has fixed location, fixed variation, is random, apparently has a fixed approximately normal distribution, and has no outliers.

Sample Plot: Assumptions Do Not Hold
If one or more of the four underlying assumptions do not hold, then it will show up in the various plots as demonstrated in the following example.
[4-plot of a data set for which the assumptions do not hold]
This 4-plot reveals a process that has fixed location, fixed variation, is non-random (oscillatory), has a non-normal, U-shaped distribution, and has several outliers.
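The Handbook's sample 4-plots are produced with Dataplot; the following is a rough equivalent sketched in Python with matplotlib and scipy (an assumption of this presentation, not the Handbook's own code), arranging the four plots in the layout described above for a simulated, well-behaved data series.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(3)
y = rng.normal(loc=10.0, scale=1.0, size=200)     # simulated data; substitute your own series

fig, axes = plt.subplots(2, 2, figsize=(8, 6))
axes[0, 0].plot(y, "-o", markersize=3)            # run sequence plot: Y_i versus i
axes[0, 0].set_title("Run sequence plot")
axes[0, 1].plot(y[:-1], y[1:], "o", markersize=3) # lag plot: Y_i versus Y_{i-1}
axes[0, 1].set_title("Lag plot")
axes[1, 0].hist(y, bins=20)                       # histogram
axes[1, 0].set_title("Histogram")
stats.probplot(y, dist="norm", plot=axes[1, 1])   # normal probability plot
axes[1, 1].set_title("Normal probability plot")
plt.tight_layout()
plt.show()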


1.2.4. Interpretation of 4-Plot

Interpretation of EDA Plots: Flat and Equi-Banded, Random, Bell-Shaped, and Linear: The four EDA plots discussed on the previous page are used to test the underlying assumptions:
1. Fixed Location:
   If the fixed location assumption holds, then the run sequence plot will be flat and non-drifting.
2. Fixed Variation:
   If the fixed variation assumption holds, then the vertical spread in the run sequence plot will be approximately the same over the entire horizontal axis.
3. Randomness:
   If the randomness assumption holds, then the lag plot will be structureless and random.
4. Fixed Distribution:
   If the fixed distribution assumption holds, in particular if the fixed normal distribution assumption holds, then
   1. the histogram will be bell-shaped, and
   2. the normal probability plot will be linear.

Plots Utilized to Test the Assumptions: Conversely, the underlying assumptions are tested using the EDA plots:
● Run Sequence Plot:
   If the run sequence plot is flat and non-drifting, the fixed-location assumption holds. If the run sequence plot has a vertical spread that is about the same over the entire plot, then the fixed-variation assumption holds.
● Lag Plot:
   If the lag plot is structureless, then the randomness assumption holds.
● Histogram:
   If the histogram is bell-shaped, the underlying distribution is symmetric and perhaps approximately normal.
● Normal Probability Plot:
   If the normal probability plot is linear, the underlying distribution is approximately normal.

If all four of the assumptions hold, then the process is said definitionally to be "in statistical control".


1.2.5. Consequences

What If Assumptions Do Not Hold?: If some of the underlying assumptions do not hold, what can be done about it? What corrective actions can be taken? The positive way of approaching this is to view the testing of underlying assumptions as a framework for learning about the process. Assumption-testing promotes insight into important aspects of the process that may not have surfaced otherwise.

Primary Goal is Correct and Valid Scientific Conclusions: The primary goal is to have correct, validated, and complete scientific/engineering conclusions flowing from the analysis. This usually includes intermediate goals such as the derivation of a good-fitting model and the computation of realistic parameter estimates. It should always include the ultimate goal of an understanding and a "feel" for "what makes the process tick". There is no more powerful catalyst for discovery than the bringing together of an experienced/expert scientist/engineer and a data set ripe with intriguing "anomalies" and characteristics.

Consequences of Invalid Assumptions: The following sections discuss in more detail the consequences of invalid assumptions:
1. Consequences of non-randomness
2. Consequences of non-fixed location parameter
3. Consequences of non-fixed variation
4. Consequences related to distributional assumptions

1.2.5.1. Consequences of Non-Randomness

Randomness Assumption: There are four underlying assumptions:
1. randomness;
2. fixed location;
3. fixed variation; and
4. fixed distribution.
The randomness assumption is the most critical but the least tested.

Consequences of Non-Randomness: If the randomness assumption does not hold, then
1. All of the usual statistical tests are invalid.
2. The calculated uncertainties for commonly used statistics become meaningless.
3. The calculated minimal sample size required for a pre-specified tolerance becomes meaningless.
4. The simple model y = constant + error becomes invalid.
5. The parameter estimates become suspect and non-supportable.

Non-Randomness Due to Autocorrelation: One specific and common type of non-randomness is autocorrelation. Autocorrelation is the correlation between Y_t and Y_{t-k}, where k is an integer that defines the lag for the autocorrelation. That is, autocorrelation is a time-dependent non-randomness. This means that the value of the current point is highly dependent on the previous point if k = 1 (or on the point k points ago if k is not 1). Autocorrelation is typically detected via an autocorrelation plot or a lag plot.
If the data are not random due to autocorrelation, then
1. Adjacent data values may be related.
2. There may not be n independent snapshots of the phenomenon under study.
3. There may be undetected "junk"-outliers.
4. There may be undetected "information-rich" outliers.
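A small illustrative sketch (not from the Handbook) of detecting this kind of non-randomness numerically and graphically: compute the lag-1 sample autocorrelation and draw lag plots for a white-noise series and for a strongly autocorrelated AR(1) series, both simulated.

import numpy as np
import matplotlib.pyplot as plt

def lag1_autocorrelation(y):
    """Sample autocorrelation between Y_t and Y_(t-1)."""
    y = np.asarray(y, float)
    d = y - y.mean()
    return np.sum(d[1:] * d[:-1]) / np.sum(d * d)

rng = np.random.default_rng(4)
noise = rng.normal(size=300)            # white noise: random data
ar1 = np.zeros(300)                     # simulated AR(1) process: strongly autocorrelated
for t in range(1, 300):
    ar1[t] = 0.9 * ar1[t - 1] + rng.normal()

print("lag-1 autocorrelation, white noise:", round(lag1_autocorrelation(noise), 3))
print("lag-1 autocorrelation, AR(1):      ", round(lag1_autocorrelation(ar1), 3))

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
ax1.plot(noise[:-1], noise[1:], "o", markersize=3)
ax1.set_title("Lag plot: random data")
ax2.plot(ar1[:-1], ar1[1:], "o", markersize=3)
ax2.set_title("Lag plot: autocorrelated data")
plt.tight_layout()
plt.show()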


1.2.5.2. Consequences of Non-Fixed Location Parameter

Location Estimate: The usual estimate of location is the mean

   \bar{Y} = \frac{1}{N} \sum_{i=1}^{N} Y_i

from N measurements Y_1, Y_2, ... , Y_N.

Consequences of Non-Fixed Location: If the run sequence plot does not support the assumption of fixed location, then
1. The location may be drifting.
2. The single location estimate may be meaningless (if the process is drifting).
3. The choice of location estimator (e.g., the sample mean) may be sub-optimal.
4. The usual formula for the uncertainty of the mean,

      s_{\bar{Y}} = \frac{s}{\sqrt{N}}

   (with s the sample standard deviation of the Y_i), may be invalid and the numerical value optimistically small.
5. The location estimate may be poor.
6. The location estimate may be biased.
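For reference, a tiny Python sketch of the two quantities above, the sample mean and its usual uncertainty s / sqrt(N), computed for a simulated data set (illustrative only); when the fixed-location assumption fails, this uncertainty can be optimistically small, as noted above.

import numpy as np

rng = np.random.default_rng(5)
y = rng.normal(loc=50.0, scale=2.0, size=40)   # simulated measurements

n = y.size
ybar = y.mean()                 # sample mean, \bar{Y}
s = y.std(ddof=1)               # sample standard deviation
se_mean = s / np.sqrt(n)        # usual uncertainty of the mean (valid only if the assumptions hold)

print(f"N = {n}, mean = {ybar:.3f}, s = {s:.3f}, uncertainty of mean = {se_mean:.3f}")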


1.2.5.3. Consequences of Non-Fixed Variation Parameter

Variation Estimate: The usual estimate of variation is the standard deviation

   s = \sqrt{ \frac{ \sum_{i=1}^{N} (Y_i - \bar{Y})^2 }{ N - 1 } }

from N measurements Y_1, Y_2, ... , Y_N.

Consequences of Non-Fixed Variation: If the run sequence plot does not support the assumption of fixed variation, then
1. The variation may be drifting.
2. The single variation estimate may be meaningless (if the process variation is drifting).
3. The variation estimate may be poor.
4. The variation estimate may be biased.

1.2.5.4. Consequences Related to Distributional Assumptions

Distributional Analysis: Scientists and engineers routinely use the mean (average) to estimate the "middle" of a distribution. It is not so well known that the variability and the noisiness of the mean as a location estimator are intrinsically linked with the underlying distribution of the data. For certain distributions, the mean is a poor choice. For any given distribution, there exists an optimal choice--that is, the estimator with minimum variability/noisiness. This optimal choice may be, for example, the median, the midrange, the midmean, the mean, or something else. The implication of this is to "estimate" the distribution first, and then--based on the distribution--choose the optimal estimator. The resulting engineering parameter estimators will have less variability than if this approach is not followed. (A small simulation sketch below illustrates this point.)

Case Studies: The airplane glass failure case study gives an example of determining an appropriate distribution and estimating the parameters of that distribution. The uniform random numbers case study gives an example of determining a more appropriate centrality parameter for a non-normal distribution.
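The simulation sketch referred to above (an illustration under assumed distributions, not a Handbook example): compare the sampling variability of the mean, median, and midrange for normal, uniform, and double exponential (Laplace) data. The least variable estimator changes with the distribution, which is the point of this subsection.

import numpy as np

rng = np.random.default_rng(6)

def midrange(y):
    return 0.5 * (np.min(y) + np.max(y))

samplers = {
    "normal": lambda n: rng.normal(size=n),
    "uniform": lambda n: rng.uniform(-1, 1, size=n),
    "double exponential": lambda n: rng.laplace(size=n),
}
estimators = {"mean": np.mean, "median": np.median, "midrange": midrange}

n, reps = 100, 2000
for dist, draw in samplers.items():
    # Standard deviation of each estimator over repeated samples from this distribution.
    sds = {name: np.std([est(draw(n)) for _ in range(reps)]) for name, est in estimators.items()}
    best = min(sds, key=sds.get)
    print(dist, {k: round(v, 4) for k, v in sds.items()}, "-> least variable:", best)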

Other consequences that flow from problems with distributional assumptions are:

Distribution:
1. The distribution may be changing.
2. The single distribution estimate may be meaningless (if the process distribution is changing).
3. The distribution may be markedly non-normal.
4. The distribution may be unknown.
5. The true probability distribution for the error may remain unknown.


Model:
1. The model may be changing.
2. The single model estimate may be meaningless.
3. The default model
      Y = constant + error
   may be invalid.
4. If the default model is insufficient, information about a better model may remain undetected.
5. A poor deterministic model may be fit.
6. Information about an improved model may go undetected.

Process:
1. The process may be out-of-control.
2. The process may be unpredictable.
3. The process may be un-modelable.

1.3. EDA Techniques

Summary: After you have collected a set of data, how do you do an exploratory data analysis? What techniques do you employ? What do the various techniques focus on? What conclusions can you expect to reach?
This section provides answers to these kinds of questions via a gallery of EDA techniques and a detailed description of each technique. The techniques are divided into graphical and quantitative techniques. For exploratory data analysis, the emphasis is primarily on the graphical techniques.

Table of Contents for Section 3:
1. Introduction
2. Analysis Questions
3. Graphical Techniques: Alphabetical
4. Graphical Techniques: By Problem Category
5. Quantitative Techniques: Alphabetical
6. Probability Distributions


1.3.1. Introduction 1.3.2. Analysis Questions

1. Exploratory Data Analysis 1. Exploratory Data Analysis


1.3. EDA Techniques 1.3. EDA Techniques

1.3.1. Introduction 1.3.2. Analysis Questions


Graphical This section describes many techniques that are commonly used in EDA Some common questions that exploratory data analysis is used to
and exploratory and classical data analysis. This list is by no means meant Questions answer are:
Quantitative to be exhaustive. Additional techniques (both graphical and 1. What is a typical value?
Techniques quantitative) are discussed in the other chapters. Specifically, the
product comparisons chapter has a much more detailed description of 2. What is the uncertainty for a typical value?
many classical statistical techniques. 3. What is a good distributional fit for a set of numbers?
EDA emphasizes graphical techniques while classical techniques 4. What is a percentile?
emphasize quantitative techniques. In practice, an analyst typically 5. Does an engineering modification have an effect?
uses a mixture of graphical and quantitative techniques. In this section, 6. Does a factor have an effect?
we have divided the descriptions into graphical and quantitative
techniques. This is for organizational clarity and is not meant to 7. What are the most important factors?
discourage the use of both graphical and quantitiative techniques when 8. Are measurements coming from different laboratories equivalent?
analyzing data.
9. What is the best function for relating a response variable to a set
of factor variables?
Use of This section emphasizes the techniques themselves; how the graph or
Techniques test is defined, published references, and sample output. The use of the 10. What are the best settings for factors?
Shown in techniques to answer engineering questions is demonstrated in the case 11. Can we separate signal from noise in time dependent data?
Case Studies studies section. The case studies do not demonstrate all of the
12. Can we extract any structure from multivariate data?
techniques.
13. Does the data have outliers?
Availability The sample plots and output in this section were generated with the
in Software Dataplot software program. Other general purpose statistical data Analyst A critical early step in any analysis is to identify (for the engineering
analysis programs can generate most of the plots, intervals, and tests Should problem at hand) which of the above questions are relevant. That is, we
discussed here, or macros can be written to achieve the same result.
Relevant have no bearing on the problem at hand. After collecting such a set of
Questions questions, an equally important step, which is invaluable for maintaining
for his focus, is to prioritize those questions in decreasing order of importance.
Engineering EDA techniques are tied in with each of the questions. There are some
Problem EDA techniques (e.g., the scatter plot) that are broad-brushed and apply
almost universally. On the other hand, there are a large number of EDA
techniques that are specific and whose specificity is tied in with one of
the above questions. Clearly if one chooses not to explicitly identify
relevant questions, then one cannot take advantage of these
question-specific EDA techniques.



1.3.2. Analysis Questions 1.3.3. Graphical Techniques: Alphabetic

EDA Most of these questions can be addressed by techniques discussed in this


Approach chapter. The process modeling and process improvement chapters also
Emphasizes address many of the questions above. These questions are also relevant
Graphics for the classical approach to statistics. What distinguishes the EDA 1. Exploratory Data Analysis
approach is an emphasis on graphical techniques to gain insight as 1.3. EDA Techniques
opposed to the classical approach of quantitative tests. Most data
analysts will use a mix of graphical and classical quantitative techniques
to address these problems. 1.3.3. Graphical Techniques: Alphabetic
This section provides a gallery of some useful graphical techniques. The
techniques are ordered alphabetically, so this section is not intended to
be read in a sequential fashion. The use of most of these graphical
techniques is demonstrated in the case studies in this chapter. A few of
these graphical techniques are demonstrated in later chapters.

Autocorrelation Bihistogram: Block Plot: 1.3.3.3 Bootstrap Plot:


Plot: 1.3.3.1 1.3.3.2 1.3.3.4

Box-Cox Linearity Box-Cox Box Plot: 1.3.3.7 Complex


Plot: 1.3.3.5 Normality Plot: Demodulation
1.3.3.6 Amplitude Plot:
1.3.3.8

Complex Contour Plot: DEX Scatter Plot: DEX Mean Plot:


Demodulation 1.3.3.10 1.3.3.11 1.3.3.12
Phase Plot: 1.3.3.9



1.3.3. Graphical Techniques: Alphabetic 1.3.3. Graphical Techniques: Alphabetic

DEX Standard Histogram: Lag Plot: 1.3.3.15 Linear Correlation Star Plot: 1.3.3.29 Weibull Plot: Youden Plot: 4-Plot: 1.3.3.32
Deviation Plot: 1.3.3.14 Plot: 1.3.3.16 1.3.3.30 1.3.3.31
1.3.3.13

6-Plot: 1.3.3.33
Linear Intercept Linear Slope Plot: Linear Residual Mean Plot: 1.3.3.20
Plot: 1.3.3.17 1.3.3.18 Standard Deviation
Plot: 1.3.3.19

Normal Probability Probability Plot: Probability Plot Quantile-Quantile


Plot: 1.3.3.21 1.3.3.22 Correlation Plot: 1.3.3.24
Coefficient Plot:
1.3.3.23

Run Sequence Scatter Plot: Spectrum: 1.3.3.27 Standard Deviation


Plot: 1.3.3.25 1.3.3.26 Plot: 1.3.3.28



1.3.3.1. Autocorrelation Plot 1.3.3.1. Autocorrelation Plot

1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic

1.3.3.1. Autocorrelation Plot

Purpose: Check Randomness
Autocorrelation plots (Box and Jenkins, pp. 28-32) are a commonly-used tool for
checking randomness in a data set. This randomness is ascertained by computing
autocorrelations for data values at varying time lags. If random, such
autocorrelations should be near zero for any and all time-lag separations. If
non-random, then one or more of the autocorrelations will be significantly
non-zero.

In addition, autocorrelation plots are used in the model identification stage
for Box-Jenkins autoregressive, moving average time series models.

Sample Plot: Autocorrelations should be near-zero for randomness. Such is not
the case in this example, and thus the randomness assumption fails.

This sample autocorrelation plot shows that the time series is not random, but
rather has a high degree of autocorrelation between adjacent and near-adjacent
observations.

Definition: r(h) versus h
Autocorrelation plots are formed by
● Vertical axis: Autocorrelation coefficient

      Rh = Ch / C0

  where Ch is the autocovariance function

      Ch = (1/N) * SUM[t = 1 to N-h] (Yt - Ybar)(Yt+h - Ybar)

  and C0 is the variance function

      C0 = (1/N) * SUM[t = 1 to N] (Yt - Ybar)^2

  Note--Rh is between -1 and +1.
● Horizontal axis: Time lag h (h = 1, 2, 3, ...)
● The above line also contains several horizontal reference lines. The middle
  line is at zero. The other four lines are 95% and 99% confidence bands. Note
  that there are two distinct formulas for generating the confidence bands.
  1. If the autocorrelation plot is being used to test for randomness (i.e.,
     there is no time dependence in the data), the following formula is
     recommended:

         +/- z(1-alpha/2) / sqrt(N)

     where N is the sample size, z is the percent point function of the
     standard normal distribution, and alpha is the significance level. In
     this case, the confidence bands have fixed width that depends on the
     sample size. This is the formula that was used to generate the
     confidence bands in the above plot.
  2. Autocorrelation plots are also used in the model identification stage
     for fitting ARIMA models. In this case, a moving average model is
     assumed for the data and the following confidence bands should be
     generated:

         +/- z(1-alpha/2) * sqrt( (1/N) * (1 + 2 * SUM[i = 1 to k-1] ri^2) )

     where k is the lag, N is the sample size, z is the percent


1.3.3.1. Autocorrelation Plot 1.3.3.1. Autocorrelation Plot

point function of the standard normal distribution, and alpha is the
significance level. In this case, the confidence bands increase as the lag
increases.

In short, if the analyst does not check for randomness, then the validity of
many of the statistical conclusions becomes suspect. The autocorrelation plot
is an excellent way of checking for such randomness.
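Computing r(h) and the fixed-width band directly is straightforward. The following is a minimal Python/NumPy sketch, not part of the handbook; the simulated white-noise series, the 25-lag cutoff, and the 5% significance level are illustrative choices:

    import numpy as np
    from scipy.stats import norm

    def autocorrelation(y, max_lag):
        # r(h) = C(h) / C(0), with C(h) the autocovariance function defined above
        y = np.asarray(y, dtype=float)
        n = len(y)
        dev = y - y.mean()
        c0 = np.sum(dev * dev) / n
        return np.array([np.sum(dev[:n - h] * dev[h:]) / n / c0
                         for h in range(1, max_lag + 1)])

    rng = np.random.default_rng(0)
    y = rng.normal(size=200)                            # white noise
    r = autocorrelation(y, 25)
    band = norm.ppf(1 - 0.05 / 2) / np.sqrt(len(y))     # fixed-width 95% band
    print(np.sum(np.abs(r) > band), "of 25 lags fall outside the 95% band")

For white noise, roughly one lag in twenty is expected to fall outside the 95% band purely by chance, which is the point made in the random-data example that follows.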
Questions The autocorrelation plot can provide answers to the following
questions: Examples Examples of the autocorrelation plot for several common situations
1. Are the data random? are given in the following pages.
2. Is an observation related to an adjacent observation? 1. Random (= White Noise)
3. Is an observation related to an observation twice-removed? 2. Weak autocorrelation
(etc.)
3. Strong autocorrelation and autoregressive model
4. Is the observed time series white noise?
4. Sinusoidal model
5. Is the observed time series sinusoidal?
6. Is the observed time series autoregressive? Related Partial Autocorrelation Plot
7. What is an appropriate model for the observed time series? Techniques Lag Plot
8. Is the model Spectral Plot
Y = constant + error Seasonal Subseries Plot
valid and sufficient?
Case Study The autocorrelation plot is demonstrated in the beam deflection data
9. Is the formula valid? case study.

Importance: Randomness (along with fixed model, fixed variation, and fixed Software Autocorrelation plots are available in most general purpose
Ensure validity distribution) is one of the four assumptions that typically underlie all statistical software programs including Dataplot.
of engineering measurement processes. The randomness assumption is critically
conclusions important for the following three reasons:
1. Most standard statistical tests depend on randomness. The
validity of the test conclusions is directly linked to the
validity of the randomness assumption.
2. Many commonly-used statistical formulae depend on the
randomness assumption, the most common formula being the
formula for determining the standard deviation of the sample
mean:

    s(Ybar) = s / sqrt(N)

where s is the standard deviation of the data. Although


heavily used, the results from using this formula are of no
value unless the randomness assumption holds.
3. For univariate data, the default model is
Y = constant + error
If the data are not random, this model is incorrect and invalid,
and the estimates for the parameters (such as the constant)
become nonsensical and invalid.



1.3.3.1.1. Autocorrelation Plot: Random Data 1.3.3.1.1. Autocorrelation Plot: Random Data

Discussion Note that with the exception of lag 0, which is always 1 by


definition, almost all of the autocorrelations fall within the 95%
confidence limits. In addition, there is no apparent pattern (such as
1. Exploratory Data Analysis the first twenty-five being positive and the second twenty-five being
1.3. EDA Techniques negative). This absence of a pattern is what we expect to see if the
1.3.3. Graphical Techniques: Alphabetic data are in fact random.
1.3.3.1. Autocorrelation Plot
A few lags slightly outside the 95% and 99% confidence limits do
not necessarily indicate non-randomness. For a 95% confidence
1.3.3.1.1. Autocorrelation Plot: Random interval, we might expect about one out of twenty lags to be
statistically significant due to random fluctuations.
Data There is no associative ability to infer from a current value Yi as to
what the next value Yi+1 will be. Such non-association is the essence
Autocorrelation The following is a sample autocorrelation plot.
of randomness. In short, adjacent observations do not "co-relate", so
Plot
we call this the "no autocorrelation" case.

Conclusions We can make the following conclusions from this plot.


1. There are no significant autocorrelations.
2. The data are random.



1.3.3.1.2. Autocorrelation Plot: Moderate Autocorrelation 1.3.3.1.2. Autocorrelation Plot: Moderate Autocorrelation

Recommended The next step would be to estimate the parameters for the
Next Step autoregressive model:

    Y(i) = A0 + A1*Y(i-1) + E(i)

1. Exploratory Data Analysis Such estimation can be performed by using least squares linear
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
regression or by fitting a Box-Jenkins autoregressive (AR) model.
1.3.3.1. Autocorrelation Plot
The randomness assumption for least squares fitting applies to the
residuals of the model. That is, even though the original data exhibit
non-randomness, the residuals after fitting Yi against Yi-1 should result in
1.3.3.1.2. Autocorrelation Plot: Moderate random residuals. Assessing whether or not the proposed model in
Autocorrelation fact sufficiently removed the non-randomness is discussed in detail in the
Process Modeling chapter.
Autocorrelation The following is a sample autocorrelation plot. The residual standard deviation for this autoregressive model will be
Plot much smaller than the residual standard deviation for the default
model:

    Y = constant + error

Conclusions We can make the following conclusions from this plot.


1. The data come from an underlying autoregressive model with
moderate positive autocorrelation.

Discussion The plot starts with a moderately high autocorrelation at lag 1


(approximately 0.75) that gradually decreases. The decreasing
autocorrelation is generally linear, but with significant noise. Such a
pattern is the autocorrelation plot signature of "moderate
autocorrelation", which in turn provides moderate predictability if
modeled properly.



1.3.3.1.3. Autocorrelation Plot: Strong Autocorrelation and Autoregressive Model 1.3.3.1.3. Autocorrelation Plot: Strong Autocorrelation and Autoregressive Model

Discussion The plot starts with a high autocorrelation at lag 1 (only slightly less
than 1) that slowly declines. It continues decreasing until it becomes
negative and starts showing an increasing negative autocorrelation.
1. Exploratory Data Analysis The decreasing autocorrelation is generally linear with little noise.
1.3. EDA Techniques Such a pattern is the autocorrelation plot signature of "strong
1.3.3. Graphical Techniques: Alphabetic autocorrelation", which in turn provides high predictability if
1.3.3.1. Autocorrelation Plot modeled properly.

Recommended The next step would be to estimate the parameters for the
1.3.3.1.3. Autocorrelation Plot: Strong Next Step autoregressive model:
    Y(i) = A0 + A1*Y(i-1) + E(i)
Autocorrelation and
Such estimation can be performed by using least squares linear
Autoregressive Model regression or by fitting a Box-Jenkins autoregressive (AR) model.

Autocorrelation The following is a sample autocorrelation plot. The randomness assumption for least squares fitting applies to the
Plot for Strong residuals of the model. That is, even though the original data exhibit
Autocorrelation non-randomness, the residuals after fitting Yi against Yi-1 should result in
random residuals. Assessing whether or not the proposed model in
fact sufficiently removed the non-randomness is discussed in detail in the
Process Modeling chapter.
The residual standard deviation for this autoregressive model will be
much smaller than the residual standard deviation for the default
model:

    Y = constant + error

Conclusions We can make the following conclusions from the above plot.
1. The data come from an underlying autoregressive model with
strong positive autocorrelation.



1.3.3.1.4. Autocorrelation Plot: Sinusoidal Model 1.3.3.1.4. Autocorrelation Plot: Sinusoidal Model

1. Exploratory Data Analysis


1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
1.3.3.1. Autocorrelation Plot

1.3.3.1.4. Autocorrelation Plot: Sinusoidal


Model
Autocorrelation The following is a sample autocorrelation plot.
Plot for
Sinusoidal
Model

Conclusions We can make the following conclusions from the above plot.
1. The data come from an underlying sinusoidal model.

Discussion The plot exhibits an alternating sequence of positive and negative


spikes. These spikes are not decaying to zero. Such a pattern is the
autocorrelation plot signature of a sinusoidal model.

Recommended The beam deflection case study gives an example of modeling a


Next Step sinusoidal model.



1.3.3.2. Bihistogram 1.3.3.2. Bihistogram

factor has a significant effect on the location (typical value) for strength
and hence batch is said to be "significant" or to "have an effect". We
thus see graphically and convincingly what a t-test or analysis of
variance would indicate quantitatively.
1. Exploratory Data Analysis With respect to variation, note that the spread (variation) of the
1.3. EDA Techniques
above-axis batch 1 histogram does not appear to be that much different
1.3.3. Graphical Techniques: Alphabetic
from the below-axis batch 2 histogram. With respect to distributional
shape, note that the batch 1 histogram is skewed left while the batch 2
1.3.3.2. Bihistogram histogram is more symmetric with even a hint of a slight skewness to
the right.

Purpose: The bihistogram is an EDA tool for assessing whether a Thus the bihistogram reveals that there is a clear difference between the
Check for a before-versus-after engineering modification has caused a change in batches with respect to location and distribution, but not in regard to
change in ● location;
variation. Comparing batch 1 and batch 2, we also note that batch 1 is
location, the "better batch" due to its 100-unit higher average strength (around
● variation; or
variation, or 725).
distribution ● distribution.

It is a graphical alternative to the two-sample t-test. The bihistogram Definition: Bihistograms are formed by vertically juxtaposing two histograms:
can be more powerful than the t-test in that all of the distributional Two ● Above the axis: Histogram of the response variable for condition
features (location, scale, skewness, outliers) are evident on a single plot. adjoined 1
It is also based on the common and well-understood histogram. histograms
● Below the axis: Histogram of the response variable for condition
2
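Although the bihistogram is not widely available in general-purpose software, it is easy to approximate with standard plotting tools by drawing the condition 2 histogram with negated bar heights so that it hangs below the axis. The sketch below is illustrative only; the two simulated batches are stand-ins, not the handbook's ceramic strength data:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    batch1 = rng.normal(725, 25, size=240)     # condition 1 (above the axis)
    batch2 = rng.normal(625, 25, size=240)     # condition 2 (below the axis)

    bins = np.histogram_bin_edges(np.concatenate([batch1, batch2]), bins=30)
    counts1, _ = np.histogram(batch1, bins=bins)
    counts2, _ = np.histogram(batch2, bins=bins)
    width = np.diff(bins)

    fig, ax = plt.subplots()
    ax.bar(bins[:-1], counts1, width=width, align="edge", label="condition 1")
    ax.bar(bins[:-1], -counts2, width=width, align="edge", label="condition 2")
    ax.axhline(0, color="black")
    ax.set_xlabel("response")
    ax.set_ylabel("count (condition 2 plotted downward)")
    ax.legend()
    plt.show()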
Sample Plot:
This Questions The bihistogram can provide answers to the following questions:
bihistogram 1. Is a (2-level) factor significant?
reveals that
there is a 2. Does a (2-level) factor have an effect?
significant 3. Does the location change between the 2 subgroups?
difference in 4. Does the variation change between the 2 subgroups?
ceramic 5. Does the distributional shape change between subgroups?
breaking
strength 6. Are there any outliers?
between
batch 1 Importance: The bihistogram is an important EDA tool for determining if a factor
(above) and Checks 3 out "has an effect". Since the bihistogram provides insight into the validity
batch 2 of the 4 of three (location, variation, and distribution) out of the four (missing
(below) underlying only randomness) underlying assumptions in a measurement process, it
assumptions is an especially valuable tool. Because of the dual (above/below) nature
of a of the plot, the bihistogram is restricted to assessing factors that have
measurement only two levels. However, this is very common in the
From the above bihistogram, we can see that batch 1 is centered at a process before-versus-after character of many scientific and engineering
ceramic strength value of approximately 725 while batch 2 is centered experiments.
at a ceramic strength value of approximately 625. That indicates that
these batches are displaced by about 100 strength units. Thus the batch



1.3.3.2. Bihistogram 1.3.3.3. Block Plot

Related t test (for shift in location)


Techniques F test (for shift in variation)
Kolmogorov-Smirnov test (for shift in distribution)
Quantile-quantile plot (for shift in location and distribution) 1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
Case Study The bihistogram is demonstrated in the ceramic strength data case
study.
1.3.3.3. Block Plot
Software The bihistogram is not widely available in general purpose statistical
software programs. Bihistograms can be generated using Dataplot.
Purpose: The block plot (Filliben 1993) is an EDA tool for assessing whether the
Check to factor of interest (the primary factor) has a statistically significant effect
determine if on the response, and whether that conclusion about the primary factor
a factor of effect is valid robustly over all other nuisance or secondary factors in
interest has the experiment.
an effect
robust over It replaces the analysis of variance test with a less
all other assumption-dependent binomial test and should be routinely used
factors whenever we are trying to robustly decide whether a primary factor has
an effect.

Sample
Plot:
Weld
method 2 is
lower
(better) than
weld method
1 in 10 of 12
cases

This block plot reveals that in 10 of the 12 cases (bars), weld method 2
is lower (better) than weld method 1. From a binomial point of view,
weld method is statistically significant.



1.3.3.3. Block Plot 1.3.3.3. Block Plot

Definition Block Plots are formed as follows: Setting 2 is In the block plot for the first bar (plant 1, speed 1, shift 1), weld method
● Vertical axis: Response variable Y better than 1 yields about 28 defects per hour while weld method 2 yields about 22
setting 1 in defects per hour--hence the difference for this combination is about 6
● Horizontal axis: All combinations of all levels of all nuisance
10 out of 12 defects per hour and weld method 2 is seen to be better (smaller number
(secondary) factors X1, X2, ...
cases of defects per hour).
● Plot Character: Levels of the primary factor XP
Is "weld method 2 is better than weld method 1" a general conclusion?
Discussion: Average number of defective lead wires per hour from a study with four For the second bar (plant 1, speed 1, shift 2), weld method 1 is about 37
Primary factors, while weld method 2 is only about 18. Thus weld method 2 is again seen
factor is 1. weld method (2 levels) to be better than weld method 1. Similarly for bar 3 (plant 1, speed 1,
denoted by shift 3), we see weld method 2 is smaller than weld method 1. Scanning
2. plant (2 levels)
plot over all of the 12 bars, we see that weld method 2 is smaller than weld
character: 3. speed (2 levels)
method 1 in 10 of the 12 cases, which is highly suggestive of a robust
within-bar 4. shift (3 levels) weld method effect.
plot are shown in the plot above. Weld method is the primary factor and the
character. other three factors are nuisance factors. The 12 distinct positions along An event What is the chance of 10 out of 12 happening by chance? This is
the horizontal axis correspond to all possible combinations of the three with chance probabilistically equivalent to testing whether a coin is fair by flipping it
nuisance factors, i.e., 12 = 2 plants x 2 speeds x 3 shifts. These 12 probability and getting 10 heads in 12 tosses. The chance (from the binomial
conditions provide the framework for assessing whether any conclusions of only 2% distribution) of getting 10 (or more extreme: 11, 12) heads in 12 flips of
about the 2 levels of the primary factor (weld method) can truly be
a fair coin is about 2%. Such low-probability events are usually rejected
called "general conclusions". If we find that one weld method setting
as untenable and in practice we would conclude that there is a difference
does better (smaller average defects per hour) than the other weld
in weld methods.
method setting for all or most of these 12 nuisance factor combinations,
then the conclusion is in fact general and robust.
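The 2% figure quoted above is simply a binomial tail probability and can be verified directly; a short SciPy computation under the fair-coin null described in the text is:

    from scipy.stats import binom

    # P(10 or more heads in 12 flips of a fair coin) = P(X >= 10)
    p = binom.sf(9, 12, 0.5)
    print(round(p, 4))        # approximately 0.019, i.e. about 2%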
Advantage: The advantages of the block plot are as follows:
Graphical ● A quantitative procedure (analysis of variance) is replaced by a
Ordering In the above chart, the ordering along the horizontal axis is as follows:
and graphical procedure.
along the ● The left 6 bars are from plant 1 and the right 6 bars are from plant binomial
horizontal 2. ● An F-test (analysis of variance) is replaced with a binomial test,
axis which requires fewer assumptions.
● The first 3 bars are from speed 1, the next 3 bars are from speed
2, the next 3 bars are from speed 1, and the last 3 bars are from
speed 2. Questions The block plot can provide answers to the following questions:
● Bars 1, 4, 7, and 10 are from the first shift, bars 2, 5, 8, and 11 are
1. Is the factor of interest significant?
from the second shift, and bars 3, 6, 9, and 12 are from the third 2. Does the factor of interest have an effect?
shift. 3. Does the location change between levels of the primary factor?
4. Has the process improved?
5. What is the best setting (= level) of the primary factor?
6. How much of an average improvement can we expect with this
best setting of the primary factor?
7. Is there an interaction between the primary factor and one or more
nuisance factors?
8. Does the effect of the primary factor change depending on the
setting of some nuisance factor?



1.3.3.3. Block Plot 1.3.3.4. Bootstrap Plot

9. Are there any outliers?

Importance: The block plot is a graphical technique that pointedly focuses on


Robustly whether or not the primary factor conclusions are in fact robustly
checks the general. This question is fundamentally different from the generic 1. Exploratory Data Analysis
significance multi-factor experiment question where the analyst asks, "What factors 1.3. EDA Techniques
of the factor are important and what factors are not" (a screening problem)? Global 1.3.3. Graphical Techniques: Alphabetic
of interest data analysis techniques, such as analysis of variance, can potentially be
improved by local, focused data analysis techniques that take advantage
of this difference. 1.3.3.4. Bootstrap Plot
Related t test (for shift in location for exactly 2 levels)
Purpose: The bootstrap (Efron and Gong) plot is used to estimate the uncertainty
Techniques ANOVA (for shift in location for 2 or more levels) Estimate of a statistic.
Bihistogram (for shift in location, variation, and distribution for exactly uncertainty
2 levels).
Generate To generate a bootstrap uncertainty estimate for a given statistic from a
Case Study The block plot is demonstrated in the ceramic strength data case study. subsamples set of data, a subsample of a size less than or equal to the size of the data
with set is generated from the data, and the statistic is calculated. This
Software Block plots can be generated with the Dataplot software program. They replacement subsample is generated with replacement so that any data point can be
are not currently available in other statistical software programs. sampled multiple times or not sampled at all. This process is repeated
for many subsamples, typically between 500 and 1000. The computed
values for the statistic form an estimate of the sampling distribution of
the statistic.
For example, to estimate the uncertainty of the median from a dataset
with 50 elements, we generate a subsample of 50 elements and calculate
the median. This is repeated at least 500 times so that we have at least
500 values for the median. Although the number of bootstrap samples to
use is somewhat arbitrary, 500 subsamples is usually sufficient. To
calculate a 90% confidence interval for the median, the sample medians
are sorted into ascending order and the value of the 25th median
(assuming exactly 500 subsamples were taken) is the lower confidence
limit while the value of the 475th median (assuming exactly 500
subsamples were taken) is the upper confidence limit.
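A minimal NumPy sketch of this percentile procedure follows. The uniform sample of 50 points, the 500 subsamples, and the random seed are illustrative choices, not handbook data:

    import numpy as np

    rng = np.random.default_rng(2)
    data = rng.uniform(size=50)                 # any sample of 50 elements

    n_boot = 500
    medians = np.empty(n_boot)
    for i in range(n_boot):
        # subsample of the same size, drawn with replacement
        resample = rng.choice(data, size=len(data), replace=True)
        medians[i] = np.median(resample)

    medians.sort()
    lower, upper = medians[24], medians[474]    # 25th and 475th of 500 values
    print("90% bootstrap interval for the median: (%.3f, %.3f)" % (lower, upper))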



1.3.3.4. Bootstrap Plot 1.3.3.4. Bootstrap Plot

Sample Importance The most common uncertainty calculation is generating a confidence


Plot: interval for the mean. In this case, the uncertainty formula can be
derived mathematically. However, there are many situations in which
the uncertainty formulas are mathematically intractable. The bootstrap
provides a method for calculating the uncertainty in these cases.

Caution on The bootstrap is not appropriate for all distributions and statistics (Efron
use of the and Tibshirani). For example, because of the shape of the uniform
bootstrap distribution, the bootstrap is not appropriate for estimating the
distribution of statistics that are heavily dependent on the tails, such as
the range.

Related Histogram
Techniques Jackknife
The jackknife is a technique that is closely related to the bootstrap. The
jackknife is beyond the scope of this handbook. See the Efron and Gong
article for a discussion of the jackknife.
This bootstrap plot was generated from 500 uniform random numbers.
Bootstrap plots and corresponding histograms were generated for the Case Study The bootstrap plot is demonstrated in the uniform random numbers case
mean, median, and mid-range. The histograms for the corresponding study.
statistics clearly show that for uniform random numbers the mid-range
has the smallest variance and is, therefore, a superior location estimator Software The bootstrap is becoming more common in general purpose statistical
to the mean or the median. software programs. However, it is still not supported in many of these
programs. Dataplot supports a bootstrap capability.
Definition The bootstrap plot is formed by:
● Vertical axis: Computed value of the desired statistic for a given
subsample.
● Horizontal axis: Subsample number.

The bootstrap plot is simply the computed value of the statistic versus
the subsample number. That is, the bootstrap plot generates the values
for the desired statistic. This is usually immediately followed by a
histogram or some other distributional plot to show the location and
variation of the sampling distribution of the statistic.

Questions The bootstrap plot is used to answer the following questions:


● What does the sampling distribution for the statistic look like?

● What is a 95% confidence interval for the statistic?

● Which statistic has a sampling distribution with the smallest


variance? That is, which statistic generates the narrowest
confidence interval?



1.3.3.5. Box-Cox Linearity Plot 1.3.3.5. Box-Cox Linearity Plot

Sample Plot

1. Exploratory Data Analysis


1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic

1.3.3.5. Box-Cox Linearity Plot


Purpose: When performing a linear fit of Y against X, an appropriate
Find the transformation of X can often significantly improve the fit. The
transformation Box-Cox transformation (Box and Cox, 1964) is a particularly useful
of the X family of transformations. It is defined as:
variable that
maximizes the
    T(X) = (X^lambda - 1) / lambda

correlation where X is the variable being transformed and lambda is the transformation
between a Y parameter. For lambda = 0, the natural log of the data is taken instead of
and an X The plot of the original data with the predicted values from a linear fit
using the above formula.
variable indicate that a quadratic fit might be preferable. The Box-Cox
The Box-Cox linearity plot is a plot of the correlation between Y and linearity plot shows a value of lambda = 2.0. The plot of the transformed
the transformed X for given values of lambda. That is, lambda is the coordinate data with the predicted values from a linear fit with the transformed
for the horizontal axis variable and the value of the correlation data shows a better fit (verified by the significant reduction in the
between Y and the transformed X is the coordinate for the vertical residual standard deviation).
axis of the plot. The value of lambda corresponding to the maximum
correlation (or minimum for negative correlation) on the plot is then
the optimal choice for lambda.
and Y
Transforming X is used to improve the fit. The Box-Cox
transformation applied to Y can be used as the basis for meeting the ● Horizontal axis: Value for lambda
error assumptions. That case is not covered here. See page 225 of
(Draper and Smith, 1981) or page 77 of (Ryan, 1997) for a discussion Questions The Box-Cox linearity plot can provide answers to the following
questions:
of this case.
1. Would a suitable transformation improve my fit?
2. What is the optimal value of the transformation parameter?
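Because the plot only requires the transformation and a correlation coefficient, it can be sketched in a few lines. In the illustration below (simulated data, not from the handbook), X and Y have a roughly quadratic relationship, so the correlation should peak near lambda = 2:

    import numpy as np

    def boxcox_x(x, lam):
        # Box-Cox transform of X; the natural log is used when lambda = 0
        return np.log(x) if abs(lam) < 1e-12 else (x**lam - 1.0) / lam

    rng = np.random.default_rng(3)
    x = rng.uniform(1.0, 10.0, size=100)
    y = 3.0 + 0.5 * x**2 + rng.normal(scale=2.0, size=100)

    lambdas = np.arange(-2.0, 2.01, 0.1)
    corr = [np.corrcoef(boxcox_x(x, lam), y)[0, 1] for lam in lambdas]
    best = lambdas[int(np.argmax(np.abs(corr)))]
    print("lambda with the largest |correlation|:", round(float(best), 2))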

Importance: Transformations can often significantly improve a fit. The Box-Cox


Find a linearity plot provides a convenient way to find a suitable
suitable transformation without engaging in a lot of trial and error fitting.
transformation

Related Linear Regression


Techniques Box-Cox Normality Plot



1.3.3.5. Box-Cox Linearity Plot 1.3.3.6. Box-Cox Normality Plot

Case Study The Box-Cox linearity plot is demonstrated in the Alaska pipeline
data case study.

Software Box-Cox linearity plots are not a standard part of most general 1. Exploratory Data Analysis
purpose statistical software programs. However, the underlying 1.3. EDA Techniques
technique is based on a transformation and computing a correlation 1.3.3. Graphical Techniques: Alphabetic
coefficient. So if a statistical program supports these capabilities,
writing a macro for a Box-Cox linearity plot should be feasible.
Dataplot supports a Box-Cox linearity plot directly. 1.3.3.6. Box-Cox Normality Plot
Purpose: Many statistical tests and intervals are based on the assumption of
Find normality. The assumption of normality often leads to tests that are
transformation simple, mathematically tractable, and powerful compared to tests that
to normalize do not make the normality assumption. Unfortunately, many real data
data sets are in fact not approximately normal. However, an appropriate
transformation of a data set can often yield a data set that does follow
approximately a normal distribution. This increases the applicability
and usefulness of statistical techniques based on the normality
assumption.
The Box-Cox transformation is a particularly useful family of
transformations. It is defined as:

    T(Y) = (Y^lambda - 1) / lambda

where Y is the response variable and lambda is the transformation
parameter. For lambda = 0, the natural log of the data is taken instead of
using the above formula.
Given a particular transformation such as the Box-Cox transformation
defined above, it is helpful to define a measure of the normality of the
resulting transformation. One measure is to compute the correlation
coefficient of a normal probability plot. The correlation is computed
between the vertical and horizontal axis variables of the probability
plot and is a convenient measure of the linearity of the probability plot
(the more linear the probability plot, the better a normal distribution
fits the data).
The Box-Cox normality plot is a plot of these correlation coefficients
for various values of the lambda parameter. The value of lambda corresponding
to the maximum correlation on the plot is then the optimal choice for
lambda.



1.3.3.6. Box-Cox Normality Plot 1.3.3.6. Box-Cox Normality Plot

Sample Plot Related Normal Probability Plot


Techniques Box-Cox Linearity Plot

Software Box-Cox normality plots are not a standard part of most general
purpose statistical software programs. However, the underlying
technique is based on a normal probability plot and computing a
correlation coefficient. So if a statistical program supports these
capabilities, writing a macro for a Box-Cox normality plot should be
feasible. Dataplot supports a Box-Cox normality plot directly.

The histogram in the upper left-hand corner shows a data set that has
significant right skewness (and so does not follow a normal
distribution). The Box-Cox normality plot shows that the maximum
value of the correlation coefficient is at = -0.3. The histogram of the
data after applying the Box-Cox transformation with = -0.3 shows a
data set for which the normality assumption is reasonable. This is
verified with a normal probability plot of the transformed data.

Definition Box-Cox normality plots are formed by:


● Vertical axis: Correlation coefficient from the normal
probability plot after applying Box-Cox transformation
● Horizontal axis: Value for lambda

Questions The Box-Cox normality plot can provide answers to the following
questions:
1. Is there a transformation that will normalize my data?
2. What is the optimal value of the transformation parameter?
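The same idea can be scripted using the normal probability plot correlation coefficient. In the sketch below (illustrative data, not the handbook's), the sample is right-skewed lognormal, so the optimal lambda should come out near 0, i.e., a log transformation:

    import numpy as np
    from scipy import stats

    def boxcox_y(y, lam):
        # Box-Cox transform of the response; the natural log is used when lambda = 0
        return np.log(y) if abs(lam) < 1e-12 else (y**lam - 1.0) / lam

    rng = np.random.default_rng(4)
    y = rng.lognormal(mean=0.0, sigma=0.7, size=200)    # right-skewed sample

    lambdas = np.arange(-2.0, 2.01, 0.1)
    # correlation coefficient of the normal probability plot for each lambda
    ppcc = [stats.probplot(boxcox_y(y, lam))[1][2] for lam in lambdas]
    best = lambdas[int(np.argmax(ppcc))]
    print("lambda maximizing the probability plot correlation:", round(float(best), 2))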

Importance: Normality assumptions are critical for many univariate intervals and
Normalization hypothesis tests. It is important to test the normality assumption. If the
Improves data are in fact clearly not normal, the Box-Cox normality plot can
Validity of often be used to find a transformation that will approximately
Tests normalize the data.



1.3.3.7. Box Plot 1.3.3.7. Box Plot

Definition Box plots are formed by


Vertical axis: Response variable
Horizontal axis: The factor of interest
1. Exploratory Data Analysis More specifically, we
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic
1. Calculate the median and the quartiles (the lower quartile is the
25th percentile and the upper quartile is the 75th percentile).
2. Plot a symbol at the median (or draw a line) and draw a box
1.3.3.7. Box Plot (hence the name--box plot) between the lower and upper
quartiles; this box represents the middle 50% of the data--the
Purpose: Box plots (Chambers 1983) are an excellent tool for conveying location "body" of the data.
Check and variation information in data sets, particularly for detecting and 3. Draw a line from the lower quartile to the minimum point and
location and illustrating location and variation changes between different groups of another line from the upper quartile to the maximum point.
variation data. Typically a symbol is drawn at these minimum and maximum
shifts points, although this is optional.
Thus the box plot identifies the middle 50% of the data, the median, and
Sample the extreme points.
Plot:
This box Single or A single box plot can be drawn for one batch of data with no distinct
plot reveals multiple box groups. Alternatively, multiple box plots can be drawn together to
that plots can be compare multiple data sets or to compare groups in a single data set. For
machine has drawn a single box plot, the width of the box is arbitrary. For multiple box
a significant plots, the width of the box plot can be set proportional to the number of
effect on points in the given group or sample (some software implementations of
energy with the box plot simply set all the boxes to the same width).
respect to
location and Box plots There is a useful variation of the box plot that more specifically
possibly with fences identifies outliers. To create this variation:
variation
1. Calculate the median and the lower and upper quartiles.
2. Plot a symbol at the median and draw a box between the lower
and upper quartiles.
3. Calculate the interquartile range (the difference between the upper
and lower quartile) and call it IQ.
This box plot, comparing four machines for energy output, shows that 4. Calculate the following points:
machine has a significant effect on energy with respect to both location L1 = lower quartile - 1.5*IQ
and variation. Machine 3 has the highest energy response (about 72.5); L2 = lower quartile - 3.0*IQ
machine 4 has the least variable energy response with about 50% of its U1 = upper quartile + 1.5*IQ
readings being within 1 energy unit. U2 = upper quartile + 3.0*IQ
5. The line from the lower quartile to the minimum is now drawn
from the lower quartile to the smallest point that is greater than
L1. Likewise, the line from the upper quartile to the maximum is
now drawn to the largest point smaller than U1.



1.3.3.7. Box Plot 1.3.3.8. Complex Demodulation Amplitude Plot

6. Points between L1 and L2 or between U1 and U2 are drawn as


small circles. Points less than L2 or greater than U2 are drawn as
large circles.
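The fence calculations above are easy to script. A short illustrative sketch, using NumPy percentiles for the quartiles and a made-up sample with one planted outlier, is:

    import numpy as np

    def box_plot_fences(data):
        # Median, quartiles, and the four fence points L1, L2, U1, U2
        data = np.asarray(data, dtype=float)
        lower_q, median, upper_q = np.percentile(data, [25, 50, 75])
        iq = upper_q - lower_q                  # interquartile range
        fences = {"L1": lower_q - 1.5 * iq, "L2": lower_q - 3.0 * iq,
                  "U1": upper_q + 1.5 * iq, "U2": upper_q + 3.0 * iq}
        return median, lower_q, upper_q, fences

    rng = np.random.default_rng(5)
    sample = np.append(rng.normal(50, 5, size=100), [95.0])   # one gross outlier
    median, lower_q, upper_q, f = box_plot_fences(sample)
    outliers = sample[(sample < f["L1"]) | (sample > f["U1"])]
    print("points beyond the inner fences:", outliers)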

Questions The box plot can provide answers to the following questions:
1. Is a factor significant?
2. Does the location differ between subgroups?
3. Does the variation differ between subgroups? 1. Exploratory Data Analysis
1.3. EDA Techniques
4. Are there any outliers? 1.3.3. Graphical Techniques: Alphabetic

Importance: The box plot is an important EDA tool for determining if a factor has a
Check the significant effect on the response with respect to either location or 1.3.3.8. Complex Demodulation Amplitude
significance variation.
of a factor
The box plot is also an effective tool for summarizing large quantities of
Plot
information.
Purpose: In the frequency analysis of time series models, a common model is the
Detect sinusoidal model:
Related Mean Plot
Changing
Techniques Analysis of Variance Amplitude in
    Y(i) = C + alpha*sin(2*pi*omega*t(i) + phi) + E(i)

Sinusoidal In this equation, alpha is the amplitude, phi is the phase shift, and omega is the
Case Study The box plot is demonstrated in the ceramic strength data case study. Models dominant frequency. In the above model, alpha and phi are constant, that is
they do not vary with time, ti.
Software Box plots are available in most general purpose statistical software
programs, including Dataplot. The complex demodulation amplitude plot (Granger, 1964) is used to
determine if the assumption of constant amplitude is justifiable. If the
slope of the complex demodulation amplitude plot is not zero, then the
above model is typically replaced with the model:

where alpha(ti) is some type of linear model fit with standard least squares.
The most common case is a linear fit, that is the model becomes

Quadratic models are sometimes used. Higher order models are


relatively rare.
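Although the handbook treats the demodulation computations as beyond its scope, the basic idea can be sketched: multiply the series by a complex exponential at the demodulation frequency and low-pass filter the result; twice the magnitude of the filtered series estimates the amplitude over time. The illustration below is a simplified stand-in, not the handbook's algorithm; the moving-average filter, the frequency of 0.05 cycles per observation, and the constant amplitude of 390 are all assumptions:

    import numpy as np

    def complex_demod_amplitude(y, freq, smooth=20):
        # Shift the target frequency to zero, then low-pass with a moving average
        t = np.arange(len(y))
        z = y * np.exp(-2j * np.pi * freq * t)
        kernel = np.ones(smooth) / smooth
        return 2.0 * np.abs(np.convolve(z, kernel, mode="same"))

    t = np.arange(600)
    y = 390.0 * np.sin(2 * np.pi * 0.05 * t + 0.3)   # constant-amplitude sinusoid
    amp = complex_demod_amplitude(y, 0.05)
    print(round(amp[100:500].mean(), 1))             # approximately 390 away from the ends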



1.3.3.8. Complex Demodulation Amplitude Plot 1.3.3.8. Complex Demodulation Amplitude Plot

Sample Importance: As stated previously, in the frequency analysis of time series models, a
Plot: Assumption common model is the sinusoidal model:
Checking

In this equation, alpha is assumed to be constant, that is, it does not vary


with time. It is important to check whether or not this assumption is
reasonable.
The complex demodulation amplitude plot can be used to verify this
assumption. If the slope of this plot is essentially zero, then the
assumption of constant amplitude is justified. If it is not, alpha should be
replaced with some type of time-varying model. The most common
cases are linear (B0 + B1*t) and quadratic (B0 + B1*t + B2*t^2).

Related Spectral Plot


Techniques Complex Demodulation Phase Plot
Non-Linear Fitting

This complex demodulation amplitude plot shows that: Case Study The complex demodulation amplitude plot is demonstrated in the beam
● the amplitude is fixed at approximately 390; deflection data case study.
● there is a start-up effect; and

● there is a change in amplitude at around x = 160 that should be


Software Complex demodulation amplitude plots are available in some, but not
investigated for an outlier. most, general purpose statistical software programs. Dataplot supports
complex demodulation amplitude plots.
Definition: The complex demodulation amplitude plot is formed by:
● Vertical axis: Amplitude

● Horizontal axis: Time

The mathematical computations for determining the amplitude are


beyond the scope of the Handbook. Consult Granger (Granger, 1964)
for details.

Questions The complex demodulation amplitude plot answers the following


questions:
1. Does the amplitude change over time?
2. Are there any outliers that need to be investigated?
3. Is the amplitude different at the beginning of the series (i.e., is
there a start-up effect)?



1.3.3.9. Complex Demodulation Phase Plot 1.3.3.9. Complex Demodulation Phase Plot

1. Exploratory Data Analysis


1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic

1.3.3.9. Complex Demodulation Phase Plot


Purpose: As stated previously, in the frequency analysis of time series models, a
Improve the common model is the sinusoidal model:
estimate of
frequency in This complex demodulation phase plot shows that:
sinusoidal In this equation, alpha is the amplitude, phi is the phase shift, and omega is the
● the specified demodulation frequency is incorrect;
time series dominant frequency. In the above model, alpha and phi are constant, that is
models ● the demodulation frequency should be increased.
they do not vary with time ti.

The complex demodulation phase plot (Granger, 1964) is used to Definition The complex demodulation phase plot is formed by:
improve the estimate of the frequency (i.e., omega) in this model.

● Horizontal axis: Time


If the complex demodulation phase plot shows lines sloping from left to
right, then the estimate of the frequency should be increased. If it shows The mathematical computations for the phase plot are beyond the scope
lines sloping right to left, then the frequency should be decreased. If of the Handbook. Consult Granger (Granger, 1964) for details.
there is essentially zero slope, then the frequency estimate does not need
to be modified. Questions The complex demodulation phase plot answers the following question:
Is the specified demodulation frequency correct?
Sample
Plot:
Importance The non-linear fitting for the sinusoidal model:
of a Good
Initial
Estimate for is usually quite sensitive to the choice of good starting values. The
the initial estimate of the frequency, omega, is obtained from a spectral plot. The
Frequency complex demodulation phase plot is used to assess whether this estimate
is adequate, and if it is not, whether it should be increased or decreased.
Using the complex demodulation phase plot with the spectral plot can
significantly improve the quality of the non-linear fits obtained.



1.3.3.9. Complex Demodulation Phase Plot 1.3.3.10. Contour Plot

Related Spectral Plot


Techniques Complex Demodulation Amplitude Plot
Non-Linear Fitting

Case Study The complex demodulation phase plot is demonstrated in the beam
deflection data case study.
1. Exploratory Data Analysis
1.3. EDA Techniques
Software Complex demodulation phase plots are available in some, but not most,
1.3.3. Graphical Techniques: Alphabetic
general purpose statistical software programs. Dataplot supports
complex demodulation phase plots.
1.3.3.10. Contour Plot
Purpose: A contour plot is a graphical technique for representing a
Display 3-d 3-dimensional surface by plotting constant z slices, called contours, on
surface on a 2-dimensional format. That is, given a value for z, lines are drawn for
2-d plot connecting the (x,y) coordinates where that z value occurs.
The contour plot is an alternative to a 3-D surface plot.

Sample Plot:

This contour plot shows that the surface is symmetric and peaks in the
center.



1.3.3.10. Contour Plot 1.3.3.10. Contour Plot

Definition The contour plot is formed by: Software Contour plots are available in most general purpose statistical software
● Vertical axis: Independent variable 2 programs. They are also available in many general purpose graphics
and mathematics programs. These programs vary widely in the
● Horizontal axis: Independent variable 1
capabilities for the contour plots they generate. Many provide just a
● Lines: iso-response values basic contour plot over a rectangular grid while others permit color
The independent variables are usually restricted to a regular grid. The filled or shaded contours. Dataplot supports a fairly basic contour plot.
actual techniques for determining the correct iso-response values are
rather complex and are almost always computer generated. Most statistical software programs that support design of experiments
will provide a dex contour plot capability.
An additional variable may be required to specify the Z values for
drawing the iso-lines. Some software packages require explicit values.
Other software packages will determine them automatically.
If the data (or function) do not form a regular grid, you typically need
to perform a 2-D interpolation to form a regular grid.
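For readers working outside Dataplot, a basic contour plot over a regular grid can be produced with matplotlib. The Gaussian-bump surface below is an illustrative stand-in for the symmetric, center-peaked surface of the sample plot:

    import numpy as np
    import matplotlib.pyplot as plt

    # Evaluate z on a regular grid and draw constant-z slices (contours)
    x = np.linspace(-2, 2, 81)
    y = np.linspace(-2, 2, 81)
    X, Y = np.meshgrid(x, y)
    Z = np.exp(-(X**2 + Y**2))           # symmetric surface peaking at the center

    fig, ax = plt.subplots()
    cs = ax.contour(X, Y, Z, levels=[0.1, 0.3, 0.5, 0.7, 0.9])
    ax.clabel(cs)                        # label the iso-response values
    ax.set_xlabel("independent variable 1")
    ax.set_ylabel("independent variable 2")
    plt.show()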

Questions The contour plot is used to answer the question


How does Z change as a function of X and Y?

Importance: For univariate data, a run sequence plot and a histogram are considered
Visualizing necessary first steps in understanding the data. For 2-dimensional data,
3-dimensional a scatter plot is a necessary first step in understanding the data.
data
In a similar manner, 3-dimensional data should be plotted. Small data
sets, such as result from designed experiments, can typically be
represented by block plots, dex mean plots, and the like (here, "DEX"
stands for "Design of Experiments"). For large data sets, a contour plot
or a 3-D surface plot should be considered a necessary first step in
understanding the data.

DEX Contour The dex contour plot is a specialized contour plot used in the design of
Plot experiments. In particular, it is useful for full and fractional designs.

Related 3-D Plot


Techniques



1.3.3.10.1. DEX Contour Plot 1.3.3.10.1. DEX Contour Plot

Construction The following are the primary steps in the construction of the dex contour
of DEX plot.
Contour Plot 1. The x and y axes of the plot represent the values of the first and
1. Exploratory Data Analysis second factor (independent) variables.
1.3. EDA Techniques 2. The four vertex points are drawn. The vertex points are (-1,-1),
1.3.3. Graphical Techniques: Alphabetic (-1,1), (1,1), (1,-1). At each vertex point, the average of all the
1.3.3.10. Contour Plot
response values at that vertex point is printed.
3. Similarly, if there are center points, a point is drawn at (0,0) and the
1.3.3.10.1. DEX Contour Plot average of the response values at the center points is printed.
4. The linear dex contour plot assumes the model:
DEX Contour The dex contour plot is a specialized contour plot used in the analysis of
Plot: full and fractional experimental designs. These designs often have a low
Introduction level, coded as "-1" or "-", and a high level, coded as "+1" or "+" for each where the constant term is the overall mean of the response variable. The values of
factor. In addition, there can optionally be one or more center points. the constant term, the two main effect coefficients, and the interaction coefficient are estimated from the vertex points using a
Center points are at the mid-point between the low and high level for each Yates analysis (the Yates analysis utilizes the special structure of the
factor and are coded as "0". 2-level full and fractional factorial designs to simplify the
computation of these parameter estimates). Note that for the dex
The dex contour plot is generated for two factors. Typically, this would be
contour plot, a full Yates analysis does not need to performed,
the two most important factors as determined by previous analyses (e.g.,
simply the calculations for generating the parameter estimates.
through the use of the dex mean plots and a Yates analysis). If more than
two factors are important, you may want to generate a series of dex In order to generate a single contour line, we need a value for Y, say
contour plots, each of which is drawn for two of these factors. You can Y0. Next, we solve for U2 in terms of U1 and, after doing the
also generate a matrix of all pairwise dex contour plots for a number of algebra, we have the equation:
important factors (similar to the scatter plot matrix for scatter plots).
The typical application of the dex contour plot is in determining settings
that will maximize (or minimize) the response variable. It can also be
helpful in determining settings that result in the response variable hitting a We generate a sequence of points for U1 in the range -2 to 2 and
pre-determined target value. The dex contour plot plays a useful role in compute the corresponding values of U2. These points constitute a
determining the settings for the next iteration of the experiment. That is, single contour line corresponding to Y = Y0.
the initial experiment is typically a fractional factorial design with a fairly
large number of factors. After the most important factors are determined, The user specifies the target values for which contour lines will be
the dex contour plot can be used to help define settings for a full factorial generated.
or response surface design based on a smaller number of factors.
The above algorithm assumes a linear model for the design. Dex contour
plots can also be generated for the case in which we assume a quadratic
model for the design. The algebra for solving for U2 in terms of U1
becomes more complicated, but the fundamental idea is the same.
Quadratic models are needed for the case when the average for the center
points does not fall in the range defined by the vertex point (i.e., there is
curvature).
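The construction can be illustrated with a short script. The sketch below assumes a plain least-squares interaction model fitted through four made-up vertex averages; it does not reproduce the handbook's Yates-based parameter estimates or their exact scaling. It then solves for U2 in terms of U1 to trace a single contour line:

    import numpy as np

    # Hypothetical vertex averages at (U1, U2) = (-1,-1), (-1,+1), (+1,-1), (+1,+1)
    y_mm, y_mp, y_pm, y_pp = 60.0, 65.0, 70.0, 90.0

    # Fit Y = b0 + b1*U1 + b2*U2 + b12*U1*U2 exactly through the four vertices
    b0 = (y_mm + y_mp + y_pm + y_pp) / 4.0
    b1 = (-y_mm - y_mp + y_pm + y_pp) / 4.0
    b2 = (-y_mm + y_mp - y_pm + y_pp) / 4.0
    b12 = (y_mm - y_mp - y_pm + y_pp) / 4.0

    def contour_u2(y0, u1):
        # Solve y0 = b0 + b1*u1 + b2*u2 + b12*u1*u2 for u2
        return (y0 - b0 - b1 * u1) / (b2 + b12 * u1)

    u1 = np.linspace(-2, 2, 9)
    print(np.round(contour_u2(75.0, u1), 2))   # the contour line for Y = 75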



1.3.3.10.1. DEX Contour Plot 1.3.3.10.1. DEX Contour Plot

Sample DEX The following is a dex contour plot for the data used in the Eddy current Best Settings To determine the best factor settings for the already-run experiment, we
Contour Plot case study. The analysis in that case study demonstrated that X1 and X2 first must define what "best" means. For the Eddy current data set used to
were the most important factors. generate this dex contour plot, "best" means to maximize (rather than
minimize or hit a target) the response. Hence from the contour plot we
determine the best settings for the two dominant factors by simply
scanning the four vertices and choosing the vertex with the largest value
(= average response). In this case, it is (X1 = +1, X2 = +1).
As for factor X3, the contour plot provides no best setting information, and
so we would resort to other tools: the main effects plot, the interaction
effects matrix, or the ordered data to determine optimal X3 settings.

Case Study The Eddy current case study demonstrates the use of the dex contour plot
in the context of the analysis of a full factorial design.

Software DEX contour plots are available in many statistical software programs that
analyze data from designed experiments. Dataplot supports a linear dex
contour plot and it provides a macro for generating a quadratic dex contour
plot.

Interpretation From the above dex contour plot we can derive the following information.
of the Sample 1. Interaction significance;
DEX Contour
2. Best (data) setting for these 2 dominant factors;
Plot

Interaction Note the appearance of the contour plot. If the contour curves are linear,
Significance then that implies that the interaction term is not significant; if the contour
curves have considerable curvature, then that implies that the interaction
term is large and important. In our case, the contour curves do not have
considerable curvature, and so we conclude that the X1*X2 term is not
significant.



1.3.3.11. DEX Scatter Plot 1.3.3.11. DEX Scatter Plot

1. Exploratory Data Analysis


1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic

1.3.3.11. DEX Scatter Plot


Purpose: The dex scatter plot shows the response values for each level of each
Determine factor (i.e., independent) variable. This graphically shows how the
Important location and scale vary for both within a factor variable and between
Factors with different factor variables. This graphically shows which are the
Respect to important factors and can help provide a ranked list of important
Location and factors from a designed experiment. The dex scatter plot is a
Scale complement to the traditional analyis of variance of designed Description For this sample plot, there are seven factors and each factor has two
experiments. of the Plot levels. For each factor, we define a distinct x coordinate for each level
of the factor. For example, for factor 1, level 1 is coded as 0.8 and level
Dex scatter plots are typically used in conjunction with the dex mean 2 is coded as 1.2. The y coordinate is simply the value of the response
plot and the dex standard deviation plot. The dex mean plot replaces variable. The solid horizontal line is drawn at the overall mean of the
the raw response values with mean response values while the dex response variable. The vertical dotted lines are added for clarity.
standard deviation plot replaces the raw response values with the
standard deviation of the response values. There is value in generating Although the plot can be drawn with an arbitrary number of levels for a
all 3 of these plots. The dex mean and standard deviation plots are factor, it is really only useful when there are two or three levels for a
useful in that the summary measures of location and spread stand out factor.
(they can sometimes get lost with the raw plot). However, the raw data
points can reveal subtleties, such as the presence of outliers, that might Conclusions This sample dex scatter plot shows that:
get lost with the summary statistics. 1. there does not appear to be any outliers;
2. the levels of factors 2 and 4 show distinct location differences;
Sample Plot: and
Factors 4, 2, 3. the levels of factor 1 show distinct scale differences.
3, and 7 are
the Important Definition: Dex scatter plots are formed by:
Factors. Response ● Vertical axis: Value of the response variable
Values
● Horizontal axis: Factor variable (with each level of the factor
Versus
coded with a slightly offset x coordinate)
Factor
Variables


Questions The dex scatter plot can be used to answer the following questions:
1. Which factors are important with respect to location and scale?
2. Are there outliers?

Importance: Identify Important Factors with Respect to Location and Scale
The goal of many designed experiments is to determine which factors
are important with respect to location and scale. A ranked list of the
important factors is also often of interest. Dex scatter, mean, and
standard deviation plots show this graphically. The dex scatter plot
additionally shows if outliers may potentially be distorting the results.

Dex scatter plots were designed primarily for analyzing designed
experiments. However, they are useful for any type of multi-factor data
(i.e., a response variable with 2 or more factor variables having a small
number of distinct levels) whether or not the data were generated from
a designed experiment.

Extension for Interaction Effects
Using the concept of the scatterplot matrix, the dex scatter plot can be
extended to display first order interaction effects.

Specifically, if there are k factors, we create a matrix of plots with k
rows and k columns. On the diagonal, the plot is simply a dex scatter
plot with a single factor. For the off-diagonal plots, we multiply the
values of Xi and Xj. For the common 2-level designs (i.e., each factor
has two levels) the values are typically coded as -1 and 1, so the
multiplied values are also -1 and 1. We then generate a dex scatter plot
for this interaction variable. This plot is called a dex interaction effects
plot and an example is shown below.

Interpretation of the Dex Interaction Effects Plot
We can first examine the diagonal elements for the main effects. These
diagonal plots show a great deal of overlap between the levels for all
three factors. This indicates that location and scale effects will be
relatively small.

We can then examine the off-diagonal plots for the first order
interaction effects. For example, the plot in the first row and second
column is the interaction between factors X1 and X2. As with the main
effect plots, no clear patterns are evident.

Related Techniques
Dex mean plot
Dex standard deviation plot
Block plot
Box plot
Analysis of variance

Case Study
The dex scatter plot is demonstrated in the ceramic strength data case
study.

Software Dex scatter plots are available in some general purpose statistical
software programs, although the format may vary somewhat between
these programs. They are essentially just scatter plots with the X
variable defined in a particular way, so it should be feasible to write
macros for dex scatter plots in most statistical software programs.
Dataplot supports a dex scatter plot.
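Since, as noted above, the dex scatter plot is essentially a scatter plot with a specially coded X variable, a minimal sketch in Python with numpy and matplotlib (an assumption; the Handbook itself uses Dataplot) could look like the following. The data are simulated purely for illustration, and the 0.2 offsets mimic the 0.8/1.2 level coding described earlier.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    X = rng.choice([-1, 1], size=(32, 7))               # seven 2-level factors, coded -1/+1
    y = 2.0 + 1.5 * X[:, 3] + 0.8 * X[:, 1] + rng.normal(0, 0.5, 32)

    fig, ax = plt.subplots()
    for j in range(X.shape[1]):
        # give each level of factor j a slightly offset x coordinate
        xpos = (j + 1) + np.where(X[:, j] < 0, -0.2, 0.2)
        ax.plot(xpos, y, "o", markersize=3)
    ax.axhline(y.mean(), color="black")                 # solid line at the overall mean
    ax.set_xlabel("factor")
    ax.set_ylabel("response")
    plt.show()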

1.3.3.12. DEX Mean Plot


Purpose: Detect Important Factors with Respect to Location
The dex mean plot is appropriate for analyzing data from a designed
experiment, with respect to important factors, where the factors are at
two or more levels. The plot shows mean values for the two or more
levels of each factor plotted by factor. The means for a single factor are
connected by a straight line. The dex mean plot is a complement to the
traditional analysis of variance of designed experiments.

This plot is typically generated for the mean. However, it can be
generated for other location statistics such as the median.

Sample Plot: Factors 4, 2, and 1 are the Most Important Factors

This sample dex mean plot shows that:
1. factor 4 is the most important;
2. factor 2 is the second most important;
3. factor 1 is the third most important;


4. factor 7 is the fourth most important;
5. factor 6 is the fifth most important;
6. factors 3 and 5 are relatively unimportant.

In summary, factors 4, 2, and 1 seem to be clearly important, factors 3
and 5 seem to be clearly unimportant, and factors 6 and 7 are borderline
factors whose inclusion in any subsequent models will be determined by
further analyses.

Definition: Mean Response Versus Factor Variables
Dex mean plots are formed by:
● Vertical axis: Mean of the response variable for each level of the
factor
● Horizontal axis: Factor variable

Questions
The dex mean plot can be used to answer the following questions:
1. Which factors are important? The dex mean plot does not provide
a definitive answer to this question, but it does help categorize
factors as "clearly important", "clearly not important", and
"borderline importance".
2. What is the ranking list of the important factors?

Importance: Determine Significant Factors
The goal of many designed experiments is to determine which factors
are significant. A ranked order listing of the important factors is also
often of interest. The dex mean plot is ideally suited for answering these
types of questions and we recommend its routine use in analyzing
designed experiments.

Extension for Interaction Effects
Using the concept of the scatter plot matrix, the dex mean plot can be
extended to display first-order interaction effects.

Specifically, if there are k factors, we create a matrix of plots with k
rows and k columns. On the diagonal, the plot is simply a dex mean plot
with a single factor. For the off-diagonal plots, measurements at each
level of the interaction are plotted versus level, where level is Xi times
Xj and Xi is the code for the ith main effect level and Xj is the code for
the jth main effect. For the common 2-level designs (i.e., each factor has
two levels) the values are typically coded as -1 and 1, so the multiplied
values are also -1 and 1. We then generate a dex mean plot for this
interaction variable. This plot is called a dex interaction effects plot and
an example is shown below.

DEX Interaction Effects Plot
This plot shows that the most significant factor is X1 and the most
significant interaction is between X1 and X3.

Related Techniques
Dex scatter plot
Dex standard deviation plot
Block plot
Box plot
Analysis of variance

Case Study
The dex mean plot and the dex interaction effects plot are demonstrated
in the ceramic strength data case study.

Software
Dex mean plots are available in some general purpose statistical
software programs, although the format may vary somewhat between
these programs. It may be feasible to write macros for dex mean plots in
some statistical software programs that do not support this plot directly.
Dataplot supports both a dex mean plot and a dex interaction effects
plot.
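To make the construction concrete, here is a small sketch (Python/numpy, an assumption; not part of the Handbook) that computes the quantities behind a dex mean plot and a dex interaction effects plot for a 2-level design: the mean response at each level of each factor, and the level means of the product Xi*Xj used for the off-diagonal panels. The data are simulated for illustration only.

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.choice([-1, 1], size=(32, 3))                  # three 2-level factors, coded -1/+1
    y = 5.0 + 1.2 * X[:, 0] + 0.4 * X[:, 0] * X[:, 2] + rng.normal(0, 0.3, 32)

    # main effects: mean response at each level of each factor
    for j in range(X.shape[1]):
        lo = y[X[:, j] == -1].mean()
        hi = y[X[:, j] == +1].mean()
        print(f"factor {j + 1}: mean at -1 = {lo:.2f}, mean at +1 = {hi:.2f}")

    # interaction X1*X3: the product is again coded -1/+1, so its level means
    # are plotted exactly like a main effect in the interaction effects plot
    inter = X[:, 0] * X[:, 2]
    print(f"X1*X3: mean at -1 = {y[inter == -1].mean():.2f}, "
          f"mean at +1 = {y[inter == +1].mean():.2f}")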


1.3.3.13. DEX Standard Deviation Plot

Purpose: Detect Important Factors with Respect to Scale
The dex standard deviation plot is appropriate for analyzing data from a
designed experiment, with respect to important factors, where the
factors are at two or more levels and there are repeated values at each
level. The plot shows standard deviation values for the two or more
levels of each factor plotted by factor. The standard deviations for a
single factor are connected by a straight line. The dex standard deviation
plot is a complement to the traditional analysis of variance of designed
experiments.

This plot is typically generated for the standard deviation. However, it
can also be generated for other scale statistics such as the range, the
median absolute deviation, or the average absolute deviation.

Sample Plot

This sample dex standard deviation plot shows that:
1. factor 1 has the greatest difference in standard deviations between
factor levels;
2. factor 4 has a significantly lower average standard deviation than
the average standard deviations of other factors (but the level 1
standard deviation for factor 1 is about the same as the level 1
standard deviation for factor 4);
3. for all factors, the level 1 standard deviation is smaller than the
level 2 standard deviation.

Definition: Response Standard Deviations Versus Factor Variables
Dex standard deviation plots are formed by:
● Vertical axis: Standard deviation of the response variable for each
level of the factor
● Horizontal axis: Factor variable

Questions
The dex standard deviation plot can be used to answer the following
questions:
1. How do the standard deviations vary across factors?
2. How do the standard deviations vary within a factor?
3. Which are the most important factors with respect to scale?
4. What is the ranked list of the important factors with respect to
scale?

Importance: Assess Variability
The goal with many designed experiments is to determine which factors
are significant. This is usually determined from the means of the factor
levels (which can be conveniently shown with a dex mean plot). A
secondary goal is to assess the variability of the responses both within a
factor and between factors. The dex standard deviation plot is a
convenient way to do this.

Related Techniques
Dex scatter plot
Dex mean plot
Block plot
Box plot
Analysis of variance

Case Study
The dex standard deviation plot is demonstrated in the ceramic strength
data case study.


Software
Dex standard deviation plots are not available in most general purpose
statistical software programs. It may be feasible to write macros for dex
standard deviation plots in some statistical software programs that do
not support them directly. Dataplot supports a dex standard deviation
plot.
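As a rough sketch of such a macro (in Python/numpy, which is an assumption and not part of the Handbook), the quantities plotted in a dex standard deviation plot are simply the standard deviations of the response at each level of each factor; the data below are simulated for illustration only.

    import numpy as np

    rng = np.random.default_rng(2)
    X = rng.choice([-1, 1], size=(40, 4))                 # four 2-level factors, coded -1/+1
    y = rng.normal(0, np.where(X[:, 0] < 0, 0.5, 1.5))    # factor 1 affects the spread

    for j in range(X.shape[1]):
        s_lo = y[X[:, j] == -1].std(ddof=1)
        s_hi = y[X[:, j] == +1].std(ddof=1)
        print(f"factor {j + 1}: sd at level -1 = {s_lo:.2f}, sd at level +1 = {s_hi:.2f}")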
1.3.3.14. Histogram
Purpose: Summarize a Univariate Data Set
The purpose of a histogram (Chambers) is to graphically summarize the
distribution of a univariate data set.

The histogram graphically shows the following:
1. center (i.e., the location) of the data;
2. spread (i.e., the scale) of the data;
3. skewness of the data;
4. presence of outliers; and
5. presence of multiple modes in the data.
These features provide strong indications of the proper distributional
model for the data. The probability plot or a goodness-of-fit test can be
used to verify the distributional model.

The examples section shows the appearance of a number of common
features revealed by histograms.

Sample Plot


Definition
The most common form of the histogram is obtained by splitting the
range of the data into equal-sized bins (called classes). Then for each
bin, the number of points from the data set that fall into each bin are
counted. That is
● Vertical axis: Frequency (i.e., counts for each bin)
● Horizontal axis: Response variable

The classes can either be defined arbitrarily by the user or via some
systematic rule. A number of theoretically derived rules have been
proposed by Scott (Scott 1992).

The cumulative histogram is a variation of the histogram in which the
vertical axis gives not just the counts for a single bin, but rather gives
the counts for that bin plus all bins for smaller values of the response
variable.

Both the histogram and cumulative histogram have an additional variant
whereby the counts are replaced by the normalized counts. The names
for these variants are the relative histogram and the relative cumulative
histogram.

There are two common ways to normalize the counts.
1. The normalized count is the count in a class divided by the total
number of observations. In this case the relative counts are
normalized to sum to one (or 100 if a percentage scale is used).
This is the intuitive case where the height of the histogram bar
represents the proportion of the data in each class.
2. The normalized count is the count in the class divided by the
number of observations times the class width. For this
normalization, the area (or integral) under the histogram is equal
to one. From a probabilistic point of view, this normalization
results in a relative histogram that is most akin to the probability
density function and a relative cumulative histogram that is most
akin to the cumulative distribution function. If you want to
overlay a probability density or cumulative distribution function
on top of the histogram, use this normalization. Although this
normalization is less intuitive (relative frequencies greater than 1
are quite permissible), it is the appropriate normalization if you
are using the histogram to model a probability density function.

Questions
The histogram can be used to answer the following questions:
1. What kind of population distribution do the data come from?
2. Where are the data located?
3. How spread out are the data?
4. Are the data symmetric or skewed?
5. Are there outliers in the data?

Examples
1. Normal
2. Symmetric, Non-Normal, Short-Tailed
3. Symmetric, Non-Normal, Long-Tailed
4. Symmetric and Bimodal
5. Bimodal Mixture of 2 Normals
6. Skewed (Non-Symmetric) Right
7. Skewed (Non-Symmetric) Left
8. Symmetric with Outlier

Related Techniques
Box plot
Probability plot

The techniques below are not discussed in the Handbook. However,
they are similar in purpose to the histogram. Additional information on
them is contained in the Chambers and Scott references.
Frequency Plot
Stem and Leaf Plot
Density Trace

Case Study
The histogram is demonstrated in the heat flow meter data case study.


Software
Histograms are available in most general purpose statistical software
programs. They are also supported in most general purpose charting,
spreadsheet, and business graphics programs. Dataplot supports
histograms.
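The two normalizations described in the Definition section above are easy to compute directly; the following sketch (Python/numpy, an assumption, not part of the Handbook) shows both the relative histogram (counts summing to one) and the density-style normalization (area equal to one) for a simulated sample.

    import numpy as np

    rng = np.random.default_rng(3)
    y = rng.normal(10, 2, size=500)                       # illustrative data

    counts, edges = np.histogram(y, bins=20)
    width = np.diff(edges)

    rel_counts = counts / counts.sum()                    # normalization 1: sums to one
    rel_density = counts / (counts.sum() * width)         # normalization 2: area equals one

    print("sum of relative counts:", rel_counts.sum())                    # 1.0
    print("area under relative histogram:", (rel_density * width).sum())  # 1.0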

1.3.3.14.1. Histogram Interpretation: Normal


Symmetric,
Moderate-
Tailed
Histogram

Note the classical bell-shaped, symmetric histogram with most of the


frequency counts bunched in the middle and with the counts dying off
out in the tails. From a physical science/engineering point of view, the
normal distribution is that distribution which occurs most often in
nature (due in part to the central limit theorem).

Recommended If the histogram indicates a symmetric, moderate tailed distribution,


Next Step then the recommended next step is to do a normal probability plot to
confirm approximate normality. If the normal probability plot is linear,
then the normal distribution is a good model for the data.


1.3.3.14.2. Histogram Interpretation:


Symmetric, Non-Normal,
Short-Tailed
Symmetric,
Short-Tailed
Histogram


Description of For a symmetric distribution, the "body" of a distribution refers to the


What "center" of the distribution--commonly that region of the distribution
Short-Tailed where most of the probability resides--the "fat" part of the distribution.
Means The "tail" of a distribution refers to the extreme regions of the
distribution--both left and right. The "tail length" of a distribution is a
term that indicates how fast these extremes approach zero.
For a short-tailed distribution, the tails approach zero very fast. Such
distributions commonly have a truncated ("sawed-off") look. The
classical short-tailed distribution is the uniform (rectangular)
distribution in which the probability is constant over a given range and
then drops to zero everywhere else--we would speak of this as having
no tails, or extremely short tails.
For a moderate-tailed distribution, the tails decline to zero in a
moderate fashion. The classical moderate-tailed distribution is the
normal (Gaussian) distribution.
For a long-tailed distribution, the tails decline to zero very slowly--and
hence one is apt to see probability a long way from the body of the
distribution. The classical long-tailed distribution is the Cauchy
distribution.
In terms of tail length, the histogram shown above would be
characteristic of a "short-tailed" distribution.
The optimal (unbiased and most precise) estimator for location for the
center of a distribution is heavily dependent on the tail length of the
distribution. The common choice of taking N observations and using
the calculated sample mean as the best estimate for the center of the
distribution is a good choice for the normal distribution (moderate
tailed), a poor choice for the uniform distribution (short tailed), and a
horrible choice for the Cauchy distribution (long tailed). Although for
the normal distribution the sample mean is as precise an estimator as
we can get, for the uniform and Cauchy distributions, the sample mean
is not the best estimator.
For the uniform distribution, the midrange
midrange = (smallest + largest) / 2
is the best estimator of location. For a Cauchy distribution, the median
is the best estimator of location.
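The point about tail length and location estimators can be checked numerically; the sketch below (Python/numpy, an assumption, not part of the Handbook) compares the sample mean, median, and midrange for a simulated uniform (short-tailed) sample whose true center is 0.

    import numpy as np

    rng = np.random.default_rng(4)
    u = rng.uniform(-1, 1, size=100)          # short-tailed (uniform) sample, true center 0

    midrange = (u.min() + u.max()) / 2        # midrange = (smallest + largest) / 2
    print(f"mean = {u.mean():.3f}, median = {np.median(u):.3f}, midrange = {midrange:.3f}")
    # For uniform data the midrange is typically closest to the true center;
    # for Cauchy (long-tailed) data the median would be the preferred estimator.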

Recommended If the histogram indicates a symmetric, short-tailed distribution, the


Next Step recommended next step is to generate a uniform probability plot. If the
uniform probability plot is linear, then the uniform distribution is an
appropriate model for the data.


1.3.3.14.3. Histogram Interpretation: Symmetric, Non-Normal, Long-Tailed

Symmetric, Long-Tailed Histogram

Description of Long-Tailed
The previous example contains a discussion of the distinction between
short-tailed, moderate-tailed, and long-tailed distributions.

In terms of tail length, the histogram shown above would be
characteristic of a "long-tailed" distribution.

Recommended Next Step
If the histogram indicates a symmetric, long-tailed distribution, the
recommended next step is to do a Cauchy probability plot. If the
Cauchy probability plot is linear, then the Cauchy distribution is an
appropriate model for the data. Alternatively, a Tukey Lambda PPCC
plot may provide insight into a suitable distributional model for the
data.


1.3.3.14.4. Histogram Interpretation: Symmetric and Bimodal

Symmetric, Bimodal Histogram

Description of Bimodal
The mode of a distribution is that value which is most frequently
occurring or has the largest probability of occurrence. The sample
mode occurs at the peak of the histogram.

For many phenomena, it is quite common for the distribution of the
response values to cluster around a single mode (unimodal) and then
distribute themselves with lesser frequency out into the tails. The
normal distribution is the classic example of a unimodal distribution.

The histogram shown above illustrates data from a bimodal (2 peak)
distribution. The histogram serves as a tool for diagnosing problems
such as bimodality. Questioning the underlying reason for
distributional non-unimodality frequently leads to greater insight and
improved deterministic modeling of the phenomenon under study. For
example, for the data presented above, the bimodal histogram is
caused by sinusoidality in the data.

Recommended Next Step
If the histogram indicates a symmetric, bimodal distribution, the
recommended next steps are to:
1. Do a run sequence plot or a scatter plot to check for
sinusoidality.
2. Do a lag plot to check for sinusoidality. If the lag plot is
elliptical, then the data are sinusoidal.
3. If the data are sinusoidal, then a spectral plot is used to
graphically estimate the underlying sinusoidal frequency.
4. If the data are not sinusoidal, then a Tukey Lambda PPCC plot
may determine the best-fit symmetric distribution for the data.
5. The data may be fit with a mixture of two distributions. A
common approach to this case is to fit a mixture of 2 normal or
lognormal distributions. Further discussion of fitting mixtures of
distributions is beyond the scope of this Handbook.


1.3.3.14.5. Histogram Interpretation: Bimodal Mixture of 2 Normals

Histogram from Mixture of 2 Normal Distributions

Discussion of Unimodal and Bimodal
The histogram shown above illustrates data from a bimodal (2 peak)
distribution.

In contrast to the previous example, this example illustrates bimodality
due not to an underlying deterministic model, but bimodality due to a
mixture of probability models. In this case, each of the modes appears
to have a rough bell-shaped component. One could easily imagine the
above histogram being generated by a process consisting of two
normal distributions with the same standard deviation but with two
different locations (one centered at approximately 9.17 and the other
centered at approximately 9.26). If this is the case, then the research
challenge is to determine physically why there are two similar but
separate sub-processes.

Recommended Next Steps
If the histogram indicates that the data might be appropriately fit with
a mixture of two normal distributions, the recommended next step is:

Fit the normal mixture model using either least squares or maximum
likelihood. The general normal mixing model is

    f(x) = p*phi1(x; u1, sd1) + (1 - p)*phi2(x; u2, sd2)

where p is the mixing proportion (between 0 and 1) and phi1 and phi2
are normal probability density functions with location and scale
parameters u1, sd1, u2, and sd2, respectively. That is, there are 5
parameters to estimate in the fit.

Whether maximum likelihood or least squares is used, the quality of
the fit is sensitive to good starting values. For the mixture of two
normals, the histogram can be used to provide initial estimates for the
location and scale parameters of the two normal distributions.

Dataplot can generate a least squares fit of the mixture of two normals
with the following sequence of commands:

    RELATIVE HISTOGRAM Y
    LET Y2 = YPLOT
    LET X2 = XPLOT
    RETAIN Y2 X2 SUBSET TAGPLOT = 1
    LET U1 = <estimated value from histogram>
    LET SD1 = <estimated value from histogram>
    LET U2 = <estimated value from histogram>
    LET SD2 = <estimated value from histogram>
    LET P = 0.5
    FIT Y2 = NORMXPDF(X2,U1,SD1,U2,SD2,P)
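Outside of Dataplot, the same five-parameter least squares fit can be sketched with scipy (this is an illustration under the assumption that numpy/scipy are available; it is not part of the Handbook, and the data are simulated to mimic the two components described above).

    import numpy as np
    from scipy.optimize import curve_fit
    from scipy.stats import norm

    def mixture_pdf(x, p, u1, sd1, u2, sd2):
        return p * norm.pdf(x, u1, sd1) + (1 - p) * norm.pdf(x, u2, sd2)

    rng = np.random.default_rng(5)
    y = np.concatenate([rng.normal(9.17, 0.03, 250), rng.normal(9.26, 0.03, 250)])

    # density-normalized histogram, analogous to RELATIVE HISTOGRAM above
    dens, edges = np.histogram(y, bins=30, density=True)
    mid = 0.5 * (edges[:-1] + edges[1:])

    p0 = [0.5, 9.17, 0.03, 9.26, 0.03]        # starting values read off the histogram
    params, _ = curve_fit(mixture_pdf, mid, dens, p0=p0)
    print("p, u1, sd1, u2, sd2 =", np.round(params, 4))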


1.3.3.14.6. Histogram Interpretation: Skewed (Non-Normal) Right

Right-Skewed Histogram

Discussion of Skewness
A symmetric distribution is one in which the 2 "halves" of the
histogram appear as mirror-images of one another. A skewed
(non-symmetric) distribution is a distribution in which there is no such
mirror-imaging.

For skewed distributions, it is quite common to have one tail of the
distribution considerably longer or drawn out relative to the other tail.
A "skewed right" distribution is one in which the tail is on the right
side. A "skewed left" distribution is one in which the tail is on the left
side. The above histogram is for a distribution that is skewed right.

Skewed distributions bring a certain philosophical complexity to the
very process of estimating a "typical value" for the distribution. To be
specific, suppose that the analyst has a collection of 100 values
randomly drawn from a distribution, and wishes to summarize these
100 observations by a "typical value". What does typical value mean?
If the distribution is symmetric, the typical value is unambiguous--it is
a well-defined center of the distribution. For example, for a
bell-shaped symmetric distribution, a center point is identical to that
value at the peak of the distribution.

For a skewed distribution, however, there is no "center" in the usual
sense of the word. Be that as it may, several "typical value" metrics are
often used for skewed distributions. The first metric is the mode of the
distribution. Unfortunately, for severely-skewed distributions, the
mode may be at or near the left or right tail of the data and so it seems
not to be a good representative of the center of the distribution. As a
second choice, one could conceptually argue that the mean (the point
on the horizontal axis where the distribution would balance) would
serve well as the typical value. As a third choice, others may argue
that the median (that value on the horizontal axis which has exactly
50% of the data to the left (and also to the right)) would serve as a good
typical value.

For symmetric distributions, the conceptual problem disappears
because at the population level the mode, mean, and median are
identical. For skewed distributions, however, these 3 metrics are
markedly different. In practice, for skewed distributions the most
commonly reported typical value is the mean; the next most common
is the median; the least common is the mode. Because each of these 3
metrics reflects a different aspect of "centerness", it is recommended
that the analyst report at least 2 (mean and median), and preferably all
3 (mean, median, and mode) in summarizing and characterizing a data
set.

Some Causes for Skewed Data
Skewed data often occur due to lower or upper bounds on the data.
That is, data that have a lower bound are often skewed right while data
that have an upper bound are often skewed left. Skewness can also
result from start-up effects. For example, in reliability applications
some processes may have a large number of initial failures that could
cause left skewness. On the other hand, a reliability process could
have a long start-up period where failures are rare resulting in
right-skewed data.

Data collected in scientific and engineering applications often have a
lower bound of zero. For example, failure data must be non-negative.
Many measurement processes generate only positive data. Time to
occurrence and size are common measurements that cannot be less than
zero.
very process of estimating a "typical value" for the distribution. To be


Recommended Next Steps
If the histogram indicates a right-skewed data set, the recommended
next steps are to:
1. Quantitatively summarize the data by computing and reporting
the sample mean, the sample median, and the sample mode.
2. Determine the best-fit distribution (skewed-right) from the
❍ Weibull family (for the maximum)
❍ Gamma family
❍ Chi-square family
❍ Lognormal family
❍ Power lognormal family
3. Consider a normalizing transformation such as the Box-Cox
transformation.

1.3.3.14.7. Histogram Interpretation: Skewed (Non-Symmetric) Left

Skewed Left Histogram

The issues for skewed left data are similar to those for skewed right
data.


1.3.3.14.8. Histogram Interpretation: Symmetric with Outlier

Symmetric Histogram with Outlier

Discussion of Outliers
A symmetric distribution is one in which the 2 "halves" of the
histogram appear as mirror-images of one another. The above example
is symmetric with the exception of outlying data near Y = 4.5.

An outlier is a data point that comes from a distribution different (in
location, scale, or distributional form) from the bulk of the data. In the
real world, outliers have a range of causes, from as simple as
1. operator blunders
2. equipment failures
3. day-to-day effects
4. batch-to-batch differences
5. anomalous input conditions
6. warm-up effects
to more subtle causes such as
1. A change in settings of factors that (knowingly or unknowingly)
affect the response.
2. Nature is trying to tell us something.

Outliers Should be Investigated
All outliers should be taken seriously and should be investigated
thoroughly for explanations. Automatic outlier-rejection schemes
(such as throw out all data beyond 4 sample standard deviations from
the sample mean) are particularly dangerous.

The classic case of automatic outlier rejection becoming automatic
information rejection was the South Pole ozone depletion problem.
Ozone depletion over the South Pole would have been detected years
earlier except for the fact that the satellite data recording the low
ozone readings had outlier-rejection code that automatically screened
out the "outliers" (that is, the low ozone readings) before the analysis
was conducted. Such inadvertent (and incorrect) purging went on for
years. It was not until ground-based South Pole readings started
detecting low ozone readings that someone decided to double-check as
to why the satellite had not picked up this fact--it had, but it had gotten
thrown out!

The best attitude is that outliers are our "friends", outliers are trying to
tell us something, and we should not stop until we are comfortable in
the explanation for each outlier.

Recommended Next Steps
If the histogram shows the presence of outliers, the recommended next
steps are:
1. Graphically check for outliers (in the commonly encountered
normal case) by generating a box plot. In general, box plots are
a much better graphical tool for detecting outliers than are
histograms.
2. Quantitatively check for outliers (in the commonly encountered
normal case) by carrying out Grubbs test which indicates how
many sample standard deviations away from the sample mean
are the data in question. Large values indicate outliers.
5. anomalous input conditions


1.3.3.15. Lag Plot

Purpose: Check for Randomness
A lag plot checks whether a data set or time series is random or not.
Random data should not exhibit any identifiable structure in the lag plot.
Non-random structure in the lag plot indicates that the underlying data
are not random. Several common patterns for lag plots are shown in the
examples below.

Sample Plot

This sample lag plot exhibits a linear pattern. This shows that the data
are strongly non-random and further suggests that an autoregressive
model might be appropriate.

Definition
A lag is a fixed time displacement. For example, given a data set Y1, Y2
..., Yn, Y2 and Y7 have lag 5 since 7 - 2 = 5. Lag plots can be generated
for any arbitrary lag, although the most commonly used lag is 1.

A plot of lag 1 is a plot of the values of Yi versus Yi-1
● Vertical axis: Yi for all i
● Horizontal axis: Yi-1 for all i

Questions
Lag plots can provide answers to the following questions:
1. Are the data random?
2. Is there serial correlation in the data?
3. What is a suitable model for the data?
4. Are there outliers in the data?

Importance
Inasmuch as randomness is an underlying assumption for most statistical
estimation and testing techniques, the lag plot should be a routine tool
for researchers.

Examples
● Random (White Noise)
● Weak autocorrelation
● Strong autocorrelation and autoregressive model
● Sinusoidal model and outliers

Related Techniques
Autocorrelation Plot
Spectrum
Runs Test

Case Study
The lag plot is demonstrated in the beam deflection data case study.

Software
Lag plots are not directly available in most general purpose statistical
software programs. Since the lag plot is essentially a scatter plot with
the 2 variables properly lagged, it should be feasible to write a macro for
the lag plot in most statistical programs. Dataplot supports a lag plot.
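Consistent with the Software note above, a lag plot is only a scatter plot of the series against itself shifted by one observation; a minimal sketch in Python with numpy and matplotlib (an assumption, not part of the Handbook) follows.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(6)
    y = rng.normal(size=200)                         # replace with the series of interest

    lag = 1
    plt.plot(y[:-lag], y[lag:], "o", markersize=3)   # horizontal: Yi-1, vertical: Yi
    plt.xlabel("Y(i-1)")
    plt.ylabel("Y(i)")
    plt.show()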


1.3.3.15.1. Lag Plot: Random Data


Lag Plot

Conclusions We can make the following conclusions based on the above plot.
1. The data are random.
2. The data exhibit no autocorrelation.
3. The data contain no outliers.

Discussion The lag plot shown above is for lag = 1. Note the absence of structure.
One cannot infer, from a current value Yi-1, the next value Yi. Thus for a
known value Yi-1 on the horizontal axis (say, Yi-1 = +0.5), the Yi-th
value could be virtually anything (from Yi = -2.5 to Yi = +1.5). Such
non-association is the essence of randomness.


1.3.3.15.2. Lag Plot: Moderate Autocorrelation

Lag Plot

Conclusions
We can make the following conclusions based on the above plot.
1. The data are from an underlying autoregressive model with
moderate positive autocorrelation.
2. The data contain no outliers.

Discussion
In the plot above for lag = 1, note how the points tend to cluster (albeit
noisily) along the diagonal. Such clustering is the lag plot signature of
moderate autocorrelation.

If the process were completely random, knowledge of a current
observation (say Yi-1 = 0) would yield virtually no knowledge about
the next observation Yi. If the process has moderate autocorrelation, as
above, and if Yi-1 = 0, then the range of possible values for Yi is seen
to be restricted to a smaller range (-.01 to +.01). This suggests
prediction is possible using an autoregressive model.

Recommended Next Step
Estimate the parameters for the autoregressive model:

    Yi = A0 + A1*Yi-1 + Ei

Since Yi and Yi-1 are precisely the axes of the lag plot, such estimation
is a linear regression straight from the lag plot.

The residual standard deviation for the autoregressive model will be
much smaller than the residual standard deviation for the default model

    Yi = A0 + Ei
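Because the lag plot axes are exactly Yi-1 and Yi, the autoregressive fit described above is an ordinary linear regression; the sketch below (Python/numpy, an assumption, not part of the Handbook) estimates A0 and A1 and compares the residual standard deviation with that of the constant-only default model, using a simulated series.

    import numpy as np

    rng = np.random.default_rng(7)
    e = rng.normal(size=500)
    y = np.empty(500)
    y[0] = e[0]
    for i in range(1, 500):                     # simulated series with autocorrelation
        y[i] = 0.5 * y[i - 1] + e[i]

    A1, A0 = np.polyfit(y[:-1], y[1:], 1)       # regress Yi on Yi-1
    resid = y[1:] - (A0 + A1 * y[:-1])
    print(f"A0 = {A0:.3f}, A1 = {A1:.3f}, AR residual sd = {resid.std(ddof=2):.3f}")
    print(f"default (constant-only) model sd = {y.std(ddof=1):.3f}")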


1.3.3.15.3. Lag Plot: Strong Autocorrelation and Autoregressive Model

Lag Plot

Conclusions
We can make the following conclusions based on the above plot.
1. The data come from an underlying autoregressive model with
strong positive autocorrelation.
2. The data contain no outliers.

Discussion
Note the tight clustering of points along the diagonal. This is the lag
plot signature of a process with strong positive autocorrelation. Such
processes are highly non-random--there is strong association between
an observation and a succeeding observation. In short, if you know
Yi-1 you can make a strong guess as to what Yi will be.

If the above process were completely random, the plot would have a
shotgun pattern, and knowledge of a current observation (say Yi-1 = 3)
would yield virtually no knowledge about the next observation Yi (it
could here be anywhere from -2 to +8). On the other hand, if the
process had strong autocorrelation, as seen above, and if Yi-1 = 3, then
the range of possible values for Yi is seen to be restricted to a smaller
range (2 to 4)--still wide, but an improvement nonetheless (relative to
-2 to +8) in predictive power.

Recommended Next Step
When the lag plot shows a strongly autoregressive pattern and only
successive observations appear to be correlated, the next steps are to:
1. Estimate the parameters for the autoregressive model:

       Yi = A0 + A1*Yi-1 + Ei

   Since Yi and Yi-1 are precisely the axes of the lag plot, such
   estimation is a linear regression straight from the lag plot.

   The residual standard deviation for this autoregressive model
   will be much smaller than the residual standard deviation for the
   default model

       Yi = A0 + Ei

2. Reexamine the system to arrive at an explanation for the strong
autocorrelation. Is it due to the
1. phenomenon under study; or
2. drifting in the environment; or
3. contamination from the data acquisition system?

Sometimes the source of the problem is contamination and
carry-over from the data acquisition system where the system
does not have time to electronically recover before collecting
the next data point. If this is the case, then consider slowing
down the sampling rate to achieve randomness.

1.3.3.15.4. Lag Plot: Sinusoidal Models and Outliers 1.3.3.15.4. Lag Plot: Sinusoidal Models and Outliers

Consequences If one were to naively assume that the above process came from the
of Ignoring null model
Cyclical
1. Exploratory Data Analysis Pattern
and then estimate the constant by the sample mean, then the analysis
1.3. EDA Techniques
would suffer because
1.3.3. Graphical Techniques: Alphabetic
1.3.3.15. Lag Plot 1. the sample mean would be biased and meaningless;
2. the confidence limits would be meaningless and optimistically
small.
1.3.3.15.4. Lag Plot: Sinusoidal Models and The proper model
Outliers
(where is the amplitude, is the frequency--between 0 and .5
Lag Plot
cycles per observation--, and is the phase) can be fit by standard
non-linear least squares, to estimate the coefficients and their
uncertainties.
The lag plot is also of value in outlier detection. Note in the above plot
that there appears to be 4 points lying off the ellipse. However, in a lag
plot, each point in the original data set Y shows up twice in the lag
plot--once as Yi and once as Yi-1. Hence the outlier in the upper left at
Yi = 300 is the same raw data value that appears on the far right at Yi-1
= 300. Thus (-500,300) and (300,200) are due to the same outlier,
namely the 158th data point: 300. The correct value for this 158th
point should be approximately -300 and so it appears that a sign got
dropped in the data collection. The other two points lying off the
ellipse, at roughly (100,100) and at (0,-50), are caused by two faulty
data values: the third data point of -15 should be about +125 and the
fourth data point of +141 should be about -50, respectively. Hence the
4 apparent lag plot outliers are traceable to 3 actual outliers in the
original run sequence: at points 4 (-15), 5 (141) and 158 (300). In
Conclusions We can make the following conclusions based on the above plot. retrospect, only one of these (point 158 (= 300)) is an obvious outlier
1. The data come from an underlying single-cycle sinusoidal in the run sequence plot.
model.
Unexpected Frequently a technique (e.g., the lag plot) is constructed to check one
2. The data contain three outliers.
Value of EDA aspect (e.g., randomness) which it does well. Along the way, the
technique also highlights some other anomaly of the data (namely, that
Discussion In the plot above for lag = 1, note the tight elliptical clustering of there are 3 outliers). Such outlier identification and removal is
points. Processes with a single-cycle sinusoidal model will have such extremely important for detecting irregularities in the data collection
elliptical lag plots. system, and also for arriving at a "purified" data set for modeling. The
lag plot plays an important role in such outlier identification.


Recommended Next Step
When the lag plot indicates a sinusoidal model with possible outliers,
the recommended next steps are:
1. Do a spectral plot to obtain an initial estimate of the frequency
of the underlying cycle. This will be helpful as a starting value
for the subsequent non-linear fitting.
2. Omit the outliers.
3. Carry out a non-linear fit of the model to the 197 points.
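The non-linear fit in step 3 can be sketched with scipy's least squares routine (an assumption about the software; this is an illustration, not part of the Handbook). The frequency starting value would come from the spectral plot in step 1; here it is simply assumed, and the data are simulated.

    import numpy as np
    from scipy.optimize import curve_fit

    def model(t, c, alpha, omega, phi):
        # constant + amplitude * sin(2*pi*frequency*t + phase)
        return c + alpha * np.sin(2 * np.pi * omega * t + phi)

    rng = np.random.default_rng(8)
    t = np.arange(200.0)
    y = 10 + 200 * np.sin(2 * np.pi * 0.3 * t + 1.0) + rng.normal(0, 20, t.size)

    p0 = [y.mean(), y.std(ddof=1) * np.sqrt(2), 0.3, 0.0]   # 0.3 = assumed spectral estimate
    params, cov = curve_fit(model, t, y, p0=p0)
    print("C, alpha, omega, phi =", np.round(params, 3))
    print("approximate standard errors =", np.round(np.sqrt(np.diag(cov)), 3))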

1.3.3.16. Linear Correlation Plot


Purpose: Detect Changes in Correlation Between Groups
Linear correlation plots are used to assess whether or not correlations
are consistent across groups. That is, if your data are in groups, you may
want to know if a single correlation can be used across all the groups or
whether separate correlations are required for each group.

Linear correlation plots are often used in conjunction with linear slope,
linear intercept, and linear residual standard deviation plots. A linear
correlation plot could be generated initially to see if linear fitting would
be a fruitful direction. If the correlations are high, this implies it is
worthwhile to continue with the linear slope, intercept, and residual
standard deviation plots. If the correlations are weak, a different model
needs to be pursued.

In some cases, you might not have groups. Instead you may have
different data sets and you want to know if the same correlation can be
adequately applied to each of the data sets. In this case, simply think of
each distinct data set as a group and apply the linear correlation plot as
for groups.

Sample Plot


This linear correlation plot shows that the correlations are high for all
groups. This implies that linear fits could provide a good model for
each of these groups.

Definition: Group Correlations Versus Group ID
Linear correlation plots are formed by:
● Vertical axis: Group correlations
● Horizontal axis: Group identifier
A reference line is plotted at the correlation between the full data sets.

Questions
The linear correlation plot can be used to answer the following
questions.
1. Are there linear relationships across groups?
2. Are the strengths of the linear relationships relatively constant
across the groups?

Importance: Checking Group Homogeneity
For grouped data, it may be important to know whether the different
groups are homogeneous (i.e., similar) or heterogeneous (i.e., different).
Linear correlation plots help answer this question in the context of
linear fitting.

Related Techniques
Linear Intercept Plot
Linear Slope Plot
Linear Residual Standard Deviation Plot
Linear Fitting

Case Study
The linear correlation plot is demonstrated in the Alaska pipeline data
case study.

Software
Most general purpose statistical software programs do not support a
linear correlation plot. However, if the statistical program can generate
correlations over a group, it should be feasible to write a macro to
generate this plot. Dataplot supports a linear correlation plot.
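In the spirit of the Software note above, the quantities behind a linear correlation plot are just per-group correlations plus the full-data reference value; a minimal sketch (Python/numpy, an assumption, not part of the Handbook, with simulated data) follows.

    import numpy as np

    rng = np.random.default_rng(9)
    groups = np.repeat(np.arange(1, 7), 20)            # 6 groups of 20 points
    x = rng.uniform(0, 10, groups.size)
    y = 0.2 * x + rng.normal(0, 0.1, groups.size)

    full_r = np.corrcoef(x, y)[0, 1]                   # reference line value
    for g in np.unique(groups):
        m = groups == g
        r = np.corrcoef(x[m], y[m])[0, 1]
        print(f"group {g}: correlation = {r:.3f} (all data: {full_r:.3f})")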


1.3.3.17. Linear Intercept Plot

Purpose: Detect Changes in Linear Intercepts Between Groups
Linear intercept plots are used to graphically assess whether or not
linear fits are consistent across groups. That is, if your data have
groups, you may want to know if a single fit can be used across all the
groups or whether separate fits are required for each group.

Linear intercept plots are typically used in conjunction with linear slope
and linear residual standard deviation plots.

In some cases you might not have groups. Instead, you have different
data sets and you want to know if the same fit can be adequately applied
to each of the data sets. In this case, simply think of each distinct data
set as a group and apply the linear intercept plot as for groups.

Sample Plot

This linear intercept plot shows that there is a shift in intercepts.
Specifically, the first three intercepts are lower than the intercepts for
the other groups. Note that these are small differences in the intercepts.

Definition: Group Intercepts Versus Group ID
Linear intercept plots are formed by:
● Vertical axis: Group intercepts from linear fits
● Horizontal axis: Group identifier
A reference line is plotted at the intercept from a linear fit using all the
data.

Questions
The linear intercept plot can be used to answer the following questions.
1. Is the intercept from linear fits relatively constant across groups?
2. If the intercepts vary across groups, is there a discernible pattern?

Importance: Checking Group Homogeneity
For grouped data, it may be important to know whether the different
groups are homogeneous (i.e., similar) or heterogeneous (i.e., different).
Linear intercept plots help answer this question in the context of linear
fitting.

Related Techniques
Linear Correlation Plot
Linear Slope Plot
Linear Residual Standard Deviation Plot
Linear Fitting

Case Study
The linear intercept plot is demonstrated in the Alaska pipeline data
case study.

Software
Most general purpose statistical software programs do not support a
linear intercept plot. However, if the statistical program can generate
linear fits over a group, it should be feasible to write a macro to
generate this plot. Dataplot supports a linear intercept plot.


1.3.3.18. Linear Slope Plot

Purpose: Detect Changes in Linear Slopes Between Groups
Linear slope plots are used to graphically assess whether or not linear
fits are consistent across groups. That is, if your data have groups, you
may want to know if a single fit can be used across all the groups or
whether separate fits are required for each group.

Linear slope plots are typically used in conjunction with linear intercept
and linear residual standard deviation plots.

In some cases you might not have groups. Instead, you have different
data sets and you want to know if the same fit can be adequately applied
to each of the data sets. In this case, simply think of each distinct data
set as a group and apply the linear slope plot as for groups.

Sample Plot

This linear slope plot shows that the slopes are about 0.174 (plus or
minus 0.002) for all groups. There does not appear to be a pattern in the
variation of the slopes. This implies that a single fit may be adequate.

Definition: Group Slopes Versus Group ID
Linear slope plots are formed by:
● Vertical axis: Group slopes from linear fits
● Horizontal axis: Group identifier
A reference line is plotted at the slope from a linear fit using all the
data.

Questions
The linear slope plot can be used to answer the following questions.
1. Do you get the same slope across groups for linear fits?
2. If the slopes differ, is there a discernible pattern in the slopes?

Importance: Checking Group Homogeneity
For grouped data, it may be important to know whether the different
groups are homogeneous (i.e., similar) or heterogeneous (i.e., different).
Linear slope plots help answer this question in the context of linear
fitting.

Related Techniques
Linear Intercept Plot
Linear Correlation Plot
Linear Residual Standard Deviation Plot
Linear Fitting

Case Study
The linear slope plot is demonstrated in the Alaska pipeline data case
study.

Software
Most general purpose statistical software programs do not support a
linear slope plot. However, if the statistical program can generate linear
fits over a group, it should be feasible to write a macro to generate this
plot. Dataplot supports a linear slope plot.
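A single per-group fitting loop supplies the quantities for the linear slope, linear intercept, and linear RESSD plots alike; the sketch below (Python/numpy, an assumption, not part of the Handbook, with simulated data) prints all three for each group.

    import numpy as np

    rng = np.random.default_rng(10)
    groups = np.repeat(np.arange(1, 7), 20)
    x = rng.uniform(0, 10, groups.size)
    y = 0.174 * x + 0.2 + rng.normal(0, 0.0025, groups.size)

    for g in np.unique(groups):
        m = groups == g
        slope, intercept = np.polyfit(x[m], y[m], 1)
        resid = y[m] - (intercept + slope * x[m])
        print(f"group {g}: slope = {slope:.4f}, intercept = {intercept:.4f}, "
              f"residual sd = {resid.std(ddof=2):.5f}")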


1.3.3.19. Linear Residual Standard Deviation Plot

Purpose: Detect Changes in Linear Residual Standard Deviation Between Groups
Linear residual standard deviation (RESSD) plots are used to
graphically assess whether or not linear fits are consistent across
groups. That is, if your data have groups, you may want to know if a
single fit can be used across all the groups or whether separate fits are
required for each group.

The residual standard deviation is a goodness-of-fit measure. That is,
the smaller the residual standard deviation, the closer is the fit to the
data.

Linear RESSD plots are typically used in conjunction with linear
intercept and linear slope plots. The linear intercept and slope plots
convey whether or not the fits are consistent across groups while the
linear RESSD plot conveys whether the adequacy of the fit is consistent
across groups.

In some cases you might not have groups. Instead, you have different
data sets and you want to know if the same fit can be adequately applied
to each of the data sets. In this case, simply think of each distinct data
set as a group and apply the linear RESSD plot as for groups.

Sample Plot

This linear RESSD plot shows that the residual standard deviations
from a linear fit are about 0.0025 for all the groups.

Definition: Group Residual Standard Deviation Versus Group ID
Linear RESSD plots are formed by:
● Vertical axis: Group residual standard deviations from linear fits
● Horizontal axis: Group identifier
A reference line is plotted at the residual standard deviation from a
linear fit using all the data. This reference line will typically be much
greater than any of the individual residual standard deviations.

Questions
The linear RESSD plot can be used to answer the following questions.
1. Is the residual standard deviation from a linear fit constant across
groups?
2. If the residual standard deviations vary, is there a discernible
pattern across the groups?

Importance: Checking Group Homogeneity
For grouped data, it may be important to know whether the different
groups are homogeneous (i.e., similar) or heterogeneous (i.e., different).
Linear RESSD plots help answer this question in the context of linear
fitting.


Related Techniques
Linear Intercept Plot
Linear Slope Plot
Linear Correlation Plot
Linear Fitting

Case Study
The linear residual standard deviation plot is demonstrated in the
Alaska pipeline data case study.

Software
Most general purpose statistical software programs do not support a
linear residual standard deviation plot. However, if the statistical
program can generate linear fits over a group, it should be feasible to
write a macro to generate this plot. Dataplot supports a linear residual
standard deviation plot.

1.3.3.20. Mean Plot

Purpose: Detect Changes in Location Between Groups
Mean plots are used to see if the mean varies between different groups
of the data. The grouping is determined by the analyst. In most cases,
the data set contains a specific grouping variable. For example, the
groups may be the levels of a factor variable. In the sample plot below,
the months of the year provide the grouping.

Mean plots can be used with ungrouped data to determine if the mean is
changing over time. In this case, the data are split into an arbitrary
number of equal-sized groups. For example, a data series with 400
points can be divided into 10 groups of 40 points each. A mean plot can
then be generated with these groups to see if the mean is increasing or
decreasing over time.

Although the mean is the most commonly used measure of location, the
same concept applies to other measures of location. For example,
instead of plotting the mean of each group, the median or the trimmed
mean might be plotted instead. This might be done if there were
significant outliers in the data and a more robust measure of location
than the mean was desired.

Mean plots are typically used in conjunction with standard deviation
plots. The mean plot checks for shifts in location while the standard
deviation plot checks for shifts in scale.

Sample Plot

This sample mean plot shows a shift of location after the 6th month.

Definition: Group Means Versus Group ID
Mean plots are formed by:
● Vertical axis: Group mean
● Horizontal axis: Group identifier
A reference line is plotted at the overall mean.

Questions
The mean plot can be used to answer the following questions.
1. Are there any shifts in location?
2. What is the magnitude of the shifts in location?
3. Is there a distinct pattern in the shifts in location?

Importance: Checking Assumptions
A common assumption in 1-factor analyses is that of constant location.
That is, the location is the same for different levels of the factor
variable. The mean plot provides a graphical check for that assumption.

A common assumption for univariate data is that the location is
constant. By grouping the data into equal intervals, the mean plot can
provide a graphical test of this assumption.

Related Techniques
Standard Deviation Plot
Dex Mean Plot
Box Plot

Software
Most general purpose statistical software programs do not support a
mean plot. However, if the statistical program can generate the mean
over a group, it should be feasible to write a macro to generate this plot.
Dataplot supports a mean plot.
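As a concrete illustration, the sketch below (assuming NumPy and Matplotlib are available; the monthly grouping and the data are hypothetical) plots the group means against the group identifier, with a reference line at the overall mean.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Hypothetical data: monthly groups, with a location shift after month 6.
month = np.repeat(np.arange(1, 13), 25)
y = rng.normal(loc=np.where(month <= 6, 10.0, 12.0), scale=1.0)

months = np.unique(month)
group_means = [y[month == m].mean() for m in months]

fig, ax = plt.subplots()
ax.plot(months, group_means, "o-")
ax.axhline(y.mean(), linestyle="--", label="overall mean")
ax.set_xlabel("Month (group identifier)")
ax.set_ylabel("Group mean")
ax.legend()
plt.show()
```

The same loop works for any other measure of location; replacing `.mean()` with a median or trimmed mean gives the robust variants mentioned above.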

1.3.3.21. Normal Probability Plot

Purpose: Check If Data Are Approximately Normally Distributed
The normal probability plot (Chambers 1983) is a graphical technique
for assessing whether or not a data set is approximately normally
distributed.

The data are plotted against a theoretical normal distribution in such a
way that the points should form an approximate straight line.
Departures from this straight line indicate departures from normality.

The normal probability plot is a special case of the probability plot.
We cover the normal probability plot separately due to its importance
in many applications.

Sample Plot

The points on this plot form a nearly linear pattern, which indicates
that the normal distribution is a good model for this data set.

Definition: Ordered Response Values Versus Normal Order Statistic Medians
The normal probability plot is formed by:
● Vertical axis: Ordered response values
● Horizontal axis: Normal order statistic medians

The observations are plotted as a function of the corresponding normal
order statistic medians, which are defined as:

    N(i) = G(U(i))

where U(i) are the uniform order statistic medians (defined below) and
G is the percent point function of the normal distribution. The percent
point function is the inverse of the cumulative distribution function
(probability that x is less than or equal to some value). That is, given a
probability, we want the corresponding x of the cumulative
distribution function.

The uniform order statistic medians are defined as:

    m(i) = 1 - m(n)                   for i = 1
    m(i) = (i - 0.3175)/(n + 0.365)   for i = 2, 3, ..., n-1
    m(i) = 0.5**(1/n)                 for i = n

In addition, a straight line can be fit to the points and added as a
reference line. The further the points vary from this line, the greater
the indication of departures from normality.

Probability plots for distributions other than the normal are computed
in exactly the same way. The normal percent point function (the G) is
simply replaced by the percent point function of the desired
distribution. That is, a probability plot can easily be generated for any
distribution for which you have the percent point function.

One advantage of this method of computing probability plots is that
the intercept and slope estimates of the fitted line are in fact estimates
for the location and scale parameters of the distribution. Although this
is not too important for the normal distribution since the location and
scale are estimated by the mean and standard deviation, respectively, it
can be useful for many other distributions.

The correlation coefficient of the points on the normal probability plot
can be compared to a table of critical values to provide a formal test of
the hypothesis that the data come from a normal distribution.

Questions
The normal probability plot is used to answer the following questions.
1. Are the data normally distributed?
2. What is the nature of the departure from normality (data
   skewed, shorter than expected tails, longer than expected tails)?
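The definition above translates directly into code. The following sketch assumes NumPy, SciPy, and Matplotlib are available and uses simulated data (not the heat flow meter data): it computes the uniform order statistic medians m(i), converts them to normal order statistic medians with the normal percent point function, and plots the ordered data against them with a fitted reference line whose intercept and slope estimate location and scale.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def uniform_order_statistic_medians(n):
    """The m(i) values defined above."""
    m = (np.arange(1, n + 1) - 0.3175) / (n + 0.365)
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

rng = np.random.default_rng(2)
y = rng.normal(loc=9.26, scale=0.023, size=200)     # hypothetical data

u = uniform_order_statistic_medians(len(y))
x = norm.ppf(u)                  # normal order statistic medians N(i) = G(U(i))
y_sorted = np.sort(y)

slope, intercept = np.polyfit(x, y_sorted, 1)       # scale and location estimates
r = np.corrcoef(x, y_sorted)[0, 1]                  # probability plot correlation

fig, ax = plt.subplots()
ax.plot(x, y_sorted, "o")
ax.plot(x, intercept + slope * x, "-")
ax.set_xlabel("Normal order statistic medians")
ax.set_ylabel("Ordered response values")
ax.set_title(f"location ~ {intercept:.3f}, scale ~ {slope:.3f}, r = {r:.4f}")
plt.show()
```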

Importance: Check Normality Assumption
The underlying assumptions for a measurement process are that the
data should behave like:
1. random drawings;
2. from a fixed distribution;
3. with fixed location;
4. with fixed scale.

Probability plots are used to assess the assumption of a fixed
distribution. In particular, most statistical models are of the form:

    response = deterministic + random

where the deterministic part is the fit and the random part is error. This
error component in most common statistical models is specifically
assumed to be normally distributed with fixed location and scale. This
is the most frequent application of normal probability plots. That is, a
model is fit and a normal probability plot is generated for the residuals
from the fitted model. If the residuals from the fitted model are not
normally distributed, then one of the major assumptions of the model
has been violated.

Examples
1. Data are normally distributed
2. Data have fat tails
3. Data have short tails
4. Data are skewed right

Related Techniques
Histogram
Probability plots for other distributions (e.g., Weibull)
Probability plot correlation coefficient plot (PPCC plot)
Anderson-Darling Goodness-of-Fit Test
Chi-Square Goodness-of-Fit Test
Kolmogorov-Smirnov Goodness-of-Fit Test

Case Study
The normal probability plot is demonstrated in the heat flow meter
data case study.

Software
Most general purpose statistical software programs can generate a
normal probability plot. Dataplot supports a normal probability plot.

1.3.3.21.1. Normal Probability Plot: Normally Distributed Data

Normal Probability Plot
The following normal probability plot is from the heat flow meter data.

Conclusions
We can make the following conclusions from the above plot.
1. The normal probability plot shows a strongly linear pattern. There
   are only minor deviations from the line fit to the points on the
   probability plot.
2. The normal distribution appears to be a good model for these
   data.


Discussion
Visually, the probability plot shows a strongly linear pattern. This is
verified by the correlation coefficient of 0.9989 of the line fit to the
probability plot. The fact that the points in the lower and upper extremes
of the plot do not deviate significantly from the straight-line pattern
indicates that there are not any significant outliers (relative to a normal
distribution).

In this case, we can quite reasonably conclude that the normal
distribution provides an excellent model for the data. The intercept and
slope of the fitted line give estimates of 9.26 and 0.023 for the location
and scale parameters of the fitted normal distribution.

1.3.3.21.2. Normal Probability Plot: Data Have Short Tails

Normal Probability Plot for Data with Short Tails
The following is a normal probability plot for 500 random numbers
generated from a Tukey-Lambda distribution with the parameter equal
to 1.1.

Conclusions
We can make the following conclusions from the above plot.
1. The normal probability plot shows a non-linear pattern.
2. The normal distribution is not a good model for these data.


Discussion
For data with short tails relative to the normal distribution, the
non-linearity of the normal probability plot shows up in two ways. First,
the middle of the data shows an S-like pattern. This is common for both
short and long tails. Second, the first few and the last few points show a
marked departure from the reference fitted line. In comparing this plot
to the long tail example in the next section, the important difference is
the direction of the departure from the fitted line for the first few and
last few points. For short tails, the first few points show increasing
departure from the fitted line above the line and last few points show
increasing departure from the fitted line below the line. For long tails,
this pattern is reversed.

In this case, we can reasonably conclude that the normal distribution
does not provide an adequate fit for this data set. For probability plots
that indicate short-tailed distributions, the next step might be to generate
a Tukey Lambda PPCC plot. The Tukey Lambda PPCC plot can often
be helpful in identifying an appropriate distributional family.

1.3.3.21.3. Normal Probability Plot: Data Have Long Tails

Normal Probability Plot for Data with Long Tails
The following is a normal probability plot of 500 numbers generated
from a double exponential distribution. The double exponential
distribution is symmetric, but relative to the normal it declines rapidly
and has longer tails.

Conclusions
We can make the following conclusions from the above plot.
1. The normal probability plot shows a reasonably linear pattern in
   the center of the data. However, the tails, particularly the lower
   tail, show departures from the fitted line.
2. A distribution other than the normal distribution would be a good
   model for these data.


Discussion
For data with long tails relative to the normal distribution, the
non-linearity of the normal probability plot can show up in two ways.
First, the middle of the data may show an S-like pattern. This is
common for both short and long tails. In this particular case, the S
pattern in the middle is fairly mild. Second, the first few and the last few
points show marked departure from the reference fitted line. In the plot
above, this is most noticeable for the first few data points. In comparing
this plot to the short-tail example in the previous section, the important
difference is the direction of the departure from the fitted line for the
first few and the last few points. For long tails, the first few points show
increasing departure from the fitted line below the line and last few
points show increasing departure from the fitted line above the line. For
short tails, this pattern is reversed.

In this case we can reasonably conclude that the normal distribution can
be improved upon as a model for these data. For probability plots that
indicate long-tailed distributions, the next step might be to generate a
Tukey Lambda PPCC plot. The Tukey Lambda PPCC plot can often be
helpful in identifying an appropriate distributional family.

1.3.3.21.4. Normal Probability Plot: Data are Skewed Right

Normal Probability Plot for Data that are Skewed Right

Conclusions
We can make the following conclusions from the above plot.
1. The normal probability plot shows a strongly non-linear pattern.
   Specifically, it shows a quadratic pattern in which all the points
   are below a reference line drawn between the first and last points.
2. The normal distribution is not a good model for these data.


Discussion
This quadratic pattern in the normal probability plot is the signature of a
significantly right-skewed data set. Similarly, if all the points on the
normal probability plot fell above the reference line connecting the first
and last points, that would be the signature pattern for a significantly
left-skewed data set.

In this case we can quite reasonably conclude that we need to model
these data with a right skewed distribution such as the Weibull or
lognormal.

1.3.3.22. Probability Plot

Purpose: Check If Data Follow a Given Distribution
The probability plot (Chambers 1983) is a graphical technique for
assessing whether or not a data set follows a given distribution such as
the normal or Weibull.

The data are plotted against a theoretical distribution in such a way that
the points should form approximately a straight line. Departures from
this straight line indicate departures from the specified distribution.

The correlation coefficient associated with the linear fit to the data in
the probability plot is a measure of the goodness of the fit. Estimates of
the location and scale parameters of the distribution are given by the
intercept and slope. Probability plots can be generated for several
competing distributions to see which provides the best fit, and the
probability plot generating the highest correlation coefficient is the best
choice since it generates the straightest probability plot.

For distributions with shape parameters (not counting location and
scale parameters), the shape parameters must be known in order to
generate the probability plot. For distributions with a single shape
parameter, the probability plot correlation coefficient (PPCC) plot
provides an excellent method for estimating the shape parameter.

We cover the special case of the normal probability plot separately due
to its importance in many statistical applications.

Sample Plot

This data set consists of 500 Weibull random numbers with a shape
parameter = 2, location parameter = 0, and scale parameter = 1. The
Weibull probability plot indicates that the Weibull distribution does in
fact fit these data well.

Definition: Ordered Response Values Versus Order Statistic Medians for the Given Distribution
The probability plot is formed by:
● Vertical axis: Ordered response values
● Horizontal axis: Order statistic medians for the given distribution

The order statistic medians are defined as:

    N(i) = G(U(i))

where the U(i) are the uniform order statistic medians (defined below)
and G is the percent point function for the desired distribution. The
percent point function is the inverse of the cumulative distribution
function (probability that x is less than or equal to some value). That is,
given a probability, we want the corresponding x of the cumulative
distribution function.

The uniform order statistic medians are defined as:

    m(i) = 1 - m(n)                   for i = 1
    m(i) = (i - 0.3175)/(n + 0.365)   for i = 2, 3, ..., n-1
    m(i) = 0.5**(1/n)                 for i = n

In addition, a straight line can be fit to the points and added as a
reference line. The further the points vary from this line, the greater the
indication of a departure from the specified distribution.

This definition implies that a probability plot can be easily generated
for any distribution for which the percent point function can be
computed.

One advantage of this method of computing probability plots is that the
intercept and slope estimates of the fitted line are in fact estimates for
the location and scale parameters of the distribution. Although this is
not too important for the normal distribution (the location and scale are
estimated by the mean and standard deviation, respectively), it can be
useful for many other distributions.

Questions
The probability plot is used to answer the following questions:
● Does a given distribution, such as the Weibull, provide a good fit
  to my data?
● What distribution best fits my data?
● What are good estimates for the location and scale parameters of
  the chosen distribution?

Importance: Check distributional assumption
The discussion for the normal probability plot covers the use of
probability plots for checking the fixed distribution assumption.

Some statistical models assume data have come from a population with
a specific type of distribution. For example, in reliability applications,
the Weibull, lognormal, and exponential are commonly used
distributional models. Probability plots can be useful for checking this
distributional assumption.

Related Techniques
Histogram
Probability Plot Correlation Coefficient (PPCC) Plot
Hazard Plot
Quantile-Quantile Plot
Anderson-Darling Goodness of Fit
Chi-Square Goodness of Fit
Kolmogorov-Smirnov Goodness of Fit

Case Study
The probability plot is demonstrated in the airplane glass failure time
data case study.

Software
Most general purpose statistical software programs support probability
plots for at least a few common distributions. Dataplot supports
probability plots for a large number of distributions.
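As an illustration of the definition above, the following sketch (assuming SciPy's weibull_min is available and using simulated data rather than the airplane glass failure times) builds a Weibull probability plot for a known shape parameter of 2; the intercept and slope of the fitted line estimate the location and scale parameters.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import weibull_min

def uniform_order_statistic_medians(n):
    """The m(i) values defined above."""
    m = (np.arange(1, n + 1) - 0.3175) / (n + 0.365)
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

rng = np.random.default_rng(3)
y = weibull_min.rvs(2, loc=0, scale=1, size=500, random_state=rng)  # hypothetical sample

shape = 2                                    # assumed known (otherwise use a PPCC plot)
u = uniform_order_statistic_medians(len(y))
x = weibull_min.ppf(u, shape)                # order statistic medians for the Weibull
y_sorted = np.sort(y)
slope, intercept = np.polyfit(x, y_sorted, 1)

fig, ax = plt.subplots()
ax.plot(x, y_sorted, "o", x, intercept + slope * x, "-")
ax.set_xlabel("Weibull order statistic medians (shape = 2)")
ax.set_ylabel("Ordered response values")
ax.set_title(f"location ~ {intercept:.2f}, scale ~ {slope:.2f}")
plt.show()
```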

1.3.3.23. Probability Plot Correlation Coefficient Plot
Purpose: Graphical Technique for Finding the Shape Parameter of a Distributional Family that Best Fits a Data Set
The probability plot correlation coefficient (PPCC) plot (Filliben
1975) is a graphical technique for identifying the shape parameter for
a distributional family that best describes the data set. This technique
is appropriate for families, such as the Weibull, that are defined by a
single shape parameter and location and scale parameters, and it is not
appropriate for distributions, such as the normal, that are defined only
by location and scale parameters.

The PPCC plot is generated as follows. For a series of values for the
shape parameter, the correlation coefficient is computed for the
probability plot associated with a given value of the shape parameter.
These correlation coefficients are plotted against their corresponding
shape parameters. The maximum correlation coefficient corresponds
to the optimal value of the shape parameter. For better precision, two
iterations of the PPCC plot can be generated; the first is for finding
the right neighborhood and the second is for fine tuning the estimate.

The PPCC plot is used first to find a good value of the shape
parameter. The probability plot is then generated to find estimates of
the location and scale parameters and in addition to provide a
graphical assessment of the adequacy of the distributional fit.
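A minimal sketch of that procedure, assuming SciPy's weibull_min as the candidate family and simulated data: for each trial value of the shape parameter, construct the probability plot abscissas and record the correlation with the ordered data; the peak of the resulting curve is the shape estimate.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import weibull_min

def uniform_order_statistic_medians(n):
    m = (np.arange(1, n + 1) - 0.3175) / (n + 0.365)
    m[-1] = 0.5 ** (1.0 / n)
    m[0] = 1.0 - m[-1]
    return m

rng = np.random.default_rng(4)
y = np.sort(weibull_min.rvs(1.8, size=300, random_state=rng))   # hypothetical data
u = uniform_order_statistic_medians(len(y))

# correlation of the probability plot for a grid of trial shape parameters
shapes = np.linspace(0.5, 4.0, 200)
ppcc = [np.corrcoef(weibull_min.ppf(u, c), y)[0, 1] for c in shapes]

best = shapes[np.argmax(ppcc)]
fig, ax = plt.subplots()
ax.plot(shapes, ppcc)
ax.axvline(best, linestyle="--", label=f"max PPCC at shape ~ {best:.2f}")
ax.set_xlabel("Shape parameter")
ax.set_ylabel("Probability plot correlation coefficient")
ax.legend()
plt.show()
```

A second, finer grid around the peak corresponds to the "fine tuning" iteration described above.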

Compare Distributions
In addition to finding a good choice for estimating the shape
parameter of a given distribution, the PPCC plot can be useful in
deciding which distributional family is most appropriate. For example,
given a set of reliability data, you might generate PPCC plots for a
Weibull, lognormal, gamma, and inverse Gaussian distributions, and
possibly others, on a single page. This one page would show the best
value for the shape parameter for several distributions and would
additionally indicate which of these distributional families provides
the best fit (as measured by the maximum probability plot correlation
coefficient). That is, if the maximum PPCC value for the Weibull is
0.99 and only 0.94 for the lognormal, then we could reasonably
conclude that the Weibull family is the better choice.

Tukey-Lambda PPCC Plot for Symmetric Distributions
The Tukey Lambda PPCC plot, with shape parameter λ, is
particularly useful for symmetric distributions. It indicates whether a
distribution is short or long tailed and it can further indicate several
common distributions. Specifically,
1. λ = -1: distribution is approximately Cauchy
2. λ = 0: distribution is exactly logistic
3. λ = 0.14: distribution is approximately normal
4. λ = 0.5: distribution is U-shaped
5. λ = 1: distribution is exactly uniform

If the Tukey Lambda PPCC plot gives a maximum value near 0.14,
we can reasonably conclude that the normal distribution is a good
model for the data. If the maximum value is less than 0.14, a
long-tailed distribution such as the double exponential or logistic
would be a better choice. If the maximum value is near -1, this implies
the selection of a very long-tailed distribution, such as the Cauchy. If
the maximum value is greater than 0.14, this implies a short-tailed
distribution such as the Beta or uniform.

The Tukey-Lambda PPCC plot is used to suggest an appropriate
distribution. You should follow up with PPCC and probability plots of
the appropriate alternatives.

Use Judgement When Selecting An Appropriate Distributional Family
When comparing distributional models, do not simply choose the one
with the maximum PPCC value. In many cases, several distributional
fits provide comparable PPCC values. For example, a lognormal and
Weibull may both fit a given set of reliability data quite well.
Typically, we would consider the complexity of the distribution. That
is, a simpler distribution with a marginally smaller PPCC value may
be preferred over a more complex distribution. Likewise, there may be
theoretical justification in terms of the underlying scientific model for
preferring a distribution with a marginally smaller PPCC value in
some cases. In other cases, we may not need to know if the
distributional model is optimal, only that it is adequate for our
purposes. That is, we may be able to use techniques designed for
normally distributed data even if other distributions fit the data
somewhat better.

Sample Plot
The following is a PPCC plot of 100 normal random numbers. The
maximum value of the correlation coefficient = 0.997 at λ = 0.099.

This PPCC plot shows that:
1. the best-fit symmetric distribution is nearly normal;
2. the data are not long tailed;
3. the sample mean would be an appropriate estimator of location.

We can follow up this PPCC plot with a normal probability plot to
verify the normality model for the data.

Definition:
The PPCC plot is formed by:
● Vertical axis: Probability plot correlation coefficient;
● Horizontal axis: Value of shape parameter.

Questions
The PPCC plot answers the following questions:
1. What is the best-fit member within a distributional family?
2. Does the best-fit member provide a good fit (in terms of
   generating a probability plot with a high correlation
   coefficient)?
3. Does this distributional family provide a good fit compared to
   other distributions?
4. How sensitive is the choice of the shape parameter?

Importance
Many statistical analyses are based on distributional assumptions
about the population from which the data have been obtained.
However, distributional families can have radically different shapes
depending on the value of the shape parameter. Therefore, finding a
reasonable choice for the shape parameter is a necessary step in the
analysis. In many analyses, finding a good distributional model for the
data is the primary focus of the analysis. In both of these cases, the
PPCC plot is a valuable tool.

Related Techniques
Probability Plot
Maximum Likelihood Estimation
Least Squares Estimation
Method of Moments Estimation

Case Study
The PPCC plot is demonstrated in the airplane glass failure data case
study.

Software
PPCC plots are currently not available in most common general
purpose statistical software programs. However, the underlying
technique is based on probability plots and correlation coefficients, so
it should be possible to write macros for PPCC plots in statistical
programs that support these capabilities. Dataplot supports PPCC
plots.

1.3.3.24. Quantile-Quantile Plot

Purpose: Check If Two Data Sets Can Be Fit With the Same Distribution
The quantile-quantile (q-q) plot is a graphical technique for determining
if two data sets come from populations with a common distribution.

A q-q plot is a plot of the quantiles of the first data set against the
quantiles of the second data set. By a quantile, we mean the fraction (or
percent) of points below the given value. That is, the 0.3 (or 30%)
quantile is the point at which 30% of the data fall below and
70% fall above that value.

A 45-degree reference line is also plotted. If the two sets come from a
population with the same distribution, the points should fall
approximately along this reference line. The greater the departure from
this reference line, the greater the evidence for the conclusion that the
two data sets have come from populations with different distributions.

The advantages of the q-q plot are:
1. The sample sizes do not need to be equal.
2. Many distributional aspects can be simultaneously tested. For
   example, shifts in location, shifts in scale, changes in symmetry,
   and the presence of outliers can all be detected from this plot. For
   example, if the two data sets come from populations whose
   distributions differ only by a shift in location, the points should lie
   along a straight line that is displaced either up or down from the
   45-degree reference line.

The q-q plot is similar to a probability plot. For a probability plot, the
quantiles for one of the data samples are replaced with the quantiles of a
theoretical distribution.

Sample Plot

This q-q plot shows that
1. These 2 batches do not appear to have come from populations
   with a common distribution.
2. The batch 1 values are significantly higher than the corresponding
   batch 2 values.
3. The differences are increasing from values 525 to 625. Then the
   values for the 2 batches get closer again.

Definition: Quantiles for Data Set 1 Versus Quantiles of Data Set 2
The q-q plot is formed by:
● Vertical axis: Estimated quantiles from data set 1
● Horizontal axis: Estimated quantiles from data set 2

Both axes are in units of their respective data sets. That is, the actual
quantile level is not plotted. For a given point on the q-q plot, we know
that the quantile level is the same for both points, but not what that
quantile level actually is.

If the data sets have the same size, the q-q plot is essentially a plot of
sorted data set 1 against sorted data set 2. If the data sets are not of equal
size, the quantiles are usually picked to correspond to the sorted values
from the smaller data set and then the quantiles for the larger data set are
interpolated.

Questions
The q-q plot is used to answer the following questions:
● Do two data sets come from populations with a common
  distribution?
● Do two data sets have common location and scale?
● Do two data sets have similar distributional shapes?
● Do two data sets have similar tail behavior?

Importance: Check for Common Distribution
When there are two data samples, it is often desirable to know if the
assumption of a common distribution is justified. If so, then location and
scale estimators can pool both data sets to obtain estimates of the
common location and scale. If two samples do differ, it is also useful to
gain some understanding of the differences. The q-q plot can provide
more insight into the nature of the difference than analytical methods
such as the chi-square and Kolmogorov-Smirnov 2-sample tests.

Related Techniques
Bihistogram
T Test
F Test
2-Sample Chi-Square Test
2-Sample Kolmogorov-Smirnov Test

Case Study
The quantile-quantile plot is demonstrated in the ceramic strength data
case study.

Software
Q-Q plots are available in some general purpose statistical software
programs, including Dataplot. If the number of data points in the two
samples is equal, it should be relatively easy to write a macro in
statistical programs that do not support the q-q plot. If the number of
points is not equal, writing a macro for a q-q plot may be difficult.
http://www.itl.nist.gov/div898/handbook/eda/section3/eda33o.htm (2 of 3) [11/13/2003 5:32:14 PM] http://www.itl.nist.gov/div898/handbook/eda/section3/eda33o.htm (3 of 3) [11/13/2003 5:32:14 PM]


1.3.3.25. Run-Sequence Plot 1.3.3.25. Run-Sequence Plot

Definition: Run sequence plots are formed by:


y(i) Versus i ● Vertical axis: Response variable Y(i)

● Horizontal axis: Index i (i = 1, 2, 3, ... )

Questions The run sequence plot can be used to answer the following questions
1. Exploratory Data Analysis 1. Are there any shifts in location?
1.3. EDA Techniques
1.3.3. Graphical Techniques: Alphabetic 2. Are there any shifts in variation?
3. Are there any outliers?
The run sequence plot can also give the analyst an excellent feel for the
1.3.3.25. Run-Sequence Plot data.

Purpose: Run sequence plots (Chambers 1983) are an easy way to graphically Importance: For univariate data, the default model is
Check for summarize a univariate data set. A common assumption of univariate Check Y = constant + error
Shifts in data sets is that they behave like: Univariate
Location where the error is assumed to be random, from a fixed distribution, and
1. random drawings; Assumptions
and Scale with constant location and scale. The validity of this model depends on
and Outliers 2. from a fixed distribution; the validity of these assumptions. The run sequence plot is useful for
3. with a common location; and checking for constant location and scale.
4. with a common scale. Even for more complex models, the assumptions on the error term are
With run sequence plots, shifts in location and scale are typically quite still often the same. That is, a run sequence plot of the residuals (even
evident. Also, outliers can easily be detected. from very complex models) is still vital for checking for outliers and for
detecting shifts in location and scale.
Sample
Plot: Related Scatter Plot
Last Third Techniques Histogram
of Data Autocorrelation Plot
Shows a Lag Plot
Shift of
Location Case Study The run sequence plot is demonstrated in the Filter transmittance data
case study.

Software Run sequence plots are available in most general purpose statistical
software programs, including Dataplot.

This sample run sequence plot shows that the location shifts up for the
last third of the data.

1.3.3.26. Scatter Plot

Purpose: Check for Relationship
A scatter plot (Chambers 1983) reveals relationships or association
between two variables. Such relationships manifest themselves by any
non-random structure in the plot. Various common types of patterns are
demonstrated in the examples.

Sample Plot: Linear Relationship Between Variables Y and X

This sample plot reveals a linear relationship between the two variables
indicating that a linear regression model might be appropriate.

Definition: Y Versus X
A scatter plot is a plot of the values of Y versus the corresponding
values of X:
● Vertical axis: variable Y--usually the response variable
● Horizontal axis: variable X--usually some variable we suspect
  may be related to the response

Questions
Scatter plots can provide answers to the following questions:
1. Are variables X and Y related?
2. Are variables X and Y linearly related?
3. Are variables X and Y non-linearly related?
4. Does the variation in Y change depending on X?
5. Are there outliers?

Examples
1. No relationship
2. Strong linear (positive correlation)
3. Strong linear (negative correlation)
4. Exact linear (positive correlation)
5. Quadratic relationship
6. Exponential relationship
7. Sinusoidal relationship (damped)
8. Variation of Y doesn't depend on X (homoscedastic)
9. Variation of Y does depend on X (heteroscedastic)
10. Outlier

Combining Scatter Plots
Scatter plots can also be combined in multiple plots per page to help
understand higher-level structure in data sets with more than two
variables.

The scatterplot matrix generates all pairwise scatter plots on a single
page. The conditioning plot, also called a co-plot or subset plot,
generates scatter plots of Y versus X dependent on the value of a third
variable.

Causality Is Not Proved By Association
The scatter plot uncovers relationships in data. "Relationships" means
that there is some structured association (linear, quadratic, etc.) between
X and Y. Note, however, that even though

    causality implies association
    association does NOT imply causality.

Scatter plots are a useful diagnostic tool for determining association, but
if such association exists, the plot may or may not suggest an underlying
cause-and-effect mechanism. A scatter plot can never "prove" cause and
effect--it is ultimately only the researcher (relying on the underlying
science/engineering) who can conclude that causality actually exists.

Appearance
The most popular rendition of a scatter plot is
1. some plot character (e.g., X) at the data points, and
2. no line connecting data points.

Other scatter plot format variants include
1. an optional plot character (e.g., X) at the data points, but
2. a solid line connecting data points.

In both cases, the resulting plot is referred to as a scatter plot, although
the former (discrete and disconnected) is the author's personal
preference since nothing makes it onto the screen except the data--there
are no interpolative artifacts to bias the interpretation.

Related Techniques
Run Sequence Plot
Box Plot
Block Plot

Case Study
The scatter plot is demonstrated in the load cell calibration data case
study.

Software
Scatter plots are a fundamental technique that should be available in any
general purpose statistical software program, including Dataplot. Scatter
plots are also available in most graphics and spreadsheet programs as
well.

1.3.3.26.1. Scatter Plot: No Relationship

Scatter Plot with No Relationship

Discussion
Note in the plot above how for a given value of X (say X = 0.5), the
corresponding values of Y range all over the place from Y = -2 to Y = +2.
The same is true for other values of X. This lack of predictability in
determining Y from a given value of X, and the associated amorphous,
non-structured appearance of the scatter plot, leads to the summary
conclusion: no relationship.

1.3.3.26.2. Scatter Plot: Strong Linear (positive correlation) Relationship

Scatter Plot Showing Strong Positive Linear Correlation

Discussion
Note in the plot above how a straight line comfortably fits through the
data; hence a linear relationship exists. The scatter about the line is quite
small, so there is a strong linear relationship. The slope of the line is
positive (small values of X correspond to small values of Y; large values
of X correspond to large values of Y), so there is a positive co-relation
(that is, a positive correlation) between X and Y.

1.3.3.26.3. Scatter Plot: Strong Linear (negative correlation) Relationship

Scatter Plot Showing a Strong Negative Correlation

Discussion
Note in the plot above how a straight line comfortably fits through the
data; hence there is a linear relationship. The scatter about the line is
quite small, so there is a strong linear relationship. The slope of the line
is negative (small values of X correspond to large values of Y; large
values of X correspond to small values of Y), so there is a negative
co-relation (that is, a negative correlation) between X and Y.

1.3.3.26.4. Scatter Plot: Exact Linear (positive correlation) Relationship

Scatter Plot Showing an Exact Linear Relationship

Discussion
Note in the plot above how a straight line comfortably fits through the
data; hence there is a linear relationship. The scatter about the line is
zero--there is perfect predictability between X and Y--so there is an
exact linear relationship. The slope of the line is positive (small values
of X correspond to small values of Y; large values of X correspond to
large values of Y), so there is a positive co-relation (that is, a positive
correlation) between X and Y.

1.3.3.26.5. Scatter Plot: Quadratic Relationship

Scatter Plot Showing Quadratic Relationship

Discussion
Note in the plot above how no imaginable simple straight line could
ever adequately describe the relationship between X and Y--a curved (or
curvilinear, or non-linear) function is needed. The simplest such
curvilinear function is a quadratic model

    Y = A + B*X + C*X**2

for some A, B, and C. Many other curvilinear functions are possible, but
the data analysis principle of parsimony suggests that we try fitting a
quadratic function first.
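As a brief illustration of fitting such a quadratic model (a sketch, assuming NumPy; the data are hypothetical), note that np.polyfit returns the coefficients from highest to lowest power, so they unpack as C, B, A:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 100)
y = 1.0 + 0.5 * x + 0.2 * x**2 + rng.normal(scale=0.5, size=x.size)  # hypothetical data

C, B, A = np.polyfit(x, y, 2)          # least-squares fit of Y = A + B*X + C*X**2
print(f"A ~ {A:.2f}, B ~ {B:.2f}, C ~ {C:.2f}")
```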

1.3.3.26.6. Scatter Plot: Exponential Relationship

Scatter Plot Showing Exponential Relationship

Discussion
Note that a simple straight line is grossly inadequate in describing the
relationship between X and Y. A quadratic model would prove lacking,
especially for large values of X. In this example, the large values of X
correspond to nearly constant values of Y, and so a non-linear function
beyond the quadratic is needed. Among the many other non-linear
functions available, one of the simpler ones is an exponential model of
the form

    Y = A + B*exp(C*X)

for some A, B, and C. In this case, an exponential function would, in
fact, fit well, and so one is led to the summary conclusion of an
exponential relationship.

1.3.3.26.7. Scatter Plot: Sinusoidal Relationship (damped)

Scatter Plot Showing a Sinusoidal Relationship

Discussion
The complex relationship between X and Y appears to be basically
oscillatory, and so one is naturally drawn to a trigonometric sinusoidal
model.

Closer inspection of the scatter plot reveals that the amount of swing
(the amplitude in the model) does not appear to be constant but rather
is decreasing (damping) as X gets large. We thus would be led to the
conclusion: damped sinusoidal relationship, with the simplest
corresponding model being a sinusoid whose amplitude decays as X
increases.

1.3.3.26.8. Scatter Plot: Variation of Y Does Not Depend on X (homoscedastic)

Scatter Plot Showing Homoscedastic Variability

Discussion
This scatter plot reveals a linear relationship between X and Y: for a
given value of X, the predicted value of Y will fall on a line. The plot
further reveals that the variation in Y about the predicted value is
about the same (+- 10 units), regardless of the value of X.
Statistically, this is referred to as homoscedasticity. Such
homoscedasticity is very important as it is an underlying assumption
for regression, and its violation leads to parameter estimates with
inflated variances. If the data are homoscedastic, then the usual
regression estimates can be used. If the data are not homoscedastic,
then the estimates can be improved using weighting procedures as
shown in the next example.

1.3.3.26.9. Scatter Plot: Variation of Y Does Depend on X (heteroscedastic)

Scatter Plot Showing Heteroscedastic Variability

Discussion
This scatter plot reveals an approximate linear relationship between
X and Y, but more importantly, it reveals a statistical condition
referred to as heteroscedasticity (that is, nonconstant variation in Y
over the values of X). For a heteroscedastic data set, the variation in
Y differs depending on the value of X. In this example, small values
of X yield small scatter in Y while large values of X result in large
scatter in Y.

Heteroscedasticity complicates the analysis somewhat, but its effects
can be overcome by:
1. proper weighting of the data with noisier data being weighted
   less, or by
2. performing a Y variable transformation to achieve
   homoscedasticity. The Box-Cox normality plot can help
   determine a suitable transformation.

Impact of Ignoring Unequal Variability in the Data
Fortunately, unweighted regression analyses on heteroscedastic data
produce estimates of the coefficients that are unbiased. However, the
coefficients will not be as precise as they would be with proper
weighting.

Note further that if heteroscedasticity does exist, it is frequently
useful to plot and model the local variation of Y as a function of X.
This modeling has two advantages:
1. it provides additional insight and understanding as to how the
   response Y relates to X; and
2. it provides a convenient means of forming weights for a
   weighted regression, by using the reciprocal of the squared
   local variation as the weight for each point.

The topic of non-constant variation is discussed in some detail in the
process modeling chapter.
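A sketch of the weighting idea under simple assumptions (NumPy only, hypothetical data): the local scatter is assumed to grow proportionally with X, and the resulting inverse-variance weights are used in a weighted straight-line fit. This is one common weighting scheme, not the only possibility.

```python
import numpy as np

rng = np.random.default_rng(8)
x = np.linspace(1, 10, 200)
y = 2.0 + 0.5 * x + rng.normal(scale=0.1 * x)   # scatter grows with x (hypothetical data)

# Assume the local standard deviation of Y is proportional to X; the usual
# weighted-regression weight is then 1 / sd(x)**2.  np.polyfit expects weights
# of the form 1/sd rather than 1/sd**2, so pass the reciprocal standard deviation.
sd_model = 0.1 * x
slope_u, intercept_u = np.polyfit(x, y, 1)
slope_w, intercept_w = np.polyfit(x, y, 1, w=1.0 / sd_model)

print(f"unweighted fit: intercept {intercept_u:.3f}, slope {slope_u:.3f}")
print(f"weighted fit:   intercept {intercept_w:.3f}, slope {slope_w:.3f}")
```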

1.3.3.26.10. Scatter Plot: Outlier

Scatter Plot Showing Outliers

Discussion
The scatter plot here reveals
1. a basic linear relationship between X and Y for most of the data,
   and
2. a single outlier (at X = 375).

An outlier is defined as a data point that emanates from a different
model than do the rest of the data. The data here appear to come from a
linear model with a given slope and variation except for the outlier,
which appears to have been generated from some other model.

Outlier detection is important for effective modeling. Outliers should be
excluded from such model fitting. If all the data here are included in a
linear regression, then the fitted model will be poor virtually
everywhere. If the outlier is omitted from the fitting process, then the
resulting fit will be excellent almost everywhere (for all points except
the outlying point).

1.3.3.26.11. Scatterplot Matrix

Purpose: Check Pairwise Relationships Between Variables
Given a set of variables X1, X2, ... , Xk, the scatterplot matrix contains
all the pairwise scatter plots of the variables on a single page in a
matrix format. That is, if there are k variables, the scatterplot matrix
will have k rows and k columns and the ith row and jth column of this
matrix is a plot of Xi versus Xj.

Although the basic concept of the scatterplot matrix is simple, there are
numerous alternatives in the details of the plots.
1. The diagonal plot is simply a 45-degree line since we are plotting
   Xi versus Xi. Although this has some usefulness in terms of
   showing the univariate distribution of the variable, other
   alternatives are common. Some users prefer to use the diagonal
   to print the variable label. Another alternative is to plot the
   univariate histogram on the diagonal. Alternatively, we could
   simply leave the diagonal blank.
2. Since Xi versus Xj is equivalent to Xj versus Xi with the axes
   reversed, some prefer to omit the plots below the diagonal.
3. It can be helpful to overlay some type of fitted curve on the
   scatter plot. Although a linear or quadratic fit can be used, the
   most common alternative is to overlay a lowess curve.
4. Due to the potentially large number of plots, it can be somewhat
   tricky to provide the axes labels in a way that is both informative
   and visually pleasing. One alternative that seems to work well is
   to provide axis labels on alternating rows and columns. That is,
   row one will have tic marks and axis labels on the left vertical
   axis for the first plot only while row two will have the tic marks
   and axis labels for the right vertical axis for the last plot in the
   row only. This alternating pattern continues for the remaining
   rows. A similar pattern is used for the columns and the horizontal
   axes labels. Another alternative is to put the minimum and
   maximum scale value in the diagonal plot with the variable
   name.
5. Some analysts prefer to connect the scatter plots. Others prefer to
   leave a little gap between each plot.
6. Although this plot type is most commonly used for scatter plots,
   the basic concept is both simple and powerful and extends easily
   to other plot formats that involve pairwise plots such as the
   quantile-quantile plot and the bihistogram.

Sample Plot

This sample plot was generated from pollution data collected by NIST
chemist Lloyd Currie.

There are a number of ways to view this plot. If we are primarily
interested in a particular variable, we can scan the row and column for
that variable. If we are interested in finding the strongest relationship,
we can scan all the plots and then determine which variables are
related.

Definition
Given k variables, scatter plot matrices are formed by creating k rows
and k columns. Each row and column defines a single scatter plot.
The individual plot for row i and column j is defined as
● Vertical axis: Variable Xi
● Horizontal axis: Variable Xj
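A small sketch of the basic layout, assuming Matplotlib (pandas' scatter_matrix is another option); the variables and data here are hypothetical, and the diagonal is used for the variable label, which is one of the choices listed above.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(9)
# Hypothetical data: three variables, two of them related.
x1 = rng.normal(size=100)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=100)
x3 = rng.normal(size=100)
data = {"X1": x1, "X2": x2, "X3": x3}

names = list(data)
k = len(names)
fig, axes = plt.subplots(k, k, figsize=(6, 6))
for i, ni in enumerate(names):
    for j, nj in enumerate(names):
        ax = axes[i, j]
        if i == j:
            # one common choice: print the variable label on the diagonal
            ax.text(0.5, 0.5, ni, ha="center", va="center")
            ax.set_xticks([])
            ax.set_yticks([])
        else:
            ax.plot(data[nj], data[ni], ".", markersize=3)
plt.tight_layout()
plt.show()
```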

Questions
The scatterplot matrix can provide answers to the following questions:
1. Are there pairwise relationships between the variables?
2. If there are relationships, what is the nature of these
   relationships?
3. Are there outliers in the data?
4. Is there clustering by groups in the data?

Linking and Brushing
The scatterplot matrix serves as the foundation for the concepts of
linking and brushing.

By linking, we mean showing how a point, or set of points, behaves in
each of the plots. This is accomplished by highlighting these points in
some fashion. For example, the highlighted points could be drawn as a
filled circle while the remaining points could be drawn as unfilled
circles. A typical application of this would be to show how an outlier
shows up in each of the individual pairwise plots. Brushing extends this
concept a bit further. In brushing, the points to be highlighted are
interactively selected by a mouse and the scatterplot matrix is
dynamically updated (ideally in real time). That is, we can select a
rectangular region of points in one plot and see how those points are
reflected in the other plots. Brushing is discussed in detail by Becker,
Cleveland, and Wilks in the paper "Dynamic Graphics for Data
Analysis" (Cleveland and McGill, 1988).

Related Techniques
Star plot
Scatter plot
Conditioning plot
Locally weighted least squares

Software
Scatterplot matrices are becoming increasingly common in general
purpose statistical software programs, including Dataplot. If a software
program does not generate scatterplot matrices, but it does provide
multiple plots per page and scatter plots, it should be possible to write a
macro to generate a scatterplot matrix. Brushing is available in a few of
the general purpose statistical software programs that emphasize
graphical approaches.

1.3.3.26.12. Conditioning Plot

Purpose: Check Pairwise Relationship Between Two Variables Conditional on a Third Variable
A conditioning plot, also known as a coplot or subset plot, is a plot of
two variables conditional on the value of a third variable (called the
conditioning variable). The conditioning variable may be either a
variable that takes on only a few discrete values or a continuous variable
that is divided into a limited number of subsets.

One limitation of the scatterplot matrix is that it cannot show interaction
effects with another variable. This is the strength of the conditioning
plot. It is also useful for displaying scatter plots for groups in the data.
Although these groups can also be plotted on a single plot with different
plot symbols, it can often be visually easier to distinguish the groups
using the conditioning plot.

Although the basic concept of the conditioning plot matrix is simple,
there are numerous alternatives in the details of the plots.
1. It can be helpful to overlay some type of fitted curve on the
   scatter plot. Although a linear or quadratic fit can be used, the
   most common alternative is to overlay a lowess curve.
2. Due to the potentially large number of plots, it can be somewhat
   tricky to provide the axis labels in a way that is both informative
   and visually pleasing. One alternative that seems to work well is
   to provide axis labels on alternating rows and columns. That is,
   row one will have tic marks and axis labels on the left vertical
   axis for the first plot only while row two will have the tic marks
   and axis labels for the right vertical axis for the last plot in the
   row only. This alternating pattern continues for the remaining
   rows. A similar pattern is used for the columns and the horizontal
   axis labels. Note that this approach only works if the axes limits
   are fixed to common values for all of the plots.
3. Some analysts prefer to connect the scatter plots. Others prefer to
   leave a little gap between each plot. Alternatively, each plot can
   have its own labeling with the plots not connected.

http://www.itl.nist.gov/div898/handbook/eda/section3/eda33qb.htm (3 of 3) [11/13/2003 5:32:23 PM] http://www.itl.nist.gov/div898/handbook/eda/section3/eda33qc.htm (1 of 3) [11/13/2003 5:32:23 PM]


1.3.3.26.12. Conditioning Plot 1.3.3.26.12. Conditioning Plot

4. Although this plot type is most commonly used for scatter plots,
the basic concept is both simple and powerful and extends easily Questions The conditioning plot can provide answers to the following questions:
to other plot formats.
1. Is there a relationship between two variables?
Sample Plot 2. If there is a relationship, does the nature of the relationship
depend on the value of a third variable?
3. Are groups in the data similar?
4. Are there outliers in the data?

Related Scatter plot


Techniques Scatterplot matrix
Locally weighted least squares

Software Scatter plot matrices are becoming increasingly common in general


purpose statistical software programs, including Dataplot. If a software
program does not generate conditioning plots, but it does provide
multiple plots per page and scatter plots, it should be possible to write a
macro to generate a conditioning plot.

In this case, temperature has six distinct values. We plot torque versus
time for each of these temperatures. This example is discussed in more
detail in the process modeling chapter.

Definition Given the variables X, Y, and Z, the conditioning plot is formed by


dividing the values of Z into k groups. There are several ways that these
groups may be formed. There may be a natural grouping of the data, the
data may be divided into several equal sized groups, the grouping may
be determined by clusters in the data, and so on. The page will be
divided into n rows and c columns where . Each row and
column defines a single scatter plot.
The individual plot for row i and column j is defined as
● Vertical axis: Variable Y

● Horizontal axis: Variable X

where only the points in the group corresponding to the ith row and jth
column are used.

http://www.itl.nist.gov/div898/handbook/eda/section3/eda33qc.htm (2 of 3) [11/13/2003 5:32:23 PM] http://www.itl.nist.gov/div898/handbook/eda/section3/eda33qc.htm (3 of 3) [11/13/2003 5:32:23 PM]
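As a concrete illustration of the definition above, the following Python sketch (not part of the original handbook) builds a simple conditioning plot with numpy and matplotlib. The torque, time, and temperature variable names echo the sample plot, but the data here are simulated, not the handbook's data set.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
temperature = np.repeat([25, 35, 45, 55, 65, 75], 50)             # conditioning variable Z (k = 6 groups)
time = np.tile(np.linspace(0, 10, 50), 6)                          # X
torque = 2.0 + 0.05 * temperature * time + rng.normal(0, 1, 300)   # Y

groups = np.unique(temperature)
n_rows, n_cols = 2, 3                                              # n x c >= k panels
fig, axes = plt.subplots(n_rows, n_cols, sharex=True, sharey=True, figsize=(9, 6))
for ax, z in zip(axes.ravel(), groups):
    mask = temperature == z
    ax.plot(time[mask], torque[mask], "o", markersize=3)           # Y versus X within one group of Z
    ax.set_title("temperature = %g" % z)
for ax in axes[-1]:
    ax.set_xlabel("time (X)")
for ax in axes[:, 0]:
    ax.set_ylabel("torque (Y)")
plt.tight_layout()
plt.show()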


1.3.3.27. Spectral Plot

Purpose: Examine Cyclic Structure
A spectral plot (Jenkins and Watts 1968 or Bloomfield 1976) is a graphical technique for examining cyclic structure in the frequency domain. It is a smoothed Fourier transform of the autocovariance function.
The frequency is measured in cycles per unit time where unit time is defined to be the distance between 2 points. A frequency of 0 corresponds to an infinite cycle while a frequency of 0.5 corresponds to a cycle of 2 data points. Equi-spaced time series are inherently limited to detecting frequencies between 0 and 0.5.
Trends should typically be removed from the time series before applying the spectral plot. Trends can be detected from a run sequence plot. Trends are typically removed by differencing the series or by fitting a straight line (or some other polynomial curve) and applying the spectral analysis to the residuals.
Spectral plots are often used to find a starting value for the frequency, ω, in the sinusoidal model
    Yi = C + α·sin(2πωti + φ) + Ei
See the beam deflection case study for an example of this.

Sample Plot   This spectral plot shows one dominant frequency of approximately 0.3 cycles per observation.

Definition: Variance Versus Frequency
The spectral plot is formed by:
● Vertical axis: Smoothed variance (power)
● Horizontal axis: Frequency (cycles per observation)
The computations for generating the smoothed variances can be involved and are not discussed further here. The details can be found in the Jenkins and Bloomfield references and in most texts that discuss the frequency analysis of time series.

Questions   The spectral plot can be used to answer the following questions:
1. How many cyclic components are there?
2. Is there a dominant cyclic frequency?
3. If there is a dominant cyclic frequency, what is it?

Importance: Check Cyclic Behavior of Time Series
The spectral plot is the primary technique for assessing the cyclic nature of univariate time series in the frequency domain. It is almost always the second plot (after a run sequence plot) generated in a frequency domain analysis of a time series.
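For readers working outside Dataplot, the following Python sketch (an assumption, not the handbook's computation) produces a smoothed periodogram, one common implementation of the spectral plot. The series is simulated with a dominant frequency near 0.3 cycles per observation, and a straight-line trend is removed before the spectral analysis, as recommended above.

import numpy as np
import matplotlib.pyplot as plt
from scipy import signal

rng = np.random.default_rng(1)
n = 200
t = np.arange(n)
y = 2.0 * np.sin(2 * np.pi * 0.3 * t) + rng.normal(0, 1, n)    # dominant frequency ~0.3

# remove a straight-line trend, then compute a smoothed (Welch) periodogram
freq, power = signal.welch(signal.detrend(y), fs=1.0, nperseg=64)

plt.plot(freq, power)
plt.xlabel("Frequency (cycles per observation)")
plt.ylabel("Smoothed variance (power)")
plt.title("Spectral plot")
plt.show()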


Examples   1. Random (= White Noise)
2. Strong autocorrelation and autoregressive model
3. Sinusoidal model

Related Techniques   Autocorrelation Plot
Complex Demodulation Amplitude Plot
Complex Demodulation Phase Plot

Case Study   The spectral plot is demonstrated in the beam deflection data case study.

Software   Spectral plots are a fundamental technique in the frequency analysis of time series. They are available in many general purpose statistical software programs, including Dataplot.

1.3.3.27.1. Spectral Plot: Random Data

Spectral Plot of 200 Normal Random Numbers

Conclusions   We can make the following conclusions from the above plot.
1. There are no dominant peaks.
2. There is no identifiable pattern in the spectrum.
3. The data are random.

Discussion   For random data, the spectral plot should show no dominant peaks or distinct pattern in the spectrum. For the sample plot above, there are no clearly dominant peaks and the peaks seem to fluctuate at random. This type of appearance of the spectral plot indicates that there are no significant cyclic patterns in the data.


1.3.3.27.2. Spectral Plot: Strong Autocorrelation and Autoregressive Model

Spectral Plot for Random Walk Data

Conclusions   We can make the following conclusions from the above plot.
1. Strong dominant peak near zero.
2. Peak decays rapidly towards zero.
3. An autoregressive model is an appropriate model.


Discussion   This spectral plot starts with a dominant peak near zero and rapidly decays to zero. This is the spectral plot signature of a process with strong positive autocorrelation. Such processes are highly non-random in that there is high association between an observation and a succeeding observation. In short, if you know Yi you can make a strong guess as to what Yi+1 will be.

Recommended Next Step   The next step would be to determine the parameters for the autoregressive model:
    Yi = A0 + A1·Yi-1 + Ei
Such estimation can be done by linear regression or by fitting a Box-Jenkins autoregressive (AR) model.
The residual standard deviation for this autoregressive model will be much smaller than the residual standard deviation for the default model
    Yi = A0 + Ei
Then the system should be reexamined to find an explanation for the strong autocorrelation. Is it due to the
1. phenomenon under study; or
2. drifting in the environment; or
3. contamination from the data acquisition system (DAS)?
Oftentimes the source of the problem is item (3) above where contamination and carry-over from the data acquisition system result because the DAS does not have time to electronically recover before collecting the next data point. If this is the case, then consider slowing down the sampling rate to re-achieve randomness.

1.3.3.27.3. Spectral Plot: Sinusoidal Model

Spectral Plot for Sinusoidal Model

Conclusions   We can make the following conclusions from the above plot.
1. There is a single dominant peak at approximately 0.3.
2. There is an underlying single-cycle sinusoidal model.


Discussion   This spectral plot shows a single dominant frequency. This indicates that a single-cycle sinusoidal model might be appropriate.
If one were to naively assume that the data represented by the graph could be fit by the model
    Yi = C + Ei
and then estimate the constant by the sample mean, the analysis would be incorrect because
● the sample mean is biased;
● the confidence interval for the mean, which is valid only for random data, is meaningless and too small.
On the other hand, the choice of the proper model
    Yi = C + α·sin(2πωti + φ) + Ei
where α is the amplitude, ω is the frequency (between 0 and 0.5 cycles per observation), and φ is the phase, can be fit by non-linear least squares. The beam deflection data case study demonstrates fitting this type of model.

Recommended Next Steps   The recommended next steps are to:
1. Estimate the frequency from the spectral plot. This will be helpful as a starting value for the subsequent non-linear fitting. A complex demodulation phase plot can be used to fine tune the estimate of the frequency before performing the non-linear fit.
2. Do a complex demodulation amplitude plot to obtain an initial estimate of the amplitude and to determine if a constant amplitude is justified.
3. Carry out a non-linear fit of the model
    Yi = C + α·sin(2πωti + φ) + Ei

1.3.3.28. Standard Deviation Plot

Purpose: Detect Changes in Scale Between Groups
Standard deviation plots are used to see if the standard deviation varies between different groups of the data. The grouping is determined by the analyst. In most cases, the data provide a specific grouping variable. For example, the groups may be the levels of a factor variable. In the sample plot below, the months of the year provide the grouping.
Standard deviation plots can be used with ungrouped data to determine if the standard deviation is changing over time. In this case, the data are broken into an arbitrary number of equal-sized groups. For example, a data series with 400 points can be divided into 10 groups of 40 points each. A standard deviation plot can then be generated with these groups to see if the standard deviation is increasing or decreasing over time.
Although the standard deviation is the most commonly used measure of scale, the same concept applies to other measures of scale. For example, instead of plotting the standard deviation of each group, the median absolute deviation or the average absolute deviation might be plotted instead. This might be done if there were significant outliers in the data and a more robust measure of scale than the standard deviation was desired.
Standard deviation plots are typically used in conjunction with mean plots. The mean plot would be used to check for shifts in location while the standard deviation plot would be used to check for shifts in scale.


Sample Plot
This sample standard deviation plot shows
1. there is a shift in variation;
2. greatest variation is during the summer months.

Definition: Group Standard Deviations Versus Group ID
Standard deviation plots are formed by:
● Vertical axis: Group standard deviations
● Horizontal axis: Group identifier
A reference line is plotted at the overall standard deviation.

Questions   The standard deviation plot can be used to answer the following questions.
1. Are there any shifts in variation?
2. What is the magnitude of the shifts in variation?
3. Is there a distinct pattern in the shifts in variation?

Importance: Checking Assumptions
A common assumption in 1-factor analyses is that of equal variances. That is, the variance is the same for different levels of the factor variable. The standard deviation plot provides a graphical check for that assumption. A common assumption for univariate data is that the variance is constant. By grouping the data into equi-sized intervals, the standard deviation plot can provide a graphical test of this assumption.

Related Techniques   Mean Plot
DEX Standard Deviation Plot

Software   Most general purpose statistical software programs do not support a standard deviation plot. However, if the statistical program can generate the standard deviation for a group, it should be feasible to write a macro to generate this plot. Dataplot supports a standard deviation plot.
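Along the lines of the Software note above, the following Python sketch (an assumption, not the handbook's macro) computes group standard deviations and draws them against a reference line at the overall standard deviation. The monthly grouping mirrors the sample plot; the data are simulated with larger spread in the summer months.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(4)
months = np.arange(1, 13)
# 30 simulated observations per month, with more variation near July
data = {m: rng.normal(10, 1.0 + 1.5 * np.exp(-((m - 7) ** 2) / 8), 30) for m in months}

group_sd = [data[m].std(ddof=1) for m in months]
overall_sd = np.concatenate(list(data.values())).std(ddof=1)

plt.plot(months, group_sd, "o-")
plt.axhline(overall_sd, linestyle="--", label="overall standard deviation")
plt.xlabel("Month (group identifier)")
plt.ylabel("Group standard deviation")
plt.legend()
plt.show()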


1.3.3.29. Star Plot

Purpose: Display Multivariate Data
The star plot (Chambers 1983) is a method of displaying multivariate data. Each star represents a single observation. Typically, star plots are generated in a multi-plot format with many stars on each page and each star representing one observation.
Star plots are used to examine the relative values for a single data point (e.g., point 3 is large for variables 2 and 4, small for variables 1, 3, 5, and 6) and to locate similar points or dissimilar points.

Sample Plot   The plot below contains the star plots of 16 cars. The data file actually contains 74 cars, but we restrict the plot to what can reasonably be shown on one page. The variable list for the sample star plot is
1 Price
2 Mileage (MPG)
3 1978 Repair Record (1 = Worst, 5 = Best)
4 1977 Repair Record (1 = Worst, 5 = Best)
5 Headroom
6 Rear Seat Room
7 Trunk Space
8 Weight
9 Length
We can look at these plots individually or we can use them to identify clusters of cars with similar features. For example, we can look at the star plot of the Cadillac Seville and see that it is one of the most expensive cars, gets below average (but not among the worst) gas mileage, has an average repair record, and has average-to-above-average roominess and size. We can then compare the Cadillac models (the last three plots) with the AMC models (the first three plots). This comparison shows distinct patterns. The AMC models tend to be inexpensive, have below average gas mileage, and are small in both height and weight and in roominess. The Cadillac models are expensive, have poor gas mileage, and are large in both size and roominess.

Definition   The star plot consists of a sequence of equi-angular spokes, called radii, with each spoke representing one of the variables. The data length of a spoke is proportional to the magnitude of the variable for the data point relative to the maximum magnitude of the variable across all data points. A line is drawn connecting the data values for each spoke. This gives the plot a star-like appearance and the origin of the name of this plot.

Questions   The star plot can be used to answer the following questions:
1. What variables are dominant for a given observation?
2. Which observations are most similar, i.e., are there clusters of observations?
3. Are there outliers?
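A single star can be drawn on polar axes following the definition above. The Python sketch below is a hedged illustration: the variable names follow the car example, but the numeric values and the column maxima are invented for display purposes only.

import numpy as np
import matplotlib.pyplot as plt

labels = ["Price", "MPG", "Repair 78", "Repair 77", "Headroom", "Trunk", "Weight", "Length"]
point = np.array([15906, 21, 3, 3, 4.5, 13, 4290, 204])     # one (hypothetical) observation
maxima = np.array([15906, 41, 5, 5, 5.0, 23, 4840, 233])    # column maxima across all observations

r = point / maxima                                           # spoke length relative to each variable's maximum
theta = np.linspace(0, 2 * np.pi, len(labels), endpoint=False)

ax = plt.subplot(projection="polar")
ax.plot(np.append(theta, theta[0]), np.append(r, r[0]))      # connect the spokes to close the star
ax.set_xticks(theta)
ax.set_xticklabels(labels)
ax.set_title("Star plot of one observation")
plt.show()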


Weakness in Technique   Star plots are helpful for small-to-moderate-sized multivariate data sets. Their primary weakness is that their effectiveness is limited to data sets with less than a few hundred points. After that, they tend to be overwhelming.
Graphical techniques suited for large data sets are discussed by Scott.

Related Techniques   Alternative ways to plot multivariate data are discussed in Chambers, du Toit, and Everitt.

Software   Star plots are available in some general purpose statistical software programs, including Dataplot.

1.3.3.30. Weibull Plot

Purpose: Graphical Check To See If Data Come From a Population That Would Be Fit by a Weibull Distribution
The Weibull plot (Nelson 1982) is a graphical technique for determining if a data set comes from a population that would logically be fit by a 2-parameter Weibull distribution (the location is assumed to be zero).
The Weibull plot has special scales that are designed so that if the data do in fact follow a Weibull distribution, the points will be linear (or nearly linear). The least squares fit of this line yields estimates for the shape and scale parameters of the Weibull distribution.

Sample Plot
This Weibull plot shows that:
1. the assumption of a Weibull distribution is reasonable;
2. the shape parameter estimate is computed to be 33.32;
3. the scale parameter estimate is computed to be 5.28; and


4. there are no outliers.

Definition: Weibull Cumulative Probability Versus LN(Ordered Response)
The Weibull plot is formed by:
● Vertical axis: Weibull cumulative probability expressed as a percentage
● Horizontal axis: LN of ordered response
The vertical scale is ln(-ln(1-p)) where p = (i-0.3)/(n+0.4) and i is the rank of the observation. This scale is chosen in order to linearize the resulting plot for Weibull data.

Questions   The Weibull plot can be used to answer the following questions:
1. Do the data follow a 2-parameter Weibull distribution?
2. What is the best estimate of the shape parameter for the 2-parameter Weibull distribution?
3. What is the best estimate of the scale (= variation) parameter for the 2-parameter Weibull distribution?

Importance: Check Distributional Assumptions
Many statistical analyses, particularly in the field of reliability, are based on the assumption that the data follow a Weibull distribution. If the analysis assumes the data follow a Weibull distribution, it is important to verify this assumption and, if verified, find good estimates of the Weibull parameters.

Related Techniques   Weibull Probability Plot
Weibull PPCC Plot
Weibull Hazard Plot
The Weibull probability plot (in conjunction with the Weibull PPCC plot), the Weibull hazard plot, and the Weibull plot are all similar techniques that can be used for assessing the adequacy of the Weibull distribution as a model for the data, and additionally providing estimation for the shape, scale, or location parameters.
The Weibull hazard plot and Weibull plot are designed to handle censored data (which the Weibull probability plot does not).

Case Study   The Weibull plot is demonstrated in the airplane glass failure data case study.

Software   Weibull plots are generally available in statistical software programs that are designed to analyze reliability data. Dataplot supports the Weibull plot.
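The scales given in the definition above translate directly into code. The following Python sketch (a minimal illustration on simulated data, not the airplane glass case study) plots ln(-ln(1-p)) against the log of the ordered responses with p = (i-0.3)/(n+0.4), then reads the shape off the least squares slope and the scale from the intercept.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
x = np.sort(rng.weibull(a=2.0, size=100) * 5.0)    # simulated data: true shape 2.0, true scale 5.0

n = x.size
i = np.arange(1, n + 1)
p = (i - 0.3) / (n + 0.4)                           # plotting positions
yy = np.log(-np.log(1.0 - p))                       # vertical coordinate
xx = np.log(x)                                      # horizontal coordinate

slope, intercept = np.polyfit(xx, yy, 1)
shape = slope                                       # slope estimates the shape parameter
scale = np.exp(-intercept / slope)                  # intercept = -shape*ln(scale)

plt.plot(xx, yy, "o", markersize=3)
plt.plot(xx, slope * xx + intercept)
plt.xlabel("ln(ordered response)")
plt.ylabel("ln(-ln(1 - p))")
plt.title("Weibull plot: shape estimate %.2f, scale estimate %.2f" % (shape, scale))
plt.show()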


1.3.3.31. Youden Plot

Purpose: Interlab Comparisons
Youden plots are a graphical technique for analyzing interlab data when each lab has made two runs on the same product or one run on two different products.
The Youden plot is a simple but effective method for comparing both the within-laboratory variability and the between-laboratory variability.

Sample Plot
This plot shows:
1. Not all labs are equivalent.
2. Lab 4 is biased low.
3. Lab 3 has within-lab variability problems.
4. Lab 5 has an outlying run.

Definition: Response 1 Versus Response 2 Coded by Lab
Youden plots are formed by:
1. Vertical axis: Response variable 1 (i.e., run 1 or product 1 response value)
2. Horizontal axis: Response variable 2 (i.e., run 2 or product 2 response value)
In addition, the plot symbol is the lab id (typically an integer from 1 to k where k is the number of labs). Sometimes a 45-degree reference line is drawn. Ideally, a lab generating two runs of the same product should produce reasonably similar results. Departures from this reference line indicate inconsistency from the lab. If two different products are being tested, then a 45-degree line may not be appropriate. However, if the labs are consistent, the points should lie near some fitted straight line.

Questions   The Youden plot can be used to answer the following questions:
1. Are all labs equivalent?
2. What labs have between-lab problems (reproducibility)?
3. What labs have within-lab problems (repeatability)?
4. What labs are outliers?

Importance   In interlaboratory studies or in comparing two runs from the same lab, it is useful to know if consistent results are generated. Youden plots should be a routine plot for analyzing this type of data.

DEX Youden Plot   The dex Youden plot is a specialized Youden plot used in the design of experiments. In particular, it is useful for full and fractional designs.

Related Techniques   Scatter Plot

Software   The Youden plot is essentially a scatter plot, so it should be feasible to write a macro for a Youden plot in any general purpose statistical program that supports scatter plots. Dataplot supports a Youden plot.
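Since the Youden plot is essentially a scatter plot, the macro mentioned above is short. The Python sketch below is a hedged example: the lab ids and run values are invented, and the plot simply places each lab id at its (run 2, run 1) coordinates with a 45-degree reference line.

import matplotlib.pyplot as plt

labs = [1, 2, 3, 4, 5, 6]
run1 = [10.1, 10.4, 9.2, 8.7, 10.3, 10.0]
run2 = [10.2, 10.5, 10.1, 8.8, 11.6, 9.9]

fig, ax = plt.subplots()
for lab, x, y in zip(labs, run2, run1):
    ax.text(x, y, str(lab), ha="center", va="center")   # plot symbol is the lab id

lims = [min(run1 + run2) - 0.5, max(run1 + run2) + 0.5]
ax.plot(lims, lims, "--")                                # 45-degree reference line
ax.set_xlim(lims)
ax.set_ylim(lims)
ax.set_xlabel("Response for run 2")
ax.set_ylabel("Response for run 1")
ax.set_title("Youden plot")
plt.show()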


1.3.3.31.1. DEX Youden Plot

DEX Youden Plot: Introduction
The dex (Design of Experiments) Youden plot is a specialized Youden plot used in the analysis of full and fractional experiment designs. In particular, it is used in support of a Yates analysis. These designs may have a low level, coded as "-1" or "-", and a high level, coded as "+1" or "+", for each factor. In addition, there can optionally be one or more center points. Center points are at the midpoint between the low and high levels for each factor and are coded as "0".
The Yates analysis and the dex Youden plot only use the "-1" and "+1" points. The Yates analysis is used to estimate factor effects. The dex Youden plot can be used to help determine the appropriate model to use from the Yates analysis.

Construction of DEX Youden Plot
The following are the primary steps in the construction of the dex Youden plot.
1. For a given factor or interaction term, compute the mean of the response variable for the low level of the factor and for the high level of the factor. Any center points are omitted from the computation.
2. Plot the point where the y-coordinate is the mean for the high level of the factor and the x-coordinate is the mean for the low level of the factor. The character used for the plot point should identify the factor or interaction term (e.g., "1" for factor 1, "13" for the interaction between factors 1 and 3).
3. Repeat steps 1 and 2 for each factor and interaction term of the data.
The high and low values of the interaction terms are obtained by multiplying the corresponding values of the main level factors. For example, the interaction term X13 is obtained by multiplying the values for X1 with the corresponding values of X3. Since the values for X1 and X3 are either "-1" or "+1", the resulting values for X13 are also either "-1" or "+1".
In summary, the dex Youden plot is a plot of the mean of the response variable for the high level of a factor or interaction term against the mean of the response variable for the low level of that factor or interaction term.
For unimportant factors and interaction terms, these mean values should be nearly the same. For important factors and interaction terms, these mean values should be quite different. So the interpretation of the plot is that unimportant factors should be clustered together near the grand mean. Points that stand apart from this cluster identify important factors that should be included in the model.

Sample DEX Youden Plot   The following is a dex Youden plot for the data used in the Eddy current case study. The analysis in that case study demonstrated that X1 and X2 were the most important factors.

Interpretation of the Sample DEX Youden Plot
From the above dex Youden plot, we see that factors 1 and 2 stand out from the others. That is, the mean response values for the low and high levels of factor 1 and factor 2 are quite different. For factor 3 and the 2- and 3-term interactions, the mean response values for the low and high levels are similar.
We would conclude from this plot that factors 1 and 2 are important and should be included in our final model while the remaining factors and interactions should be omitted from the final model.
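The construction steps above translate into a few lines of code. The following Python sketch is an assumption-laden illustration for a 2^3 full factorial: the design matrix is the standard coded design, but the response values are invented and are not the Eddy current data. Each factor or interaction label is plotted at (mean response at the low level, mean response at the high level).

import numpy as np
import matplotlib.pyplot as plt

# coded -1/+1 design matrix for X1, X2, X3 and an invented response
X = np.array([[-1, -1, -1], [1, -1, -1], [-1, 1, -1], [1, 1, -1],
              [-1, -1, 1], [1, -1, 1], [-1, 1, 1], [1, 1, 1]])
y = np.array([56, 93, 67, 136, 58, 94, 69, 139], dtype=float)

terms = {"1": X[:, 0], "2": X[:, 1], "3": X[:, 2],
         "12": X[:, 0] * X[:, 1], "13": X[:, 0] * X[:, 2],
         "23": X[:, 1] * X[:, 2], "123": X[:, 0] * X[:, 1] * X[:, 2]}

fig, ax = plt.subplots()
for label, col in terms.items():
    lo, hi = y[col == -1].mean(), y[col == +1].mean()
    ax.text(lo, hi, label, ha="center", va="center")   # plot character identifies the term

gm = y.mean()
ax.axhline(gm, linestyle=":")                           # grand mean reference lines
ax.axvline(gm, linestyle=":")
ax.set_xlim(y.min(), y.max())
ax.set_ylim(y.min(), y.max())
ax.set_xlabel("Mean response at low level (-1)")
ax.set_ylabel("Mean response at high level (+1)")
ax.set_title("dex Youden plot")
plt.show()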


Case Study   The Eddy current case study demonstrates the use of the dex Youden plot in the context of the analysis of a full factorial design.

Software   DEX Youden plots are not typically available as built-in plots in statistical software programs. However, it should be relatively straightforward to write a macro to generate this plot in most general purpose statistical software programs.

1.3.3.32. 4-Plot

Purpose: Check Underlying Statistical Assumptions
The 4-plot is a collection of 4 specific EDA graphical techniques whose purpose is to test the assumptions that underlie most measurement processes. A 4-plot consists of a
1. run sequence plot;
2. lag plot;
3. histogram;
4. normal probability plot.
If the 4 underlying assumptions of a typical measurement process hold, then the above 4 plots will have a characteristic appearance (see the normal random numbers case study below); if any of the underlying assumptions fail to hold, then it will be revealed by an anomalous appearance in one or more of the plots. Several commonly encountered situations are demonstrated in the case studies below.
Although the 4-plot has an obvious use for univariate and time series data, its usefulness extends far beyond that. Many statistical models of the form
    Y = f(X1, ..., Xk) + E
have the same underlying assumptions for the error term. That is, no matter how complicated the functional fit, the assumptions on the underlying error term are still the same. The 4-plot can and should be routinely applied to the residuals when fitting models regardless of whether the model is simple or complicated.


Sample Plot: Process Has Fixed Location, Fixed Variation, Non-Random (Oscillatory), Non-Normal U-Shaped Distribution, and Has 3 Outliers
This 4-plot reveals the following:
1. the fixed location assumption is justified as shown by the run sequence plot in the upper left corner.
2. the fixed variation assumption is justified as shown by the run sequence plot in the upper left corner.
3. the randomness assumption is violated as shown by the non-random (oscillatory) lag plot in the upper right corner.
4. the assumption of a common, normal distribution is violated as shown by the histogram in the lower left corner and the normal probability plot in the lower right corner. The distribution is non-normal and is a U-shaped distribution.
5. there are several outliers apparent in the lag plot in the upper right corner.

Definition: 1. Run Sequence Plot; 2. Lag Plot; 3. Histogram; 4. Normal Probability Plot
The 4-plot consists of the following:
1. Run sequence plot to test fixed location and variation.
   ❍ Vertically: Yi
   ❍ Horizontally: i
2. Lag plot to test randomness.
   ❍ Vertically: Yi
   ❍ Horizontally: Yi-1
3. Histogram to test (normal) distribution.
   ❍ Vertically: Counts
   ❍ Horizontally: Y
4. Normal probability plot to test normal distribution.
   ❍ Vertically: Ordered Yi
   ❍ Horizontally: Theoretical values from a normal N(0,1) distribution for ordered Yi

Questions   4-plots can provide answers to many questions:
1. Is the process in-control, stable, and predictable?
2. Is the process drifting with respect to location?
3. Is the process drifting with respect to variation?
4. Are the data random?
5. Is an observation related to an adjacent observation?
6. If the data are a time series, is it white noise?
7. If the data are a time series and not white noise, is it sinusoidal, autoregressive, etc.?
8. If the data are non-random, what is a better model?
9. Does the process follow a normal distribution?
10. If non-normal, what distribution does the process follow?
11. Is the model Y = constant + error valid and sufficient?
12. If the default model is insufficient, what is a better model?
13. Is the formula s/√N valid?
14. Is the sample mean a good estimator of the process location?
15. If not, what would be a better estimator?
16. Are there any outliers?
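A 4-plot is easy to assemble with any tool that supports multiple plots per page. The following Python sketch (an assumption, not the handbook's Dataplot macro) lays out the four panels defined above for a univariate series; replace the simulated series with the data or residuals to be checked.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(6)
y = rng.normal(size=200)                        # series (or residuals) to check

fig, ax = plt.subplots(2, 2, figsize=(8, 6))
ax[0, 0].plot(np.arange(1, y.size + 1), y)      # run sequence plot: Yi versus i
ax[0, 0].set_title("Run sequence plot")
ax[0, 1].plot(y[:-1], y[1:], "o", markersize=3) # lag plot: Yi versus Yi-1
ax[0, 1].set_title("Lag plot")
ax[1, 0].hist(y, bins=20)                       # histogram
ax[1, 0].set_title("Histogram")
stats.probplot(y, dist="norm", plot=ax[1, 1])   # normal probability plot
ax[1, 1].set_title("Normal probability plot")
plt.tight_layout()
plt.show()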


Importance: Testing Underlying Assumptions Helps Ensure the Validity of the Final Scientific and Engineering Conclusions
There are 4 assumptions that typically underlie all measurement processes; namely, that the data from the process at hand "behave like":
1. random drawings;
2. from a fixed distribution;
3. with that distribution having a fixed location; and
4. with that distribution having fixed variation.
Predictability is an all-important goal in science and engineering. If the above 4 assumptions hold, then we have achieved probabilistic predictability--the ability to make probability statements not only about the process in the past, but also about the process in the future. In short, such processes are said to be "statistically in control". If the 4 assumptions do not hold, then we have a process that is drifting (with respect to location, variation, or distribution), is unpredictable, and is out of control. A simple characterization of such processes by a location estimate, a variation estimate, or a distribution "estimate" inevitably leads to optimistic and grossly invalid engineering conclusions.
Inasmuch as the validity of the final scientific and engineering conclusions is inextricably linked to the validity of these same 4 underlying assumptions, it naturally follows that there is a real necessity for all 4 assumptions to be routinely tested. The 4-plot (run sequence plot, lag plot, histogram, and normal probability plot) is seen as a simple, efficient, and powerful way of carrying out this routine checking.

Interpretation: Flat, Equi-Banded, Random, Bell-Shaped, and Linear
Of the 4 underlying assumptions:
1. If the fixed location assumption holds, then the run sequence plot will be flat and non-drifting.
2. If the fixed variation assumption holds, then the vertical spread in the run sequence plot will be approximately the same over the entire horizontal axis.
3. If the randomness assumption holds, then the lag plot will be structureless and random.
4. If the fixed distribution assumption holds (in particular, if the fixed normal distribution assumption holds), then the histogram will be bell-shaped and the normal probability plot will be approximately linear.
If all 4 of the assumptions hold, then the process is "statistically in control". In practice, many processes fall short of achieving this ideal.

Related Techniques   Run Sequence Plot
Lag Plot
Histogram
Normal Probability Plot
Autocorrelation Plot
Spectral Plot
PPCC Plot

Case Studies   The 4-plot is used in most of the case studies in this chapter:
1. Normal random numbers (the ideal)
2. Uniform random numbers
3. Random walk
4. Josephson junction cryothermometry
5. Beam deflections
6. Filter transmittance
7. Standard resistor
8. Heat flow meter 1

Software   It should be feasible to write a macro for the 4-plot in any general purpose statistical software program that supports the capability for multiple plots per page and supports the underlying plot techniques. Dataplot supports the 4-plot.


1.3.3.33. 6-Plot

Purpose: Graphical Model Validation
The 6-plot is a collection of 6 specific graphical techniques whose purpose is to assess the validity of a Y versus X fit. The fit can be a linear fit, a non-linear fit, a LOWESS (locally weighted least squares) fit, a spline fit, or any other fit utilizing a single independent variable. The 6 plots are:
1. Scatter plot of the response and predicted values versus the independent variable;
2. Scatter plot of the residuals versus the independent variable;
3. Scatter plot of the residuals versus the predicted values;
4. Lag plot of the residuals;
5. Histogram of the residuals;
6. Normal probability plot of the residuals.

Sample Plot
This 6-plot, which followed a linear fit, shows that the linear model is not adequate. It suggests that a quadratic model would be a better model.

Definition: 6 Component Plots
The 6-plot consists of the following:
1. Response and predicted values
   ❍ Vertical axis: Response variable, predicted values
   ❍ Horizontal axis: Independent variable
2. Residuals versus independent variable
   ❍ Vertical axis: Residuals
   ❍ Horizontal axis: Independent variable
3. Residuals versus predicted values
   ❍ Vertical axis: Residuals
   ❍ Horizontal axis: Predicted values
4. Lag plot of residuals
   ❍ Vertical axis: RES(I)
   ❍ Horizontal axis: RES(I-1)
5. Histogram of residuals
   ❍ Vertical axis: Counts
   ❍ Horizontal axis: Residual values
6. Normal probability plot of residuals
   ❍ Vertical axis: Ordered residuals
   ❍ Horizontal axis: Theoretical values from a normal N(0,1) distribution for ordered residuals


Questions   The 6-plot can be used to answer the following questions:
1. Are the residuals approximately normally distributed with a fixed location and scale?
2. Are there outliers?
3. Is the fit adequate?
4. Do the residuals suggest a better fit?

Importance: Validating Model
A model involving a response variable and a single independent variable has the form:
    Y = f(X) + E
where Y is the response variable, X is the independent variable, f is the linear or non-linear fit function, and E is the random component. For a good model, the error component should behave like:
1. random drawings (i.e., independent);
2. from a fixed distribution;
3. with fixed location; and
4. with fixed variation.
In addition, for fitting models it is usually further assumed that the fixed distribution is normal and the fixed location is zero. For a good model the fixed variation should be as small as possible. A necessary component of fitting models is to verify these assumptions for the error component and to assess whether the variation for the error component is sufficiently small. The histogram, lag plot, and normal probability plot are used to verify the fixed distribution, location, and variation assumptions on the error component. The plot of the response variable and the predicted values versus the independent variable is used to assess whether the variation is sufficiently small. The plots of the residuals versus the independent variable and the predicted values are used to assess the independence assumption.
Assessing the validity and quality of the fit in terms of the above assumptions is an absolutely vital part of the model-fitting process. No fit should be considered complete without an adequate model validation step.

Related Techniques   Linear Least Squares
Non-Linear Least Squares
Scatter Plot
Run Sequence Plot
Lag Plot
Normal Probability Plot
Histogram

Case Study   The 6-plot is used in the Alaska pipeline data case study.

Software   It should be feasible to write a macro for the 6-plot in any general purpose statistical software program that supports the capability for multiple plots per page and supports the underlying plot techniques. Dataplot supports the 6-plot.
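In the spirit of the Software note above, the following Python sketch (not the handbook's macro, and not the Alaska pipeline data) draws the six component plots after a straight-line fit. The simulated data include curvature, so the residual panels would show the kind of inadequacy described for the sample plot.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(7)
x = np.linspace(0, 10, 100)
y = 1.0 + 0.5 * x + 0.08 * x ** 2 + rng.normal(0, 0.3, x.size)

b1, b0 = np.polyfit(x, y, 1)                   # straight-line fit
pred = b0 + b1 * x
res = y - pred

fig, ax = plt.subplots(2, 3, figsize=(11, 6))
ax[0, 0].plot(x, y, "o", markersize=3)
ax[0, 0].plot(x, pred)
ax[0, 0].set_title("Response and predicted vs X")
ax[0, 1].plot(x, res, "o", markersize=3)
ax[0, 1].set_title("Residuals vs X")
ax[0, 2].plot(pred, res, "o", markersize=3)
ax[0, 2].set_title("Residuals vs predicted")
ax[1, 0].plot(res[:-1], res[1:], "o", markersize=3)
ax[1, 0].set_title("Lag plot of residuals")
ax[1, 1].hist(res, bins=15)
ax[1, 1].set_title("Histogram of residuals")
stats.probplot(res, dist="norm", plot=ax[1, 2])
ax[1, 2].set_title("Normal probability plot")
plt.tight_layout()
plt.show()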


1.3.4. Graphical Techniques: By Problem Category

Univariate: y = c + e
Run Sequence Plot: 1.3.3.25
Lag Plot: 1.3.3.15
Histogram: 1.3.3.14
Normal Probability Plot: 1.3.3.21
4-Plot: 1.3.3.32
PPCC Plot: 1.3.3.23
Weibull Plot: 1.3.3.30
Probability Plot: 1.3.3.22
Box-Cox Linearity Plot: 1.3.3.5
Box-Cox Normality Plot: 1.3.3.6
Bootstrap Plot: 1.3.3.4

Time Series: y = f(t) + e
Run Sequence Plot: 1.3.3.25
Spectral Plot: 1.3.3.27
Autocorrelation Plot: 1.3.3.1
Complex Demodulation Amplitude Plot: 1.3.3.8
Complex Demodulation Phase Plot: 1.3.3.9

1 Factor: y = f(x) + e
Scatter Plot: 1.3.3.26
Box Plot: 1.3.3.7
Bihistogram: 1.3.3.2
Quantile-Quantile Plot: 1.3.3.24
Mean Plot: 1.3.3.20
Standard Deviation Plot: 1.3.3.28

Multi-Factor/Comparative: y = f(xp, x1, x2, ..., xk) + e
Block Plot: 1.3.3.3

Multi-Factor/Screening: y = f(x1, x2, x3, ..., xk) + e
DEX Scatter Plot: 1.3.3.11
DEX Mean Plot: 1.3.3.12
DEX Standard Deviation Plot: 1.3.3.13

Regression: y = f(x1, x2, x3, ..., xk) + e
Scatter Plot: 1.3.3.26
6-Plot: 1.3.3.33
Linear Correlation Plot: 1.3.3.16
Linear Intercept Plot: 1.3.3.17
Linear Slope Plot: 1.3.3.18
Linear Residual Standard Deviation Plot: 1.3.3.19

Interlab: (y1, y2) = f(x) + e
Youden Plot: 1.3.3.31

Multivariate: (y1, y2, ..., yp)
Contour Plot: 1.3.3.10
Star Plot: 1.3.3.29


1.3.5. Quantitative Techniques

Confirmatory Statistics   The techniques discussed in this section are classical statistical methods as opposed to EDA techniques. EDA and classical techniques are not mutually exclusive and can be used in a complementary fashion. For example, the analysis can start with some simple graphical techniques such as the 4-plot followed by the classical confirmatory methods discussed herein to provide more rigorous statements about the conclusions. If the classical methods yield different conclusions than the graphical analysis, then some effort should be invested to explain why. Often this is an indication that some of the assumptions of the classical techniques are violated.
Many of the quantitative techniques fall into two broad categories:
1. Interval estimation
2. Hypothesis tests

Interval Estimates   It is common in statistics to estimate a parameter from a sample of data. The value of the parameter using all of the possible data, not just the sample data, is called the population parameter or true value of the parameter. An estimate of the true parameter value is made using the sample data. This is called a point estimate or a sample estimate.
For example, the most commonly used measure of location is the mean. The population, or true, mean is the sum of all the members of the given population divided by the number of members in the population. As it is typically impractical to measure every member of the population, a random sample is drawn from the population. The sample mean is calculated by summing the values in the sample and dividing by the number of values in the sample. This sample mean is then used as the point estimate of the population mean.
Interval estimates expand on point estimates by incorporating the uncertainty of the point estimate. In the example for the mean above, different samples from the same population will generate different values for the sample mean. An interval estimate quantifies this uncertainty in the sample estimate by computing lower and upper values of an interval which will, with a given level of confidence (i.e., probability), contain the population parameter.

Hypothesis Tests   Hypothesis tests also address the uncertainty of the sample estimate. However, instead of providing an interval, a hypothesis test attempts to refute a specific claim about a population parameter based on the sample data. For example, the hypothesis might be one of the following:
● the population mean is equal to 10
● the population standard deviation is equal to 5
● the means from two populations are equal
● the standard deviations from 5 populations are equal
To reject a hypothesis is to conclude that it is false. However, to accept a hypothesis does not mean that it is true, only that we do not have evidence to believe otherwise. Thus hypothesis tests are usually stated in terms of both a condition that is doubted (null hypothesis) and a condition that is believed (alternative hypothesis).
A common format for a hypothesis test is:
H0: A statement of the null hypothesis, e.g., two population means are equal.
Ha: A statement of the alternative hypothesis, e.g., two population means are not equal.
Test Statistic: The test statistic is based on the specific hypothesis test.
Significance Level: The significance level, α, defines the sensitivity of the test. A value of α = 0.05 means that we inadvertently reject the null hypothesis 5% of the time when it is in fact true. This is also called the type I error. The choice of α is somewhat arbitrary, although in practice values of 0.1, 0.05, and 0.01 are commonly used.
The probability of rejecting the null hypothesis when it is in fact false is called the power of the test and is denoted by 1 - β. Its complement, the probability of accepting the null hypothesis when the alternative hypothesis is, in fact, true (type II error), is called β and can only be computed for a specific alternative hypothesis.


Critical Region: The critical region encompasses those values of the test statistic that lead to a rejection of the null hypothesis. Based on the distribution of the test statistic and the significance level, a cut-off value for the test statistic is computed. Values either above or below or both (depending on the direction of the test) this cut-off define the critical region.

Practical Versus Statistical Significance   It is important to distinguish between statistical significance and practical significance. Statistical significance simply means that we reject the null hypothesis. The ability of the test to detect differences that lead to rejection of the null hypothesis depends on the sample size. For example, for a particularly large sample, the test may reject the null hypothesis that two process means are equivalent. However, in practice the difference between the two means may be relatively small to the point of having no real engineering significance. Similarly, if the sample size is small, a difference that is large in engineering terms may not lead to rejection of the null hypothesis. The analyst should not just blindly apply the tests, but should combine engineering judgement with statistical analysis.

Bootstrap Uncertainty Estimates   In some cases, it is possible to mathematically derive appropriate uncertainty intervals. This is particularly true for intervals based on the assumption of a normal distribution. However, there are many cases in which it is not possible to mathematically derive the uncertainty. In these cases, the bootstrap provides a method for empirically determining an appropriate interval.

Table of Contents   Some of the more common classical quantitative techniques are listed below. This list of quantitative techniques is by no means meant to be exhaustive. Additional discussions of classical statistical techniques are contained in the product comparisons chapter.
● Location
  1. Measures of Location
  2. Confidence Limits for the Mean and One Sample t-Test
  3. Two Sample t-Test for Equal Means
  4. One Factor Analysis of Variance
  5. Multi-Factor Analysis of Variance
● Scale (or variability or spread)
  1. Measures of Scale
  2. Bartlett's Test
  3. Chi-Square Test
  4. F-Test
  5. Levene Test
● Skewness and Kurtosis
  1. Measures of Skewness and Kurtosis
● Randomness
  1. Autocorrelation
  2. Runs Test
● Distributional Measures
  1. Anderson-Darling Test
  2. Chi-Square Goodness-of-Fit Test
  3. Kolmogorov-Smirnov Test
● Outliers
  1. Grubbs Test
● 2-Level Factorial Designs
  1. Yates Analysis


1.3.5.1. Measures of Location

Location   A fundamental task in many statistical analyses is to estimate a location parameter for the distribution; i.e., to find a typical or central value that best describes the data.

Definition of Location   The first step is to define what we mean by a typical value. For univariate data, there are three common definitions:
1. mean - the mean is the sum of the data points divided by the number of data points. That is,
    mean = (X1 + X2 + ... + XN) / N
The mean is that value that is most commonly referred to as the average. We will use the term average as a synonym for the mean and the term typical value to refer generically to measures of location.
2. median - the median is the value of the point which has half the data smaller than that point and half the data larger than that point. That is, if X1, X2, ..., XN is a random sample sorted from smallest value to largest value, then the median is defined as:
    median = X((N+1)/2)                     if N is odd
    median = (X(N/2) + X(N/2 + 1)) / 2      if N is even
3. mode - the mode is the value of the random sample that occurs with the greatest frequency. It is not necessarily unique. The mode is typically used in a qualitative fashion. For example, there may be a single dominant hump in the data or perhaps two or more smaller humps in the data. This is usually evident from a histogram of the data.
When taking samples from continuous populations, we need to be somewhat careful in how we define the mode. That is, any specific value may not occur more than once if the data are continuous. What may be a more meaningful, if less exact measure, is the midpoint of the class interval of the histogram with the highest peak.

Why Different Measures   A natural question is why we have more than one measure of the typical value. The following example helps to explain why these alternative definitions are useful and necessary.
This plot shows histograms for 10,000 random numbers generated from a normal, an exponential, a Cauchy, and a lognormal distribution.

Normal Distribution   The first histogram is a sample from a normal distribution. The mean is 0.005, the median is -0.010, and the mode is -0.144 (the mode is computed as the midpoint of the histogram interval with the highest peak).
The normal distribution is a symmetric distribution with well-behaved tails and a single peak at the center of the distribution. By symmetric, we mean that the distribution can be folded about an axis so that the 2 sides coincide. That is, it behaves the same to the left and right of some center point. For a normal distribution, the mean, median, and mode are actually equivalent. The histogram above generates similar estimates for the mean, median, and mode. Therefore, if a histogram or normal probability plot indicates that your data are approximated well by a normal distribution, then it is reasonable to use the mean as the location estimator.


Exponential Distribution   The second histogram is a sample from an exponential distribution. The mean is 1.001, the median is 0.684, and the mode is 0.254 (the mode is computed as the midpoint of the histogram interval with the highest peak).
The exponential distribution is a skewed, i.e., not symmetric, distribution. For skewed distributions, the mean and median are not the same. The mean will be pulled in the direction of the skewness. That is, if the right tail is heavier than the left tail, the mean will be greater than the median. Likewise, if the left tail is heavier than the right tail, the mean will be less than the median.
For skewed distributions, it is not at all obvious whether the mean, the median, or the mode is the more meaningful measure of the typical value. In this case, all three measures are useful.

Cauchy Distribution   The third histogram is a sample from a Cauchy distribution. The mean is 3.70, the median is -0.016, and the mode is -0.362 (the mode is computed as the midpoint of the histogram interval with the highest peak).
For better visual comparison with the other data sets, we restricted the histogram of the Cauchy distribution to values between -10 and 10. The full Cauchy data set in fact has a minimum of approximately -29,000 and a maximum of approximately 89,000.
The Cauchy distribution is a symmetric distribution with heavy tails and a single peak at the center of the distribution. The Cauchy distribution has the interesting property that collecting more data does not provide a more accurate estimate of the mean. That is, the sampling distribution of the mean is equivalent to the sampling distribution of the original data. This means that for the Cauchy distribution the mean is useless as a measure of the typical value. For this histogram, the mean of 3.7 is well above the vast majority of the data. This is caused by a few very extreme values in the tail. However, the median does provide a useful measure for the typical value.
Although the Cauchy distribution is an extreme case, it does illustrate the importance of heavy tails in measuring the mean. Extreme values in the tails distort the mean. However, these extreme values do not distort the median since the median is based on ranks. In general, for data with extreme values in the tails, the median provides a better estimate of location than does the mean.

Lognormal Distribution   The fourth histogram is a sample from a lognormal distribution. The mean is 1.677, the median is 0.989, and the mode is 0.680 (the mode is computed as the midpoint of the histogram interval with the highest peak).
The lognormal is also a skewed distribution. Therefore the mean and median do not provide similar estimates for the location. As with the exponential distribution, there is no obvious answer to the question of which is the more meaningful measure of location.

Robustness   There are various alternatives to the mean and median for measuring location. These alternatives were developed to address non-normal data since the mean is an optimal estimator if in fact your data are normal.
Tukey and Mosteller defined two types of robustness where robustness is a lack of susceptibility to the effects of nonnormality.
1. Robustness of validity means that the confidence intervals for the population location have a 95% chance of covering the population location regardless of what the underlying distribution is.
2. Robustness of efficiency refers to high effectiveness in the face of non-normal tails. That is, confidence intervals for the population location tend to be almost as narrow as the best that could be done if we knew the true shape of the distribution.
The mean is an example of an estimator that is the best we can do if the underlying distribution is normal. However, it lacks robustness of validity. That is, confidence intervals based on the mean tend not to be precise if the underlying distribution is in fact not normal.
The median is an example of an estimator that tends to have robustness of validity but not robustness of efficiency.
The alternative measures of location try to balance these two concepts of robustness. That is, the confidence intervals for the case when the data are normal should be almost as narrow as the confidence intervals based on the mean. However, they should maintain their validity even if the underlying data are not normal. In particular, these alternatives address the problem of heavy-tailed distributions.
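The robustness discussion above can be seen numerically with a small experiment. The following Python sketch (an illustration under assumed simulated data, not a handbook example) compares the mean, the median, and a 5% trimmed mean (one of the alternative measures described next) on a heavy-tailed Cauchy sample.

import numpy as np
from scipy import stats

rng = np.random.default_rng(8)
x = rng.standard_cauchy(10000)                       # heavy-tailed sample

print("mean        :", x.mean())                     # distorted by extreme tail values
print("median      :", np.median(x))                 # robust, based on ranks
print("trimmed mean:", stats.trim_mean(x, 0.05))     # trims 5% from each tail before averaging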


1.3.5.1. Measures of Location 1.3.5.2. Confidence Limits for the Mean

Alternative Measures of Location
A few of the more common alternative location measures are:
1. Mid-Mean - computes a mean using the data between the 25th and 75th percentiles.
2. Trimmed Mean - similar to the mid-mean except different percentile values are used. A common choice is to trim 5% of the points in both the lower and upper tails, i.e., calculate the mean for data between the 5th and 95th percentiles.
3. Winsorized Mean - similar to the trimmed mean. However, instead of trimming the points, they are set to the lowest (or highest) value. For example, all data below the 5th percentile are set equal to the value of the 5th percentile and all data greater than the 95th percentile are set equal to the 95th percentile.
4. Mid-range = (smallest + largest)/2.

The first three alternative location estimators defined above have the advantage of the median in the sense that they are not unduly affected by extremes in the tails. However, they generate estimates that are closer to the mean for data that are normal (or nearly so).

The mid-range, since it is based on the two most extreme points, is not robust. Its use is typically restricted to situations in which the behavior at the extreme points is relevant.

Case Study
The uniform random numbers case study compares the performance of several different location estimators for a particular non-normal distribution.

Software
Most general purpose statistical software programs, including Dataplot, can compute at least some of the measures of location discussed above.

1.3.5.2. Confidence Limits for the Mean

Purpose: Interval Estimate for Mean
Confidence limits for the mean (Snedecor and Cochran, 1989) are an interval estimate for the mean. Interval estimates are often desirable because the estimate of the mean varies from sample to sample. Instead of a single estimate for the mean, a confidence interval generates a lower and upper limit for the mean. The interval estimate gives an indication of how much uncertainty there is in our estimate of the true mean. The narrower the interval, the more precise is our estimate.

Confidence limits are expressed in terms of a confidence coefficient. Although the choice of confidence coefficient is somewhat arbitrary, in practice 90%, 95%, and 99% intervals are often used, with 95% being the most commonly used.

As a technical note, a 95% confidence interval does not mean that there is a 95% probability that the interval contains the true mean. The interval computed from a given sample either contains the true mean or it does not. Instead, the level of confidence is associated with the method of calculating the interval. The confidence coefficient is simply the proportion of samples of a given size that may be expected to contain the true mean. That is, for a 95% confidence interval, if many samples are collected and the confidence interval computed, in the long run about 95% of these intervals would contain the true mean.

Definition: Confidence Interval
Confidence limits are defined as:

    \bar{Y} \pm t_{1-\alpha/2, \, N-1} \, \frac{s}{\sqrt{N}}

where \bar{Y} is the sample mean, s is the sample standard deviation, N is the sample size, \alpha is the desired significance level, and t_{1-\alpha/2, \, N-1} is the upper critical value of the t distribution with N - 1 degrees of freedom. Note that the confidence coefficient is 1 - \alpha.

From the formula, it is clear that the width of the interval is controlled by two factors:
1. As N increases, the interval gets narrower from the \sqrt{N} term. That is, one way to obtain more precise estimates for the mean is to increase the sample size.
2. The larger the sample standard deviation, the larger the confidence interval.
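As a concrete illustration of the definition above, the following sketch computes two-sided confidence limits for the mean in Python with SciPy (an assumption on our part; the Handbook's own output is produced with Dataplot). The function name and the small data set are hypothetical.

    import numpy as np
    from scipy import stats

    def confidence_limits_for_mean(y, alpha=0.05):
        # Two-sided 100*(1-alpha)% limits: ybar +/- t(1-alpha/2, N-1) * s / sqrt(N)
        y = np.asarray(y, dtype=float)
        n = y.size
        ybar = y.mean()
        s = y.std(ddof=1)                          # sample standard deviation
        t = stats.t.ppf(1.0 - alpha / 2.0, n - 1)  # upper critical value of t
        half_width = t * s / np.sqrt(n)
        return ybar - half_width, ybar + half_width

    # Hypothetical data; a 95% interval is returned by default.
    print(confidence_limits_for_mean([9.21, 9.26, 9.24, 9.27, 9.23]))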

This simply means that noisy data, i.e., data with a large standard deviation, are going to generate wider intervals than data with a smaller standard deviation.

Definition: Hypothesis Test
To test whether the population mean has a specific value, \mu_0, against the two-sided alternative that it does not have the value \mu_0, the confidence interval is converted to hypothesis-test form. The test is a one-sample t-test, and it is defined as:
H0: \mu = \mu_0
Ha: \mu \neq \mu_0
Test Statistic:

    T = \frac{\bar{Y} - \mu_0}{s / \sqrt{N}}

where \bar{Y}, s, and N are defined as above.
Significance Level: \alpha. The most commonly used value for \alpha is 0.05.
Critical Region: Reject the null hypothesis that the mean is a specified value, \mu_0, if

    T < -t_{1-\alpha/2, \, N-1}   or   T > t_{1-\alpha/2, \, N-1}

Sample Output for Confidence Interval
Dataplot generated the following output for a confidence interval from the ZARR13.DAT data set:

CONFIDENCE LIMITS FOR MEAN
(2-SIDED)

NUMBER OF OBSERVATIONS = 195
MEAN = 9.261460
STANDARD DEVIATION = 0.2278881E-01
STANDARD DEVIATION OF MEAN = 0.1631940E-02

CONFIDENCE      T          T X SD(MEAN)     LOWER       UPPER
 VALUE (%)    VALUE                         LIMIT       LIMIT
---------------------------------------------------------
    50.000    0.676       0.110279E-02     9.26036     9.26256
    75.000    1.154       0.188294E-02     9.25958     9.26334
    90.000    1.653       0.269718E-02     9.25876     9.26416
    95.000    1.972       0.321862E-02     9.25824     9.26468
    99.000    2.601       0.424534E-02     9.25721     9.26571
    99.900    3.341       0.545297E-02     9.25601     9.26691
    99.990    3.973       0.648365E-02     9.25498     9.26794
    99.999    4.536       0.740309E-02     9.25406     9.26886

Interpretation of the Sample Output
The first few lines print the sample statistics used in calculating the confidence interval. The table shows the confidence interval for several different significance levels. The first column lists the confidence level (which is 1 - \alpha expressed as a percent), the second column lists the t-value (i.e., t_{1-\alpha/2, \, N-1}), the third column lists the t-value times the standard error (the standard error is s/\sqrt{N}), the fourth column lists the lower confidence limit, and the fifth column lists the upper confidence limit. For example, for a 95% confidence interval, we go to the row identified by 95.000 in the first column and extract an interval of (9.25824, 9.26468) from the last two columns.

Output from other statistical software may look somewhat different from the above output.

Sample Output for t Test
Dataplot generated the following output for a one-sample t-test from the ZARR13.DAT data set:

T TEST
(1-SAMPLE)
MU0 = 5.000000
NULL HYPOTHESIS UNDER TEST--MEAN MU = 5.000000

SAMPLE:
NUMBER OF OBSERVATIONS = 195
MEAN = 9.261460
STANDARD DEVIATION = 0.2278881E-01
STANDARD DEVIATION OF MEAN = 0.1631940E-02

TEST:
MEAN-MU0 = 4.261460
T TEST STATISTIC VALUE = 2611.284
DEGREES OF FREEDOM = 194.0000
T TEST STATISTIC CDF VALUE = 1.000000

                      ALTERNATIVE-           ALTERNATIVE-
ALTERNATIVE-          HYPOTHESIS             HYPOTHESIS
HYPOTHESIS            ACCEPTANCE INTERVAL    CONCLUSION
MU <> 5.000000        (0,0.025) (0.975,1)    ACCEPT
MU <  5.000000        (0,0.05)               REJECT
MU >  5.000000        (0.95,1)               ACCEPT
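A minimal sketch of the same kind of one-sample t-test in Python with SciPy (an assumption on our part; the output above comes from Dataplot). The data values and mu0 below are placeholders, not the ZARR13.DAT data.

    import numpy as np
    from scipy import stats

    y = np.array([9.25, 9.27, 9.26, 9.24, 9.28, 9.26])  # placeholder sample
    mu0 = 5.0                                            # hypothesized mean

    t_stat, p_two_sided = stats.ttest_1samp(y, popmean=mu0)

    # Dataplot reports the cdf of the t statistic rather than a p-value;
    # the cdf can be computed directly from the t distribution.
    cdf_value = stats.t.cdf(t_stat, df=len(y) - 1)
    print(t_stat, p_two_sided, cdf_value)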

Interpretation of Sample Output
We are testing the hypothesis that the population mean is 5. The output is divided into three sections.
1. The first section prints the sample statistics used in the computation of the t-test.
2. The second section prints the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section three. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1 - \alpha, 1), the alternative hypothesis acceptance interval for a lower one-tailed test is (0, \alpha), and the alternative hypothesis acceptance interval for a two-tailed test is (1 - \alpha/2, 1) or (0, \alpha/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.
3. The third section prints the conclusions for a 95% test since this is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section two. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the t-test statistic cdf value printed in section two. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0,0.05) and (0.95,1), (0, 0.10), and (0.90,1).

Output from other statistical software may look somewhat different from the above output.

Questions
Confidence limits for the mean can be used to answer the following questions:
1. What is a reasonable estimate for the mean?
2. How much variability is there in the estimate of the mean?
3. Does a given target value fall within the confidence limits?

Related Techniques
Two-Sample T-Test

Confidence intervals for other location estimators such as the median or mid-mean tend to be mathematically difficult or intractable. For these cases, confidence intervals can be obtained using the bootstrap.

Case Study
Heat flow meter data.

Software
Confidence limits for the mean and one-sample t-tests are available in just about all general purpose statistical software programs, including Dataplot.

1.3.5.3. Two-Sample t-Test for Equal Means

Purpose: Test if two population means are equal
The two-sample t-test (Snedecor and Cochran, 1989) is used to determine if two population means are equal. A common application of this is to test if a new process or treatment is superior to a current process or treatment.

There are several variations on this test.
1. The data may either be paired or not paired. By paired, we mean that there is a one-to-one correspondence between the values in the two samples. That is, if X1, X2, ..., Xn and Y1, Y2, ..., Yn are the two samples, then Xi corresponds to Yi. For paired samples, the difference Xi - Yi is usually calculated. For unpaired samples, the sample sizes for the two samples may or may not be equal. The formulas for paired data are somewhat simpler than the formulas for unpaired data.
2. The variances of the two samples may be assumed to be equal or unequal. Equal variances yields somewhat simpler formulas, although with computers this is no longer a significant issue.
3. The null hypothesis might be that the two population means are not equal (\mu_1 \neq \mu_2). If so, this must be converted to the form that the difference between the two population means is equal to some constant (\mu_1 - \mu_2 = d_0). This form might be preferred if you only want to adopt a new process or treatment if it exceeds the current treatment by some threshold value.
http://www.itl.nist.gov/div898/handbook/eda/section3/eda352.htm (4 of 4) [11/13/2003 5:32:28 PM] http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm (1 of 5) [11/13/2003 5:32:29 PM]


1.3.5.3. Two-Sample t-Test for Equal Means 1.3.5.3. Two-Sample t-Test for Equal Means

Definition
The two-sample t-test for unpaired data is defined as:
H0: \mu_1 = \mu_2
Ha: \mu_1 \neq \mu_2
Test Statistic:

    T = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{s_1^2/N_1 + s_2^2/N_2}}

where N_1 and N_2 are the sample sizes, \bar{Y}_1 and \bar{Y}_2 are the sample means, and s_1^2 and s_2^2 are the sample variances.

If equal variances are assumed, then the formula reduces to:

    T = \frac{\bar{Y}_1 - \bar{Y}_2}{s_p \sqrt{1/N_1 + 1/N_2}}

where

    s_p^2 = \frac{(N_1 - 1)s_1^2 + (N_2 - 1)s_2^2}{N_1 + N_2 - 2}

Significance Level: \alpha.
Critical Region: Reject the null hypothesis that the two means are equal if

    T < -t_{1-\alpha/2, \, \nu}   or   T > t_{1-\alpha/2, \, \nu}

where t_{1-\alpha/2, \, \nu} is the critical value of the t distribution with \nu degrees of freedom, where

    \nu = \frac{(s_1^2/N_1 + s_2^2/N_2)^2}{(s_1^2/N_1)^2/(N_1 - 1) + (s_2^2/N_2)^2/(N_2 - 1)}

If equal variances are assumed, then \nu = N_1 + N_2 - 2.

Sample Output
Dataplot generated the following output for the t test from the AUTO83B.DAT data set:

T TEST
(2-SAMPLE)
NULL HYPOTHESIS UNDER TEST--POPULATION MEANS MU1 = MU2

SAMPLE 1:
NUMBER OF OBSERVATIONS = 249
MEAN = 20.14458
STANDARD DEVIATION = 6.414700
STANDARD DEVIATION OF MEAN = 0.4065151

SAMPLE 2:
NUMBER OF OBSERVATIONS = 79
MEAN = 30.48101
STANDARD DEVIATION = 6.107710
STANDARD DEVIATION OF MEAN = 0.6871710

IF ASSUME SIGMA1 = SIGMA2:
POOLED STANDARD DEVIATION = 6.342600
DIFFERENCE (DEL) IN MEANS = -10.33643
STANDARD DEVIATION OF DEL = 0.8190135
T TEST STATISTIC VALUE = -12.62059
DEGREES OF FREEDOM = 326.0000
T TEST STATISTIC CDF VALUE = 0.000000

IF NOT ASSUME SIGMA1 = SIGMA2:
STANDARD DEVIATION SAMPLE 1 = 6.414700
STANDARD DEVIATION SAMPLE 2 = 6.107710
BARTLETT CDF VALUE = 0.402799
DIFFERENCE (DEL) IN MEANS = -10.33643
STANDARD DEVIATION OF DEL = 0.7984100
T TEST STATISTIC VALUE = -12.94627
EQUIVALENT DEG. OF FREEDOM = 136.8750
T TEST STATISTIC CDF VALUE = 0.000000

                   ALTERNATIVE-           ALTERNATIVE-
ALTERNATIVE-       HYPOTHESIS             HYPOTHESIS
HYPOTHESIS         ACCEPTANCE INTERVAL    CONCLUSION
MU1 <> MU2         (0,0.025) (0.975,1)    ACCEPT
MU1 <  MU2         (0,0.05)               ACCEPT
MU1 >  MU2         (0.95,1)               REJECT

Interpretation of Sample Output
We are testing the hypothesis that the population means are equal for the two samples. The output is divided into five sections.
1. The first section prints the sample statistics for sample one used in the computation of the t-test.
2. The second section prints the sample statistics for sample two used in the computation of the t-test.
3. The third section prints the pooled standard deviation, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the acceptance interval is (0, 1 - \alpha), the acceptance interval for a two-tailed test is (\alpha/2, 1 - \alpha/2), and the acceptance interval for a lower one-tailed test is (\alpha, 1).
4. The fourth section prints the pooled standard deviation, the difference in the means, the t-test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the t-test statistic under the assumption that the standard deviations are not equal. The t-test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section five. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1 - \alpha, 1), the alternative hypothesis acceptance interval for a lower one-tailed test is (0, \alpha), and the alternative hypothesis acceptance interval for a two-tailed test is (1 - \alpha/2, 1) or (0, \alpha/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.
5. The fifth section prints the conclusions for a 95% test under the assumption that the standard deviations are not equal since a 95% test is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section four. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the t-test statistic cdf value printed in section four. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0,0.05) and (0.95,1), (0, 0.10), and (0.90,1).

Output from other statistical software may look somewhat different from the above output.

Questions
Two-sample t-tests can be used to answer the following questions:
1. Is process 1 equivalent to process 2?
2. Is the new process better than the current process?
3. Is the new process better than the current process by at least some pre-determined threshold amount?

Related Techniques
Confidence Limits for the Mean
Analysis of Variance

Case Study
Ceramic strength data.

Software
Two-sample t-tests are available in just about all general purpose statistical software programs, including Dataplot.
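The following sketch shows how a comparable two-sample t-test could be run in Python with SciPy (an assumption on our part; the output above is from Dataplot). The two arrays are just the first ten rows of the AUTO83B.DAT listing in Section 1.3.5.3.1, used here as a small illustration.

    import numpy as np
    from scipy import stats

    us_mpg    = np.array([18, 15, 18, 16, 17, 15, 14, 14, 14, 15])
    japan_mpg = np.array([24, 27, 27, 25, 31, 35, 24, 19, 28, 23])

    # Pooled version (equal variances assumed) and Welch version (not assumed).
    t_pooled, p_pooled = stats.ttest_ind(us_mpg, japan_mpg, equal_var=True)
    t_welch,  p_welch  = stats.ttest_ind(us_mpg, japan_mpg, equal_var=False)
    print("pooled:", t_pooled, p_pooled)
    print("welch :", t_welch, p_welch)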

18 19
14 32
14 34
14 26
14 30
1. Exploratory Data Analysis
12 22
1.3. EDA Techniques
1.3.5. Quantitative Techniques
13 22
1.3.5.3. Two-Sample t-Test for Equal Means 13 33
18 39
22 36
1.3.5.3.1. Data Used for Two-Sample t-Test 19 28
18 27
23 21
Data Used The following is the data used for the two-sample t-test example. The 26 24
for first column is miles per gallon for U.S. cars and the second column is 25 30
Two-Sample miles per gallon for Japanese cars. For the t-test example, rows with the 20 34
t-Test second column equal to -999 were deleted. 21 32
Example 13 38
18 24 14 37
15 27 15 30
18 27 14 31
16 25 17 37
17 31 11 32
15 35 13 47
14 24 12 41
14 19 13 45
14 28 15 34
15 23 13 33
15 27 13 24
14 20 14 32
15 22 22 39
14 18 28 35
22 20 13 32
18 31 14 37
21 32 13 38
21 31 14 34
10 32 15 34
10 24 12 32
11 26 13 33
9 29 13 32
28 24 14 25
25 24 13 24
19 33 12 37
16 33 13 31
17 32 18 36
19 28 16 36

18 34 20 -999
18 38 23 -999
23 32 18 -999
11 38 19 -999
12 32 25 -999
13 -999 26 -999
12 -999 18 -999
18 -999 16 -999
21 -999 16 -999
19 -999 15 -999
21 -999 22 -999
15 -999 22 -999
16 -999 24 -999
15 -999 23 -999
11 -999 29 -999
20 -999 25 -999
21 -999 20 -999
19 -999 18 -999
15 -999 19 -999
26 -999 18 -999
25 -999 27 -999
16 -999 13 -999
16 -999 17 -999
18 -999 13 -999
16 -999 13 -999
13 -999 13 -999
14 -999 30 -999
14 -999 26 -999
14 -999 18 -999
28 -999 17 -999
19 -999 16 -999
18 -999 15 -999
15 -999 18 -999
15 -999 21 -999
16 -999 19 -999
15 -999 19 -999
16 -999 16 -999
14 -999 16 -999
17 -999 16 -999
16 -999 16 -999
15 -999 25 -999
18 -999 26 -999
21 -999 31 -999
20 -999 34 -999
13 -999 36 -999
23 -999 20 -999

19 -999 24 -999
20 -999 19 -999
19 -999 28 -999
21 -999 24 -999
20 -999 27 -999
25 -999 27 -999
21 -999 26 -999
19 -999 24 -999
21 -999 30 -999
21 -999 39 -999
19 -999 35 -999
18 -999 34 -999
19 -999 30 -999
18 -999 22 -999
18 -999 27 -999
18 -999 20 -999
30 -999 18 -999
31 -999 28 -999
23 -999 27 -999
24 -999 34 -999
22 -999 31 -999
20 -999 29 -999
22 -999 27 -999
20 -999 24 -999
21 -999 23 -999
17 -999 38 -999
18 -999 36 -999
17 -999 25 -999
18 -999 38 -999
17 -999 26 -999
16 -999 22 -999
19 -999 36 -999
19 -999 27 -999
36 -999 27 -999
27 -999 32 -999
23 -999 28 -999
24 -999 31 -999
34 -999
35 -999
28 -999
29 -999
27 -999
34 -999
32 -999
28 -999
26 -999

1.3.5.4. One-Factor ANOVA

Purpose: Test for Equal Means Across Groups
One factor analysis of variance (Snedecor and Cochran, 1989) is a special case of analysis of variance (ANOVA), for one factor of interest, and a generalization of the two-sample t-test. The two-sample t-test is used to decide whether two groups (levels) of a factor have the same mean. One-way analysis of variance generalizes this to levels where k, the number of levels, is greater than or equal to 2.

For example, data collected on, say, five instruments have one factor (instruments) at five levels. The ANOVA tests whether instruments have a significant effect on the results.

Definition
The Product and Process Comparisons chapter (chapter 7) contains a more extensive discussion of 1-factor ANOVA, including the details for the mathematical computations of one-way analysis of variance.

The model for the analysis of variance can be stated in two mathematically equivalent ways. In the following discussion, each level of each factor is called a cell. For the one-way case, a cell and a level are equivalent since there is only one factor. In the following, the subscript i refers to the level and the subscript j refers to the observation within a level. For example, Y23 refers to the third observation in the second level.

The first model is

    Y_{ij} = \mu_i + E_{ij}

This model decomposes the response into a mean for each cell and an error term. The analysis of variance provides estimates for each cell mean. These estimated cell means are the predicted values of the model and the differences between the response variable and the estimated cell means are the residuals. That is

    \hat{Y}_{ij} = \hat{\mu}_i
    R_{ij} = Y_{ij} - \hat{\mu}_i

The second model is

    Y_{ij} = \mu + \alpha_i + E_{ij}

This model decomposes the response into an overall (grand) mean, the effect of the ith factor level, and an error term. The analysis of variance provides estimates of the grand mean and the effect of the ith factor level. The predicted values and the residuals of the model are

    \hat{Y}_{ij} = \hat{\mu} + \hat{\alpha}_i
    R_{ij} = Y_{ij} - \hat{\mu} - \hat{\alpha}_i

The distinction between these models is that the second model divides the cell mean into an overall mean and the effect of the ith factor level. This second model makes the factor effect more explicit, so we will emphasize this approach.

Model Validation
Note that the ANOVA model assumes that the error term, Eij, should follow the assumptions for a univariate measurement process. That is, after performing an analysis of variance, the model should be validated by analyzing the residuals.

Sample Output
Dataplot generated the following output for the one-way analysis of variance from the GEAR.DAT data set.

NUMBER OF OBSERVATIONS = 100
NUMBER OF FACTORS = 1
NUMBER OF LEVELS FOR FACTOR 1 = 10
BALANCED CASE
RESIDUAL STANDARD DEVIATION = 0.59385783970E-02
RESIDUAL DEGREES OF FREEDOM = 90
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.59385774657E-02
REPLICATION DEGREES OF FREEDOM = 90
NUMBER OF DISTINCT CELLS = 10

*****************
* ANOVA TABLE   *
*****************

SOURCE             DF   SUM OF SQUARES   MEAN SQUARE   F STATISTIC   F CDF     SIG
-------------------------------------------------------------------------------
TOTAL (CORRECTED)  99       0.003903       0.000039
-------------------------------------------------------------------------------
FACTOR 1            9       0.000729       0.000081        2.2969    97.734%    *
-------------------------------------------------------------------------------
RESIDUAL           90       0.003174       0.000035

RESIDUAL STANDARD DEVIATION = 0.00593857840
RESIDUAL DEGREES OF FREEDOM = 90
REPLICATION STANDARD DEVIATION = 0.00593857747
REPLICATION DEGREES OF FREEDOM = 90

****************
* ESTIMATION   *
****************

GRAND MEAN = 0.99764001369E+00
GRAND STANDARD DEVIATION = 0.62789078802E-02

LEVEL-ID           NI      MEAN       EFFECT    SD(EFFECT)
--------------------------------------------------------------------
FACTOR 1--    1.00000     10.    0.99800     0.00036     0.00178
  --          2.00000     10.    0.99910     0.00146     0.00178
  --          3.00000     10.    0.99540    -0.00224     0.00178
  --          4.00000     10.    0.99820     0.00056     0.00178
  --          5.00000     10.    0.99190    -0.00574     0.00178
  --          6.00000     10.    0.99880     0.00116     0.00178
  --          7.00000     10.    1.00150     0.00386     0.00178

  --          8.00000     10.    1.00040     0.00276     0.00178
  --          9.00000     10.    0.99830     0.00066     0.00178
  --         10.00000     10.    0.99480    -0.00284     0.00178

MODEL                        RESIDUAL STANDARD DEVIATION
-------------------------------------------------------
CONSTANT ONLY--                        0.0062789079
CONSTANT & FACTOR 1 ONLY--             0.0059385784

Interpretation of Sample Output
The output is divided into three sections.
1. The first section prints the number of observations (100), the number of factors (1), and the number of levels for each factor (10 levels for factor 1). It also prints some overall summary statistics. In particular, the residual standard deviation is 0.0059. The smaller the residual standard deviation, the more we have accounted for the variance in the data.
2. The second section prints an ANOVA table. The ANOVA table decomposes the variance into the following component sum of squares:
   ❍ Total sum of squares. The degrees of freedom for this entry is the number of observations minus one.
   ❍ Sum of squares for the factor. The degrees of freedom for this entry is the number of levels minus one. The mean square is the sum of squares divided by the number of degrees of freedom.
   ❍ Residual sum of squares. The degrees of freedom is the total degrees of freedom minus the factor degrees of freedom. The mean square is the sum of squares divided by the number of degrees of freedom.
   That is, it summarizes how much of the variance in the data (total sum of squares) is accounted for by the factor effect (factor sum of squares) and how much is random error (residual sum of squares). Ideally, we would like most of the variance to be explained by the factor effect. The ANOVA table provides a formal F test for the factor effect. The F-statistic is the mean square for the factor divided by the mean square for the error. This statistic follows an F distribution with (k-1) and (N-k) degrees of freedom. If the F CDF column for the factor effect is greater than 95%, then the factor is significant at the 5% level.
3. The third section prints an estimation section. It prints an overall mean and overall standard deviation. Then for each level of each factor, it prints the number of observations, the mean for the observations of each cell (\hat{\mu}_i in the above terminology), the factor effect (\hat{\alpha}_i in the above terminology), and the standard deviation of the factor effect. Finally, it prints the residual standard deviation for the various possible models. For the one-way ANOVA, the two models are the constant model, i.e.,

       Y_{ij} = \mu + E_{ij}

   and the model with a factor effect

       Y_{ij} = \mu + \alpha_i + E_{ij}

For these data, including the factor effect reduces the residual standard deviation from 0.00623 to 0.0059. That is, although the factor is statistically significant, it has minimal improvement over a simple constant model. This is because the factor is just barely significant.

Output from other statistical software may look somewhat different from the above output.

In addition to the quantitative ANOVA output, it is recommended that any analysis of variance be complemented with model validation. At a minimum, this should include
1. A run sequence plot of the residuals.
2. A normal probability plot of the residuals.
3. A scatter plot of the predicted values against the residuals.

Question
The analysis of variance can be used to answer the following question
   ● Are means the same across groups in the data?

Importance
The analysis of uncertainty depends on whether the factor significantly affects the outcome.

Related Techniques
Two-sample t-test
Multi-factor analysis of variance
Regression
Box plot

Software
Most general purpose statistical software programs, including Dataplot, can generate an analysis of variance.
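A minimal sketch of a one-way ANOVA in Python with SciPy (an assumption on our part; the output above was generated by Dataplot). The three groups are the first ten observations from batches 1-3 of the GEAR.DAT listing in Section 1.3.5.8.1, used purely as an illustration.

    from scipy import stats

    batch1 = [1.006, 0.996, 0.998, 1.000, 0.992, 0.993, 1.002, 0.999, 0.994, 1.000]
    batch2 = [0.998, 1.006, 1.000, 1.002, 0.997, 0.998, 0.996, 1.000, 1.006, 0.988]
    batch3 = [0.991, 0.987, 0.997, 0.999, 0.995, 0.994, 1.000, 0.999, 0.996, 0.996]

    f_stat, p_value = stats.f_oneway(batch1, batch2, batch3)

    # The Dataplot table reports the F CDF; the factor is significant at the
    # 5% level when the CDF exceeds 95% (equivalently, when p_value < 0.05).
    print(f_stat, 1.0 - p_value)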

1.3.5.5. Multi-factor Analysis of Variance

Purpose: Detect significant factors
The analysis of variance (ANOVA) (Neter, Wasserman, and Kutner, 1990) is used to detect significant factors in a multi-factor model. In the multi-factor model, there is a response (dependent) variable and one or more factor (independent) variables. This is a common model in designed experiments where the experimenter sets the values for each of the factor variables and then measures the response variable.

Each factor can take on a certain number of values. These are referred to as the levels of a factor. The number of levels can vary between factors. For designed experiments, the number of levels for a given factor tends to be small. Each factor and level combination is a cell. Balanced designs are those in which the cells have an equal number of observations and unbalanced designs are those in which the number of observations varies among cells. It is customary to use balanced designs in designed experiments.

Definition
The Product and Process Comparisons chapter (chapter 7) contains a more extensive discussion of 2-factor ANOVA, including the details for the mathematical computations.

The model for the analysis of variance can be stated in two mathematically equivalent ways. We explain the model for a two-way ANOVA (the concepts are the same for additional factors). In the following discussion, each combination of factors and levels is called a cell. In the following, the subscript i refers to the level of factor 1, j refers to the level of factor 2, and the subscript k refers to the kth observation within the (i,j)th cell. For example, Y235 refers to the fifth observation in the second level of factor 1 and the third level of factor 2.

The first model is

    Y_{ijk} = \mu_{ij} + E_{ijk}

This model decomposes the response into a mean for each cell and an error term. The analysis of variance provides estimates for each cell mean. These cell means are the predicted values of the model and the differences between the response variable and the estimated cell means are the residuals. That is

    \hat{Y}_{ijk} = \hat{\mu}_{ij}
    R_{ijk} = Y_{ijk} - \hat{\mu}_{ij}

The second model is

    Y_{ijk} = \mu + \alpha_i + \beta_j + E_{ijk}

This model decomposes the response into an overall (grand) mean, factor effects (\alpha_i and \beta_j represent the effects of the ith level of the first factor and the jth level of the second factor, respectively), and an error term. The analysis of variance provides estimates of the grand mean and the factor effects. The predicted values and the residuals of the model are

    \hat{Y}_{ijk} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j
    R_{ijk} = Y_{ijk} - \hat{\mu} - \hat{\alpha}_i - \hat{\beta}_j

The distinction between these models is that the second model divides the cell mean into an overall mean and factor effects. This second model makes the factor effect more explicit, so we will emphasize this approach.

Model Validation
Note that the ANOVA model assumes that the error term, Eijk, should follow the assumptions for a univariate measurement process. That is, after performing an analysis of variance, the model should be validated by analyzing the residuals.

Sample Output
Dataplot generated the following ANOVA output for the JAHANMI2.DAT data set:

**********************************
**********************************
**  4-WAY ANALYSIS OF VARIANCE  **
**********************************
**********************************

NUMBER OF OBSERVATIONS = 480
NUMBER OF FACTORS = 4
NUMBER OF LEVELS FOR FACTOR 1 = 2
NUMBER OF LEVELS FOR FACTOR 2 = 2
NUMBER OF LEVELS FOR FACTOR 3 = 2
NUMBER OF LEVELS FOR FACTOR 4 = 2
BALANCED CASE
RESIDUAL STANDARD DEVIATION = 0.63057727814E+02
RESIDUAL DEGREES OF FREEDOM = 475
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.61890106201E+02
REPLICATION DEGREES OF FREEDOM = 464
NUMBER OF DISTINCT CELLS = 16

*****************
* ANOVA TABLE   *
*****************

SOURCE             DF   SUM OF SQUARES      MEAN SQUARE    F STATISTIC   F CDF      SIG
-------------------------------------------------------------------------------
TOTAL (CORRECTED) 479   2668446.000000      5570.868652
-------------------------------------------------------------------------------
FACTOR 1            1     26672.726562     26672.726562         6.7080    99.011%   **
FACTOR 2            1     11524.053711     11524.053711         2.8982    91.067%
FACTOR 3            1     14380.633789     14380.633789         3.6166    94.219%
FACTOR 4            1    727143.125000    727143.125000       182.8703   100.000%   **
-------------------------------------------------------------------------------
RESIDUAL          475   1888731.500000      3976.276855

RESIDUAL STANDARD DEVIATION = 63.05772781

RESIDUAL DEGREES OF FREEDOM = 475
REPLICATION STANDARD DEVIATION = 61.89010620
REPLICATION DEGREES OF FREEDOM = 464
LACK OF FIT F RATIO = 2.6447 = THE 99.7269% POINT OF THE
F DISTRIBUTION WITH 11 AND 464 DEGREES OF FREEDOM

****************
* ESTIMATION   *
****************

GRAND MEAN = 0.65007739258E+03
GRAND STANDARD DEVIATION = 0.74638252258E+02

LEVEL-ID            NI       MEAN        EFFECT    SD(EFFECT)
--------------------------------------------------------------------
FACTOR 1--   -1.00000     240.    657.53168     7.45428     2.87818
  --          1.00000     240.    642.62286    -7.45453     2.87818
FACTOR 2--   -1.00000     240.    645.17755    -4.89984     2.87818
  --          1.00000     240.    654.97723     4.89984     2.87818
FACTOR 3--   -1.00000     240.    655.55084     5.47345     2.87818
  --          1.00000     240.    644.60376    -5.47363     2.87818
FACTOR 4--    1.00000     240.    688.99890    38.92151     2.87818
  --          2.00000     240.    611.15594   -38.92145     2.87818

MODEL                        RESIDUAL STANDARD DEVIATION
-------------------------------------------------------
CONSTANT ONLY--                        74.6382522583
CONSTANT & FACTOR 1 ONLY--             74.3419036865
CONSTANT & FACTOR 2 ONLY--             74.5548019409
CONSTANT & FACTOR 3 ONLY--             74.5147094727
CONSTANT & FACTOR 4 ONLY--             63.7284545898
CONSTANT & ALL 4 FACTORS --            63.0577278137

Interpretation of Sample Output
The output is divided into three sections.
1. The first section prints the number of observations (480), the number of factors (4), and the number of levels for each factor (2 levels for each factor). It also prints some overall summary statistics. In particular, the residual standard deviation is 63.058. The smaller the residual standard deviation, the more we have accounted for the variance in the data.
2. The second section prints an ANOVA table. The ANOVA table decomposes the variance into the following component sum of squares:
   ❍ Total sum of squares. The degrees of freedom for this entry is the number of observations minus one.
   ❍ Sum of squares for each of the factors. The degrees of freedom for these entries are the number of levels for the factor minus one. The mean square is the sum of squares divided by the number of degrees of freedom.
   ❍ Residual sum of squares. The degrees of freedom is the total degrees of freedom minus the sum of the factor degrees of freedom. The mean square is the sum of squares divided by the number of degrees of freedom.
   That is, it summarizes how much of the variance in the data (total sum of squares) is accounted for by the factor effects (factor sum of squares) and how much is random error (residual sum of squares). Ideally, we would like most of the variance to be explained by the factor effects. The ANOVA table provides a formal F test for the factor effects. The F-statistic is the mean square for the factor divided by the mean square for the error. This statistic follows an F distribution with (k-1) and (N-k) degrees of freedom where k is the number of levels for the given factor. If the F CDF column for the factor effect is greater than 95%, then the factor is significant at the 5% level. Here, we see that the size of the effect of factor 4 dominates the size of the other effects. The F test shows that factors one and four are significant at the 1% level while factors two and three are not significant at the 5% level.
3. The third section is an estimation section. It prints an overall mean and overall standard deviation. Then for each level of each factor, it prints the number of observations, the mean for the observations of each cell (\hat{\mu}_{ij} in the above terminology), the factor effects (\hat{\alpha}_i and \hat{\beta}_j in the above terminology), and the standard deviation of the factor effect. Finally, it prints the residual standard deviation for the various possible models. For the four-way ANOVA here, it prints the constant model, a model with each factor individually, and the model with all four factors included.

For these data, we see that including factor 4 has a significant impact on the residual standard deviation (63.73 when only the factor 4 effect is included compared to 63.058 when all four factors are included).

Output from other statistical software may look somewhat different from the above output.

In addition to the quantitative ANOVA output, it is recommended that any analysis of variance be complemented with model validation. At a minimum, this should include
1. A run sequence plot of the residuals.
2. A normal probability plot of the residuals.
3. A scatter plot of the predicted values against the residuals.

Questions
The analysis of variance can be used to answer the following questions:
1. Do any of the factors have a significant effect?
2. Which is the most important factor?
3. Can we account for most of the variability in the data?

Related Techniques
One-factor analysis of variance
Two-sample t-test
Box plot
Block plot
Dex mean plot

Case Study
The quantitative ANOVA approach can be contrasted with the more graphical EDA approach in the ceramic strength case study.

Software
Most general purpose statistical software programs, including Dataplot, can perform multi-factor analysis of variance.


1.3.5.6. Measures of Scale

Scale, Variability, or Spread
A fundamental task in many statistical analyses is to characterize the spread, or variability, of a data set. Measures of scale are simply attempts to estimate this variability.

When assessing the variability of a data set, there are two key components:
1. How spread out are the data values near the center?
2. How spread out are the tails?
Different numerical summaries will give different weight to these two elements. The choice of scale estimator is often driven by which of these components you want to emphasize.

The histogram is an effective graphical technique for showing both of these components of the spread.

Definitions of Variability
For univariate data, there are several common numerical measures of the spread:
1. variance - the variance is defined as

       s^2 = \frac{\sum_{i=1}^{N}(Y_i - \bar{Y})^2}{N - 1}

   where \bar{Y} is the mean of the data.
   The variance is roughly the arithmetic average of the squared distance from the mean. Squaring the distance from the mean has the effect of giving greater weight to values that are further from the mean. For example, a point 2 units from the mean adds 4 to the above sum while a point 10 units from the mean adds 100 to the sum. Although the variance is intended to be an overall measure of spread, it can be greatly affected by the tail behavior.
2. standard deviation - the standard deviation is the square root of the variance. That is,

       s = \sqrt{\frac{\sum_{i=1}^{N}(Y_i - \bar{Y})^2}{N - 1}}


   The standard deviation restores the units of the spread to the original data units (the variance squares the units).
3. range - the range is the largest value minus the smallest value in a data set. Note that this measure is based only on the lowest and highest extreme values in the sample. The spread near the center of the data is not captured at all.
4. average absolute deviation - the average absolute deviation (AAD) is defined as

       AAD = \frac{\sum_{i=1}^{N}|Y_i - \bar{Y}|}{N}

   where \bar{Y} is the mean of the data and |Y| is the absolute value of Y. This measure does not square the distance from the mean, so it is less affected by extreme observations than are the variance and standard deviation.
5. median absolute deviation - the median absolute deviation (MAD) is defined as

       MAD = \mathrm{median}(|Y_i - \tilde{Y}|)

   where \tilde{Y} is the median of the data and |Y| is the absolute value of Y. This is a variation of the average absolute deviation that is even less affected by extremes in the tail because the data in the tails have less influence on the calculation of the median than they do on the mean.
6. interquartile range - this is the value of the 75th percentile minus the value of the 25th percentile. This measure of scale attempts to measure the variability of points near the center.

In summary, the variance, standard deviation, average absolute deviation, and median absolute deviation measure both aspects of the variability; that is, the variability near the center and the variability in the tails. They differ in that the average absolute deviation and median absolute deviation do not give undue weight to the tail behavior. On the other hand, the range only uses the two most extreme points and the interquartile range only uses the middle portion of the data.

Why Different Measures?
The following example helps to clarify why these alternative definitions of spread are useful and necessary.

This plot shows histograms for 10,000 random numbers generated from a normal, a double exponential, a Cauchy, and a Tukey-Lambda distribution.

Normal Distribution
The first histogram is a sample from a normal distribution. The standard deviation is 0.997, the median absolute deviation is 0.681, and the range is 7.87.

The normal distribution is a symmetric distribution with well-behaved tails and a single peak at the center of the distribution. By symmetric, we mean that the distribution can be folded about an axis so that the two sides coincide. That is, it behaves the same to the left and right of some center point. In this case, the median absolute deviation is a bit less than the standard deviation due to the downweighting of the tails. The range of a little less than 8 indicates the extreme values fall within about 4 standard deviations of the mean. If a histogram or normal probability plot indicates that your data are approximated well by a normal distribution, then it is reasonable to use the standard deviation as the spread estimator.


Double Exponential Distribution
The second histogram is a sample from a double exponential distribution. The standard deviation is 1.417, the median absolute deviation is 0.706, and the range is 17.556.

Comparing the double exponential and the normal histograms shows that the double exponential has a stronger peak at the center, decays more rapidly near the center, and has much longer tails. Due to the longer tails, the standard deviation tends to be inflated compared to the normal. On the other hand, the median absolute deviation is only slightly larger than it is for the normal data. The longer tails are clearly reflected in the value of the range, which shows that the extremes fall about 12 standard deviations from the mean compared to about 4 for the normal data.

Cauchy Distribution
The third histogram is a sample from a Cauchy distribution. The standard deviation is 998.389, the median absolute deviation is 1.16, and the range is 118,953.6.

The Cauchy distribution is a symmetric distribution with heavy tails and a single peak at the center of the distribution. The Cauchy distribution has the interesting property that collecting more data does not provide a more accurate estimate for the mean or standard deviation. That is, the sampling distributions of the mean and standard deviation are equivalent to the sampling distribution of the original data. That means that for the Cauchy distribution the standard deviation is useless as a measure of the spread. From the histogram, it is clear that just about all the data are between about -5 and 5. However, a few very extreme values cause both the standard deviation and range to be extremely large. However, the median absolute deviation is only slightly larger than it is for the normal distribution. In this case, the median absolute deviation is clearly the better measure of spread.

Although the Cauchy distribution is an extreme case, it does illustrate the importance of heavy tails in measuring the spread. Extreme values in the tails can distort the standard deviation. However, these extreme values do not distort the median absolute deviation since the median absolute deviation is based on ranks. In general, for data with extreme values in the tails, the median absolute deviation or interquartile range can provide a more stable estimate of spread than the standard deviation.

Tukey-Lambda Distribution
The fourth histogram is a sample from a Tukey lambda distribution with shape parameter \lambda = 1.2. The standard deviation is 0.49, the median absolute deviation is 0.427, and the range is 1.666.

The Tukey lambda distribution has a range limited to -1/\lambda to 1/\lambda. That is, it has truncated tails. In this case the standard deviation and median absolute deviation have closer values than for the other three examples which have significant tails.

Robustness
Tukey and Mosteller defined two types of robustness where robustness is a lack of susceptibility to the effects of nonnormality.
1. Robustness of validity means that the confidence intervals for a measure of the population spread (e.g., the standard deviation) have a 95% chance of covering the true value (i.e., the population value) of that measure of spread regardless of the underlying distribution.
2. Robustness of efficiency refers to high effectiveness in the face of non-normal tails. That is, confidence intervals for the measure of spread tend to be almost as narrow as the best that could be done if we knew the true shape of the distribution.

The standard deviation is an example of an estimator that is the best we can do if the underlying distribution is normal. However, it lacks robustness of validity. That is, confidence intervals based on the standard deviation tend to lack precision if the underlying distribution is in fact not normal.

The median absolute deviation and the interquartile range are estimates of scale that have robustness of validity. However, they are not particularly strong for robustness of efficiency.

If histograms and probability plots indicate that your data are in fact reasonably approximated by a normal distribution, then it makes sense to use the standard deviation as the estimate of scale. However, if your data are not normal, and in particular if there are long tails, then using an alternative measure such as the median absolute deviation, average absolute deviation, or interquartile range makes sense. The range is used in some applications, such as quality control, for its simplicity. In addition, comparing the range to the standard deviation gives an indication of the spread of the data in the tails.

Since the range is determined by the two most extreme points in the data set, we should be cautious about its use for large values of N.

Tukey and Mosteller give a scale estimator that has both robustness of


validity and robustness of efficiency. However, it is more complicated and we do not give the formula here.

Software
Most general purpose statistical software programs, including Dataplot, can generate at least some of the measures of scale discussed above.
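The following sketch computes several of the scale measures discussed in this section for a heavy-tailed sample, using Python with NumPy and SciPy (an assumption on our part; the Handbook's own calculations use Dataplot).

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y = rng.standard_cauchy(10000)   # heavy-tailed sample, as in the example above

    sd   = np.std(y, ddof=1)                      # standard deviation
    aad  = np.mean(np.abs(y - np.mean(y)))        # average absolute deviation
    mad  = stats.median_abs_deviation(y)          # median absolute deviation
    iqr  = stats.iqr(y)                           # interquartile range
    rnge = np.ptp(y)                              # largest minus smallest

    print(f"sd={sd:.2f}  AAD={aad:.2f}  MAD={mad:.2f}  IQR={iqr:.2f}  range={rnge:.2f}")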
1.3.5.7. Bartlett's Test

Purpose: Test for Homogeneity of Variances
Bartlett's test (Snedecor and Cochran, 1983) is used to test if k samples have equal variances. Equal variances across samples is called homogeneity of variances. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Bartlett test can be used to verify that assumption.

Bartlett's test is sensitive to departures from normality. That is, if your samples come from non-normal distributions, then Bartlett's test may simply be testing for non-normality. The Levene test is an alternative to the Bartlett test that is less sensitive to departures from normality.

Definition
The Bartlett test is defined as:
H0: \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2
Ha: \sigma_i^2 \neq \sigma_j^2 for at least one pair (i,j).
Test Statistic: The Bartlett test statistic is designed to test for equality of variances across groups against the alternative that variances are unequal for at least two groups.

    T = \frac{(N-k)\ln(s_p^2) - \sum_{i=1}^{k}(N_i - 1)\ln(s_i^2)}{1 + \frac{1}{3(k-1)}\left(\sum_{i=1}^{k}\frac{1}{N_i - 1} - \frac{1}{N-k}\right)}

In the above, s_i^2 is the variance of the ith group, N is the total sample size, N_i is the sample size of the ith group, k is the number of groups, and s_p^2 is the pooled variance. The pooled variance is a weighted average of the group variances and is defined as:

    s_p^2 = \frac{\sum_{i=1}^{k}(N_i - 1)s_i^2}{N - k}

Significance Level: \alpha


Critical Region: The variances are judged to be unequal if

    T > \chi^2_{1-\alpha, \, k-1}

where \chi^2_{1-\alpha, \, k-1} is the upper critical value of the chi-square distribution with k - 1 degrees of freedom and a significance level of \alpha.

In the above formulas for the critical regions, the Handbook follows the convention that \chi^2_{1-\alpha} is the upper critical value from the chi-square distribution and \chi^2_{\alpha} is the lower critical value from the chi-square distribution. Note that this is the opposite of some texts and software programs. In particular, Dataplot uses the opposite convention.

An alternate definition (Dixon and Massey, 1969) is based on an approximation to the F distribution. This definition is given in the Product and Process Comparisons chapter (chapter 7).

Sample Output
Dataplot generated the following output for Bartlett's test using the GEAR.DAT data set:

BARTLETT TEST
(STANDARD DEFINITION)
NULL HYPOTHESIS UNDER TEST--ALL SIGMA(I) ARE EQUAL

TEST:
DEGREES OF FREEDOM = 9.000000
TEST STATISTIC VALUE = 20.78580
CUTOFF: 95% PERCENT POINT = 16.91898
CUTOFF: 99% PERCENT POINT = 21.66600

CHI-SQUARE CDF VALUE = 0.986364

NULL                NULL HYPOTHESIS        NULL HYPOTHESIS
HYPOTHESIS          ACCEPTANCE INTERVAL    CONCLUSION
ALL SIGMA EQUAL     (0.000,0.950)          REJECT

Interpretation of Sample Output
We are testing the hypothesis that the group variances are all equal. The output is divided into two sections.
1. The first section prints the value of the Bartlett test statistic, the degrees of freedom (k-1), and the upper critical value of the chi-square distribution corresponding to significance levels of 0.05 (the 95% percent point) and 0.01 (the 99% percent point). We reject the null hypothesis at that significance level if the value of the Bartlett test statistic is greater than the corresponding critical value.
2. The second section prints the conclusion for a 95% test.
Output from other statistical software may look somewhat different from the above output.

Question
Bartlett's test can be used to answer the following question:
   ● Is the assumption of equal variances valid?

Importance
Bartlett's test is useful whenever the assumption of equal variances is made. In particular, this assumption is made for the frequently used one-way analysis of variance. In this case, Bartlett's or Levene's test should be applied to verify the assumption.

Related Techniques
Standard Deviation Plot
Box Plot
Levene Test
Chi-Square Test
Analysis of Variance

Case Study
Heat flow meter data

Software
The Bartlett test is available in many general purpose statistical software programs, including Dataplot.
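A minimal sketch of Bartlett's test in Python with SciPy (an assumption on our part; the output above is from Dataplot). The three groups are the first ten observations from batches 1-3 of the GEAR.DAT listing in Section 1.3.5.8.1.

    from scipy import stats

    batch1 = [1.006, 0.996, 0.998, 1.000, 0.992, 0.993, 1.002, 0.999, 0.994, 1.000]
    batch2 = [0.998, 1.006, 1.000, 1.002, 0.997, 0.998, 0.996, 1.000, 1.006, 0.988]
    batch3 = [0.991, 0.987, 0.997, 0.999, 0.995, 0.994, 1.000, 0.999, 0.996, 0.996]

    stat, p_value = stats.bartlett(batch1, batch2, batch3)

    # Reject equal variances at the 5% level if the statistic exceeds the
    # chi-square critical value with k - 1 degrees of freedom
    # (equivalently, if p_value < 0.05).
    critical = stats.chi2.ppf(0.95, df=2)
    print(stat, p_value, critical)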


1.3.5.8. Chi-Square Test for the Standard Deviation

Purpose: Test if standard deviation is equal to a specified value
A chi-square test (Snedecor and Cochran, 1983) can be used to test if the standard deviation of a population is equal to a specified value. This test can be either a two-sided test or a one-sided test. The two-sided version tests against the alternative that the true standard deviation is either less than or greater than the specified value. The one-sided version only tests in one direction. The choice of a two-sided or one-sided test is determined by the problem. For example, if we are testing a new process, we may only be concerned if its variability is greater than the variability of the current process.

Definition
The chi-square hypothesis test is defined as:
H0: \sigma = \sigma_0
Ha: \sigma < \sigma_0 for a lower one-tailed test
    \sigma > \sigma_0 for an upper one-tailed test
    \sigma \neq \sigma_0 for a two-tailed test
Test Statistic:

    T = (N - 1)\left(\frac{s}{\sigma_0}\right)^2

where N is the sample size and s is the sample standard deviation. The key element of this formula is the ratio s/\sigma_0, which compares the sample standard deviation to the target standard deviation. The more this ratio deviates from 1, the more likely we are to reject the null hypothesis.
Significance Level: \alpha.
Critical Region: Reject the null hypothesis that the standard deviation is a specified value, \sigma_0, if

    T > \chi^2_{1-\alpha, \, N-1} for an upper one-tailed alternative
    T < \chi^2_{\alpha, \, N-1} for a lower one-tailed alternative
    T < \chi^2_{\alpha/2, \, N-1} or T > \chi^2_{1-\alpha/2, \, N-1} for a two-tailed test

where \chi^2 is the critical value of the chi-square distribution with N - 1 degrees of freedom.

In the above formulas for the critical regions, the Handbook follows the convention that \chi^2_{1-\alpha} is the upper critical value from the chi-square distribution and \chi^2_{\alpha} is the lower critical value from the chi-square distribution. Note that this is the opposite of some texts and software programs. In particular, Dataplot uses the opposite convention.

The formula for the hypothesis test can easily be converted to form an interval estimate for the standard deviation:

    \sqrt{\frac{(N-1)s^2}{\chi^2_{1-\alpha/2,\,N-1}}} \le \sigma \le \sqrt{\frac{(N-1)s^2}{\chi^2_{\alpha/2,\,N-1}}}

Sample Output
Dataplot generated the following output for a chi-square test from the GEAR.DAT data set:

CHI-SQUARED TEST
SIGMA0 = 0.1000000
NULL HYPOTHESIS UNDER TEST--STANDARD DEVIATION SIGMA = .1000000

SAMPLE:
NUMBER OF OBSERVATIONS = 100
MEAN = 0.9976400
STANDARD DEVIATION S = 0.6278908E-02

TEST:
S/SIGMA0 = 0.6278908E-01
CHI-SQUARED STATISTIC = 0.3903044
DEGREES OF FREEDOM = 99.00000
CHI-SQUARED CDF VALUE = 0.000000

                      ALTERNATIVE-            ALTERNATIVE-
ALTERNATIVE-          HYPOTHESIS              HYPOTHESIS
HYPOTHESIS            ACCEPTANCE INTERVAL     CONCLUSION
SIGMA <> .1000000     (0,0.025), (0.975,1)    ACCEPT
SIGMA <  .1000000     (0,0.05)                ACCEPT
SIGMA >  .1000000     (0.95,1)                REJECT

Interpretation of Sample Output
We are testing the hypothesis that the population standard deviation is 0.1. The output is divided into three sections.
1. The first section prints the sample statistics used in the computation of the chi-square test.
2. The second section prints the chi-square test statistic value, the degrees of freedom, and the cumulative distribution function (cdf) value of the chi-square test statistic. The chi-square test statistic cdf value is an alternative way of expressing the critical value. This cdf value is compared to the acceptance intervals printed in section three. For an upper one-tailed test, the alternative hypothesis acceptance interval is (1 - \alpha, 1), the alternative hypothesis acceptance interval for a lower one-tailed test is (0, \alpha), and the alternative hypothesis acceptance interval for a two-tailed test is (1 - \alpha/2, 1) or (0, \alpha/2). Note that accepting the alternative hypothesis is equivalent to rejecting the null hypothesis.


3. The third section prints the conclusions for a 95% test since this is the most common case. Results are given in terms of the alternative hypothesis for the two-tailed test and for the one-tailed test in both directions. The alternative hypothesis acceptance interval column is stated in terms of the cdf value printed in section two. The last column specifies whether the alternative hypothesis is accepted or rejected. For a different significance level, the appropriate conclusion can be drawn from the chi-square test statistic cdf value printed in section two. For example, for a significance level of 0.10, the corresponding alternative hypothesis acceptance intervals are (0,0.05) and (0.95,1), (0, 0.10), and (0.90,1).

Output from other statistical software may look somewhat different from the above output.

Questions
The chi-square test can be used to answer the following questions:
1. Is the standard deviation equal to some pre-determined threshold value?
2. Is the standard deviation greater than some pre-determined threshold value?
3. Is the standard deviation less than some pre-determined threshold value?

Related Techniques
F Test
Bartlett Test
Levene Test

Software
The chi-square test for the standard deviation is available in many general purpose statistical software programs, including Dataplot.

1.3.5.8.1. Data Used for Chi-Square Test for the Standard Deviation

Data Used for Chi-Square Test for the Standard Deviation Example
The following are the data used for the chi-square test for the standard deviation example. The first column is gear diameter and the second column is batch number. Only the first column is used for this example.

1.006 1.000
0.996 1.000
0.998 1.000
1.000 1.000
0.992 1.000
0.993 1.000
1.002 1.000
0.999 1.000
0.994 1.000
1.000 1.000
0.998 2.000
1.006 2.000
1.000 2.000
1.002 2.000
0.997 2.000
0.998 2.000
0.996 2.000
1.000 2.000
1.006 2.000
0.988 2.000
0.991 3.000
0.987 3.000
0.997 3.000
0.999 3.000
0.995 3.000
0.994 3.000
1.000 3.000


0.999 3.000 1.000 8.000


0.996 3.000 1.002 8.000
0.996 3.000 0.996 8.000
1.005 4.000 0.998 8.000
1.002 4.000 0.996 8.000
0.994 4.000 1.002 8.000
1.000 4.000 1.006 8.000
0.995 4.000 1.002 9.000
0.994 4.000 0.998 9.000
0.998 4.000 0.996 9.000
0.996 4.000 0.995 9.000
1.002 4.000 0.996 9.000
0.996 4.000 1.004 9.000
0.998 5.000 1.004 9.000
0.998 5.000 0.998 9.000
0.982 5.000 0.999 9.000
0.990 5.000 0.991 9.000
1.002 5.000 0.991 10.000
0.984 5.000 0.995 10.000
0.996 5.000 0.984 10.000
0.993 5.000 0.994 10.000
0.980 5.000 0.997 10.000
0.996 5.000 0.997 10.000
1.009 6.000 0.991 10.000
1.013 6.000 0.998 10.000
1.009 6.000 1.004 10.000
0.997 6.000 0.997 10.000
0.988 6.000
1.002 6.000
0.995 6.000
0.998 6.000
0.981 6.000
0.996 6.000
0.990 7.000
1.004 7.000
0.996 7.000
1.001 7.000
0.998 7.000
1.000 7.000
1.018 7.000
1.010 7.000
0.996 7.000
1.002 7.000
0.998 8.000
1.000 8.000
1.006 8.000


1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.9. F-Test for Equality of Two Standard Deviations

Purpose: Test if standard deviations from two populations are equal.

An F-test ( Snedecor and Cochran, 1983) is used to test if the standard deviations of two
populations are equal. This test can be a two-tailed test or a one-tailed test. The
two-tailed version tests against the alternative that the standard deviations are not equal.
The one-tailed version only tests in one direction, that is the standard deviation from the
first population is either greater than or less than (but not both) the second population
standard deviation. The choice is determined by the problem. For example, if we are
testing a new process, we may only be interested in knowing if the new process is less
variable than the old process.

Definition:  The F hypothesis test is defined as:
    H0:  \( \sigma_1 = \sigma_2 \)
    Ha:  \( \sigma_1 < \sigma_2 \)   for a lower one-tailed test
         \( \sigma_1 > \sigma_2 \)   for an upper one-tailed test
         \( \sigma_1 \ne \sigma_2 \)   for a two-tailed test
    Test Statistic:  \( F = s_1^2 / s_2^2 \)
         where \( s_1^2 \) and \( s_2^2 \) are the sample variances. The more this ratio deviates
         from 1, the stronger the evidence for unequal population variances.
    Significance Level:  α

Critical Region:  The hypothesis that the two standard deviations are equal is rejected if

    \( F > F_{\alpha,\,N_1-1,\,N_2-1} \)   for an upper one-tailed test
    \( F < F_{1-\alpha,\,N_1-1,\,N_2-1} \)   for a lower one-tailed test
    \( F < F_{1-\alpha/2,\,N_1-1,\,N_2-1} \)  or  \( F > F_{\alpha/2,\,N_1-1,\,N_2-1} \)   for a two-tailed test

where \( F_{\cdot,\,N_1-1,\,N_2-1} \) is the critical value of the F distribution with \( N_1 - 1 \) and
\( N_2 - 1 \) degrees of freedom and a significance level of α.

In the above formulas for the critical regions, the Handbook follows the convention that
\( F_\alpha \) is the upper critical value from the F distribution and \( F_{1-\alpha} \) is the lower
critical value from the F distribution. Note that this is the opposite of the designation used
by some texts and software programs. In particular, Dataplot uses the opposite convention.

Sample Output:  Dataplot generated the following output for an F-test from the JAHANMI2.DAT
data set:

     F TEST
     NULL HYPOTHESIS UNDER TEST--SIGMA1 = SIGMA2
     ALTERNATIVE HYPOTHESIS UNDER TEST--SIGMA1 NOT EQUAL SIGMA2

     SAMPLE 1:
        NUMBER OF OBSERVATIONS      =        240
        MEAN                        =   688.9987
        STANDARD DEVIATION          =   65.54909

     SAMPLE 2:
        NUMBER OF OBSERVATIONS      =        240
        MEAN                        =   611.1559
        STANDARD DEVIATION          =   61.85425

     TEST:
        STANDARD DEV. (NUMERATOR)   =   65.54909
        STANDARD DEV. (DENOMINATOR) =   61.85425
        F TEST STATISTIC VALUE      =   1.123037
        DEG. OF FREEDOM (NUMER.)    =   239.0000
        DEG. OF FREEDOM (DENOM.)    =   239.0000
        F TEST STATISTIC CDF VALUE  =   0.814808

     NULL                   NULL HYPOTHESIS         NULL HYPOTHESIS
     HYPOTHESIS             ACCEPTANCE INTERVAL     CONCLUSION
     SIGMA1 = SIGMA2        (0.000,0.950)           ACCEPT
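As a rough cross-check of the numbers in the output above, the following Python sketch
(numpy and scipy are assumptions of this illustration, not part of the Handbook) computes the
F statistic and its cdf value for two samples:

    # Hedged sketch of the F-test for equality of two standard deviations (not Dataplot code).
    import numpy as np
    from scipy import stats

    def f_test_two_sd(y1, y2, alpha=0.05):
        y1, y2 = np.asarray(y1, float), np.asarray(y2, float)
        n1, n2 = len(y1), len(y2)
        f = y1.var(ddof=1) / y2.var(ddof=1)           # F = s1^2 / s2^2
        cdf = stats.f.cdf(f, dfn=n1 - 1, dfd=n2 - 1)  # F-test statistic cdf value
        # Two-tailed decision based on the cdf value.
        reject = (cdf < alpha / 2) or (cdf > 1 - alpha / 2)
        return {"F": f, "cdf": cdf, "reject_two_tailed": reject}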


Interpretation of Sample Output:  We are testing the hypothesis that the standard deviations for
sample one and sample two are equal. The output is divided into four sections.
   1. The first section prints the sample statistics for sample one used in the
      computation of the F-test.
   2. The second section prints the sample statistics for sample two used in the
      computation of the F-test.
   3. The third section prints the numerator and denominator standard deviations, the
      F-test statistic value, the degrees of freedom, and the cumulative distribution
      function (cdf) value of the F-test statistic. The F-test statistic cdf value is an
      alternative way of expressing the critical value. This cdf value is compared to the
      acceptance interval printed in section four. The acceptance interval for a
      two-tailed test is (0, 1 - α).
   4. The fourth section prints the conclusions for a 95% test since this is the most
      common case. Results are printed for an upper one-tailed test. The acceptance
      interval column is stated in terms of the cdf value printed in section three. The
      last column specifies whether the null hypothesis is accepted or rejected. For a
      different significance level, the appropriate conclusion can be drawn from the
      F-test statistic cdf value printed in section four. For example, for a significance
      level of 0.10, the corresponding acceptance interval becomes (0.000,0.900).

Output from other statistical software may look somewhat different from the above
output.

Questions:  The F-test can be used to answer the following questions:
   1. Do two samples come from populations with equal standard deviations?
   2. Does a new process, treatment, or test reduce the variability of the current
      process?

Related Techniques:  Quantile-Quantile Plot
                     Bihistogram
                     Chi-Square Test
                     Bartlett's Test
                     Levene Test

Case Study:  Ceramic strength data.

Software:  The F-test for equality of two standard deviations is available in many general
purpose statistical software programs, including Dataplot.


1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.10. Levene Test for Equality of Variances

Purpose: Test for Homogeneity of Variances.

Levene's test ( Levene 1960) is used to test if k samples have equal
variances. Equal variances across samples is called homogeneity of
variance. Some statistical tests, for example the analysis of variance,
assume that variances are equal across groups or samples. The Levene test
can be used to verify that assumption.

Levene's test is an alternative to the Bartlett test. The Levene test is less
sensitive than the Bartlett test to departures from normality. If you have
strong evidence that your data do in fact come from a normal, or nearly
normal, distribution, then Bartlett's test has better performance.

Definition:  The Levene test is defined as:
    H0:  \( \sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2 \)
    Ha:  \( \sigma_i^2 \ne \sigma_j^2 \)   for at least one pair (i,j).
    Test Statistic:  Given a variable Y with sample of size N divided into k
        subgroups, where Ni is the sample size of the ith subgroup,
        the Levene test statistic is defined as:

        \( W = \frac{(N-k)}{(k-1)}\,
               \frac{\sum_{i=1}^{k} N_i (\bar{Z}_{i\cdot} - \bar{Z}_{\cdot\cdot})^2}
                    {\sum_{i=1}^{k} \sum_{j=1}^{N_i} (Z_{ij} - \bar{Z}_{i\cdot})^2} \)

        where Zij can have one of the following three definitions:
        1.  \( Z_{ij} = | Y_{ij} - \bar{Y}_{i\cdot} | \)
            where \( \bar{Y}_{i\cdot} \) is the mean of the ith subgroup.
        2.  \( Z_{ij} = | Y_{ij} - \tilde{Y}_{i\cdot} | \)
            where \( \tilde{Y}_{i\cdot} \) is the median of the ith subgroup.
        3.  \( Z_{ij} = | Y_{ij} - Y'_{i\cdot} | \)
            where \( Y'_{i\cdot} \) is the 10% trimmed mean of the ith subgroup.

        \( \bar{Z}_{i\cdot} \) are the group means of the Zij and \( \bar{Z}_{\cdot\cdot} \) is the
        overall mean of the Zij.

        The three choices for defining Zij determine the robustness
        and power of Levene's test. By robustness, we mean the
        ability of the test to not falsely detect unequal variances
        when the underlying data are not normally distributed and
        the variables are in fact equal. By power, we mean the
        ability of the test to detect unequal variances when the
        variances are in fact unequal.

        Levene's original paper only proposed using the mean.
        Brown and Forsythe (1974) extended Levene's test to use
        either the median or the trimmed mean in addition to the
        mean. They performed Monte Carlo studies that indicated
        that using the trimmed mean performed best when the
        underlying data followed a Cauchy distribution (i.e.,
        heavy-tailed) and the median performed best when the
        underlying data followed a \( \chi^2 \) (i.e., skewed) distribution.
        Using the mean provided the best power for symmetric,
        moderate-tailed, distributions.

        Although the optimal choice depends on the underlying
        distribution, the definition based on the median is
        recommended as the choice that provides good robustness
        against many types of non-normal data while retaining
        good power. If you have knowledge of the underlying
        distribution of the data, this may indicate using one of the
        other choices.

    Significance Level:  α

Critical Region:  The Levene test rejects the hypothesis that the variances are equal if

    \( W > F_{\alpha,\,k-1,\,N-k} \)

where \( F_{\alpha,\,k-1,\,N-k} \) is the upper critical value of the F distribution with k - 1 and
N - k degrees of freedom at a significance level of α.

In the above formulas for the critical regions, the Handbook follows the convention that
\( F_\alpha \) is the upper critical value from the F distribution and \( F_{1-\alpha} \) is the lower
critical value. Note that this is the opposite of some texts and software programs. In
particular, Dataplot uses the opposite convention.

Sample Output:  Dataplot generated the following output for Levene's test using the
GEAR.DAT data set:

     LEVENE F-TEST FOR SHIFT IN VARIATION
            (ASSUMPTION: NORMALITY)

     1. STATISTICS
           NUMBER OF OBSERVATIONS    =       100
           NUMBER OF GROUPS          =        10
           LEVENE F TEST STATISTIC   =  1.705910

     2. FOR LEVENE TEST STATISTIC
           0          % POINT    =  0.
           50         % POINT    =  0.9339308
           75         % POINT    =  1.296365
           90         % POINT    =  1.702053
           95         % POINT    =  1.985595
           99         % POINT    =  2.610880
           99.9       % POINT    =  3.478882

           90.09152   % Point:      1.705910

     3. CONCLUSION (AT THE 5% LEVEL):
           THERE IS NO SHIFT IN VARIATION.
           THUS: HOMOGENEOUS WITH RESPECT TO VARIATION.
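For comparison with the output above, a minimal Python sketch of Levene's test follows
(scipy's implementation is used as an illustration; it is not the Dataplot procedure, and its
numbers may differ slightly from the output above):

    # Hedged sketch: Levene's test for k groups with scipy (illustrative data only).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    groups = [rng.normal(0.0, 1.0, size=10) for _ in range(10)]   # 10 hypothetical subgroups

    # center='mean', 'median', or 'trimmed' roughly corresponds to the three Z_ij
    # definitions discussed above (scipy's default trimming proportion is 5%).
    w_stat, p_value = stats.levene(*groups, center='median')
    print(w_stat, p_value)   # reject equal variances when p_value < alpha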

Interpretation of Sample Output:  We are testing the hypothesis that the group variances are
equal. The output is divided into three sections.
   1. The first section prints the number of observations (N), the number
      of groups (k), and the value of the Levene test statistic.
   2. The second section prints the upper critical value of the F
      distribution corresponding to various significance levels. The value
      in the first column, the confidence level of the test, is equivalent to
      100(1-α). We reject the null hypothesis at that significance level if
      the value of the Levene F test statistic printed in section one is
      greater than the critical value printed in the last column.
   3. The third section prints the conclusion for a 95% test. For a
      different significance level, the appropriate conclusion can be drawn
      from the table printed in section two. For example, for α = 0.10, we
      look at the row for 90% confidence and compare the critical value
      1.702 to the Levene test statistic 1.7059. Since the test statistic is
      greater than the critical value, we reject the null hypothesis at the
      α = 0.10 level.

Output from other statistical software may look somewhat different from
the above output.

Question:  Levene's test can be used to answer the following question:
   ● Is the assumption of equal variances valid?

Related Techniques:  Standard Deviation Plot
                     Box Plot
                     Bartlett Test
                     Chi-Square Test
                     Analysis of Variance

Software:  The Levene test is available in some general purpose statistical software
programs, including Dataplot.


1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.11. Measures of Skewness and Kurtosis

Skewness and Kurtosis:  A fundamental task in many statistical analyses is to characterize the
location and variability of a data set. A further characterization of the
data includes skewness and kurtosis.

Skewness is a measure of symmetry, or more precisely, the lack of
symmetry. A distribution, or data set, is symmetric if it looks the same
to the left and right of the center point.

Kurtosis is a measure of whether the data are peaked or flat relative to a
normal distribution. That is, data sets with high kurtosis tend to have a
distinct peak near the mean, decline rather rapidly, and have heavy tails.
Data sets with low kurtosis tend to have a flat top near the mean rather
than a sharp peak. A uniform distribution would be the extreme case.

The histogram is an effective graphical technique for showing both the
skewness and kurtosis of a data set.

Definition of Skewness:  For univariate data Y1, Y2, ..., YN, the formula for skewness is:

    \( \mathrm{skewness} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^3}{(N-1)\,s^3} \)

where \( \bar{Y} \) is the mean, s is the standard deviation, and N is the number of
data points. The skewness for a normal distribution is zero, and any
symmetric data should have a skewness near zero. Negative values for
the skewness indicate data that are skewed left and positive values for
the skewness indicate data that are skewed right. By skewed left, we
mean that the left tail is heavier than the right tail. Similarly, skewed
right means that the right tail is heavier than the left tail. Some
measurements have a lower bound and are skewed right. For example,
in reliability studies, failure times cannot be negative.


Definition of Kurtosis:  For univariate data Y1, Y2, ..., YN, the formula for kurtosis is:

    \( \mathrm{kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4}{(N-1)\,s^4} \)

where \( \bar{Y} \) is the mean, s is the standard deviation, and N is the number of
data points.

The kurtosis for a standard normal distribution is three. For this reason,
excess kurtosis is defined as

    \( \mathrm{excess\ kurtosis} = \frac{\sum_{i=1}^{N} (Y_i - \bar{Y})^4}{(N-1)\,s^4} - 3 \)

so that the standard normal distribution has a kurtosis of zero. Positive
kurtosis indicates a "peaked" distribution and negative kurtosis indicates
a "flat" distribution.

Examples:  The following example shows histograms for 10,000 random numbers
generated from a normal, a double exponential, a Cauchy, and a Weibull
distribution.

Normal Distribution:  The first histogram is a sample from a normal distribution. The normal
distribution is a symmetric distribution with well-behaved tails. This is
indicated by the skewness of 0.03. The kurtosis of 2.96 is near the
expected value of 3. The histogram verifies the symmetry.

Double Exponential Distribution:  The second histogram is a sample from a double exponential
distribution. The double exponential is a symmetric distribution.
Compared to the normal, it has a stronger peak, more rapid decay, and
heavier tails. That is, we would expect a skewness near zero and a
kurtosis higher than 3. The skewness is 0.06 and the kurtosis is 5.9.

Cauchy Distribution:  The third histogram is a sample from a Cauchy distribution.

For better visual comparison with the other data sets, we restricted the
histogram of the Cauchy distribution to values between -10 and 10. The
full data set for the Cauchy data in fact has a minimum of approximately
-29,000 and a maximum of approximately 89,000.

The Cauchy distribution is a symmetric distribution with heavy tails and
a single peak at the center of the distribution. Since it is symmetric, we
would expect a skewness near zero. Due to the heavier tails, we might
expect the kurtosis to be larger than for a normal distribution. In fact the
skewness is 69.99 and the kurtosis is 6,693. These extremely high
values can be explained by the heavy tails. Just as the mean and
standard deviation can be distorted by extreme values in the tails, so too
can the skewness and kurtosis measures.

Weibull Distribution:  The fourth histogram is a sample from a Weibull distribution with shape
parameter 1.5. The Weibull distribution is a skewed distribution with the
amount of skewness depending on the value of the shape parameter. The
degree of decay as we move away from the center also depends on the
value of the shape parameter. For this data set, the skewness is 1.08 and
the kurtosis is 4.46, which indicates moderate skewness and kurtosis.

Dealing with Skewness and Kurtosis:  Many classical statistical tests and intervals depend on
normality assumptions. Significant skewness and kurtosis clearly indicate that data
are not normal. If a data set exhibits significant skewness or kurtosis (as
indicated by a histogram or the numerical measures), what can we do
about it?

One approach is to apply some type of transformation to try to make the
data normal, or more nearly normal. The Box-Cox transformation is a
useful technique for trying to normalize a data set. In particular, taking
the log or square root of a data set is often useful for data that exhibit
moderate right skewness.

Another approach is to use techniques based on distributions other than
the normal. For example, in reliability studies, the exponential, Weibull,
and lognormal distributions are typically used as a basis for modeling
rather than using the normal distribution. The probability plot


correlation coefficient plot and the probability plot are useful tools for
determining a good distributional model for the data.
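A direct implementation of the skewness and kurtosis formulas given above is short. The
Python sketch below (numpy is an illustrative choice, not a Handbook tool) uses the (N-1)s³
and (N-1)s⁴ denominators stated above; note that library routines such as scipy.stats.skew
and scipy.stats.kurtosis use slightly different conventions:

    # Hedged sketch of the skewness and kurtosis formulas above (not Dataplot code).
    import numpy as np

    def skewness(y):
        y = np.asarray(y, float)
        n, ybar, s = len(y), y.mean(), y.std(ddof=1)
        return np.sum((y - ybar) ** 3) / ((n - 1) * s ** 3)

    def kurtosis(y):
        y = np.asarray(y, float)
        n, ybar, s = len(y), y.mean(), y.std(ddof=1)
        return np.sum((y - ybar) ** 4) / ((n - 1) * s ** 4)   # subtract 3 for excess kurtosis

    rng = np.random.default_rng(0)
    y = rng.normal(size=10000)
    print(skewness(y), kurtosis(y))   # roughly 0 and 3 for normal data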

Software The skewness and kurtosis coefficients are available in most general
purpose statistical software programs, including Dataplot. 1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.12. Autocorrelation
Purpose: Detect Non-Randomness, Time Series Modeling.

The autocorrelation ( Box and Jenkins, 1976) function can be used for
the following two purposes:
   1. To detect non-randomness in data.
   2. To identify an appropriate time series model if the data are not
      random.

Definition:  Given measurements, Y1, Y2, ..., YN at time X1, X2, ..., XN, the lag k
autocorrelation function is defined as

    \( r_k = \frac{\sum_{i=1}^{N-k} (Y_i - \bar{Y})(Y_{i+k} - \bar{Y})}{\sum_{i=1}^{N} (Y_i - \bar{Y})^2} \)

Although the time variable, X, is not used in the formula for
autocorrelation, the assumption is that the observations are equi-spaced.
Autocorrelation is a correlation coefficient. However, instead of
correlation between two different variables, the correlation is between
two values of the same variable at times Xi and Xi+k.

When the autocorrelation is used to detect non-randomness, it is


usually only the first (lag 1) autocorrelation that is of interest. When the
autocorrelation is used to identify an appropriate time series model, the
autocorrelations are usually plotted for many lags.
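The lag-k autocorrelation formula above can be computed directly. A minimal Python sketch
(numpy is an illustrative choice, not a Handbook tool) follows:

    # Hedged sketch of the lag-k autocorrelation formula above (not Dataplot code).
    import numpy as np

    def autocorrelation(y, k):
        y = np.asarray(y, float)
        ybar = y.mean()
        num = np.sum((y[:-k] - ybar) * (y[k:] - ybar)) if k > 0 else np.sum((y - ybar) ** 2)
        den = np.sum((y - ybar) ** 2)
        return num / den

    rng = np.random.default_rng(0)
    y = rng.normal(size=200)
    print(autocorrelation(y, 1))   # near zero for random data; compare the lag-1 value below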


41. -0.45
42. -0.28
Sample Output Dataplot generated the following autocorrelation output using the 43. 0.62
LEW.DAT data set: 44. -0.10
45. -0.55
46. 0.45
47. 0.25
THE LAG-ONE AUTOCORRELATION COEFFICIENT OF THE
48. -0.61
49. 0.14
200 OBSERVATIONS = -0.3073048E+00

THE COMPUTED VALUE OF THE CONSTANT A = -0.30730480E+00

Questions The autocorrelation function can be used to answer the following


questions
lag autocorrelation
0. 1.00 1. Was this sample data set generated from a random process?
1. -0.31 2. Would a non-linear or time series model be a more appropriate
2. -0.74
3. 0.77 model for these data than a simple constant plus error model?
4. 0.21
5. -0.90 Importance Randomness is one of the key assumptions in determining if a
6. 0.38
7. 0.63 univariate statistical process is in control. If the assumptions of
8. -0.77 constant location and scale, randomness, and fixed distribution are
9. -0.12 reasonable, then the univariate process can be modeled as:
10. 0.82
11. -0.40
12. -0.55 where Ei is an error term.
13. 0.73
14. 0.07
15. -0.76 If the randomness assumption is not valid, then a different model needs
16. 0.40 to be used. This will typically be either a time series model or a
17. 0.48 non-linear model (with time as the independent variable).
18. -0.70
19. -0.03
20. 0.70 Related Autocorrelation Plot
21. -0.41 Techniques Run Sequence Plot
22. -0.43
23. 0.67 Lag Plot
24. 0.00 Runs Test
25. -0.66
26. 0.42
27. 0.39 Case Study The heat flow meter data demonstrate the use of autocorrelation in
28. -0.65 determining if the data are from a random process.
29. 0.03
30. 0.63 The beam deflection data demonstrate the use of autocorrelation in
31. -0.42
32. -0.36 developing a non-linear sinusoidal model.
33. 0.64
34. -0.05 Software The autocorrelation capability is available in most general purpose
35. -0.60
36. 0.43
statistical software programs, including Dataplot.
37. 0.32
38. -0.64
39. 0.08
40. 0.58


1. Exploratory Data Analysis


1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.13. Runs Test for Detecting


Non-randomness
Purpose: The runs test ( Bradley, 1968) can be used to decide if a data set is from a
Detect random process.
Non-Randomness
A run is defined as a series of increasing values or a series of decreasing
values. The number of increasing, or decreasing, values is the length of the
run. In a random data set, the probability that the (I+1)th value is larger or
smaller than the Ith value follows a binomial distribution, which forms the
basis of the runs test.

Typical Analysis The first step in the runs test is to compute the sequential differences (Yi -
and Test Yi-1). Positive values indicate an increasing value and negative values
Statistics indicate a decreasing value. A runs test should include information such as
the output shown below from Dataplot for the LEW.DAT data set. The
output shows a table of:
1. runs of length exactly I for I = 1, 2, ..., 10
2. number of runs of length I
3. expected number of runs of length I
4. standard deviation of the number of runs of length I
   5. a z-score where the z-score is defined to be

          \( Z = \frac{Y - \bar{Y}}{s} \)

      where \( \bar{Y} \) is the sample mean and s is the sample standard deviation.


The z-score column is compared to a standard normal table. That is, at the
5% significance level, a z-score with an absolute value greater than 1.96
indicates non-randomness.
There are several alternative formulations of the runs test in the literature. For
example, a series of coin tosses would record a series of heads and tails. A


run of length r is r consecutive heads or r consecutive tails. To use the 3 2.0 6.5750 2.1639 -2.11
Dataplot RUNS command, you could code a sequence of the N = 10 coin 4 0.0 1.3625 1.1186 -1.22
tosses HHHHTTHTHH as 5 0.0 0.2323 0.4777 -0.49
1234323234 6 0.0 0.0337 0.1833 -0.18
7 0.0 0.0043 0.0652 -0.07
that is, a heads is coded as an increasing value and a tails is coded as a
8 0.0 0.0005 0.0218 -0.02
decreasing value.
9 0.0 0.0000 0.0069 -0.01
Another alternative is to code values above the median as positive and values 10 0.0 0.0000 0.0021 0.00
below the median as negative. There are other formulations as well. All of
them can be converted to the Dataplot formulation. Just remember that it
ultimately reduces to 2 choices. To use the Dataplot runs test, simply code RUNS DOWN
one choice as an increasing value and the other as a decreasing value as in the
heads/tails example above. If you are using other statistical software, you STATISTIC = NUMBER OF RUNS DOWN
need to check the conventions used by that program. OF LENGTH EXACTLY I
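The first two steps of the analysis described above (sequential differences and the tally of run
lengths) can be sketched in a few lines of Python. The expected counts, standard deviations,
and z-scores in the Dataplot output come from published formulas that are not reproduced
here; numpy and itertools are illustrative choices, not Handbook tools:

    # Hedged sketch: tally runs up and runs down from sequential differences.
    import numpy as np
    from itertools import groupby

    def run_lengths(y):
        signs = np.sign(np.diff(np.asarray(y, float)))        # +1 increasing, -1 decreasing
        runs = [(s, sum(1 for _ in g)) for s, g in groupby(signs) if s != 0]
        up = [length for s, length in runs if s > 0]
        down = [length for s, length in runs if s < 0]
        return up, down

    up, down = run_lengths([1, 2, 3, 4, 3, 2, 3, 2, 3, 4])    # the coded coin tosses above
    print("runs up:", up, "runs down:", down)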

Sample Output Dataplot generated the following runs test output using the LEW.DAT data I STAT EXP(STAT) SD(STAT) Z
set:
1 25.0 41.7083 6.4900 -2.57
2 35.0 18.2167 3.3444 5.02
3 0.0 5.2125 2.0355 -2.56
RUNS UP 4 0.0 1.1302 1.0286 -1.10
5 0.0 0.1986 0.4424 -0.45
STATISTIC = NUMBER OF RUNS UP 6 0.0 0.0294 0.1714 -0.17
OF LENGTH EXACTLY I 7 0.0 0.0038 0.0615 -0.06
8 0.0 0.0004 0.0207 -0.02
I STAT EXP(STAT) SD(STAT) Z 9 0.0 0.0000 0.0066 -0.01
10 0.0 0.0000 0.0020 0.00
1 18.0 41.7083 6.4900 -3.65
2 40.0 18.2167 3.3444 6.51
3 2.0 5.2125 2.0355 -1.58 STATISTIC = NUMBER OF RUNS DOWN
4 0.0 1.1302 1.0286 -1.10 OF LENGTH I OR MORE
5 0.0 0.1986 0.4424 -0.45
6 0.0 0.0294 0.1714 -0.17
7 0.0 0.0038 0.0615 -0.06 I STAT EXP(STAT) SD(STAT) Z
8 0.0 0.0004 0.0207 -0.02
9 0.0 0.0000 0.0066 -0.01 1 60.0 66.5000 4.1972 -1.55
10 0.0 0.0000 0.0020 0.00 2 35.0 24.7917 2.8083 3.63
3 0.0 6.5750 2.1639 -3.04
4 0.0 1.3625 1.1186 -1.22
STATISTIC = NUMBER OF RUNS UP 5 0.0 0.2323 0.4777 -0.49
OF LENGTH I OR MORE 6 0.0 0.0337 0.1833 -0.18
7 0.0 0.0043 0.0652 -0.07
I STAT EXP(STAT) SD(STAT) Z 8 0.0 0.0005 0.0218 -0.02
9 0.0 0.0000 0.0069 -0.01
1 60.0 66.5000 4.1972 -1.55 10 0.0 0.0000 0.0021 0.00
2 42.0 24.7917 2.8083 6.13


RUNS TOTAL = RUNS UP + RUNS DOWN Interpretation of Scanning the last column labeled "Z", we note that most of the z-scores for
Sample Output run lengths 1, 2, and 3 have an absolute value greater than 1.96. This is strong
STATISTIC = NUMBER OF RUNS TOTAL evidence that these data are in fact not random.
OF LENGTH EXACTLY I
Output from other statistical software may look somewhat different from the
I STAT EXP(STAT) SD(STAT) Z above output.

1 43.0 83.4167 9.1783 -4.40 Question The runs test can be used to answer the following question:
2 75.0 36.4333 4.7298 8.15 ● Were these sample data generated from a random process?
3 2.0 10.4250 2.8786 -2.93
4 0.0 2.2603 1.4547 -1.55 Importance Randomness is one of the key assumptions in determining if a univariate
5 0.0 0.3973 0.6257 -0.63
statistical process is in control. If the assumptions of constant location and
6 0.0 0.0589 0.2424 -0.24
scale, randomness, and fixed distribution are reasonable, then the univariate
7 0.0 0.0076 0.0869 -0.09
process can be modeled as:
8 0.0 0.0009 0.0293 -0.03
9 0.0 0.0001 0.0093 -0.01
10 0.0 0.0000 0.0028 0.00 where Ei is an error term.

If the randomness assumption is not valid, then a different model needs to be


STATISTIC = NUMBER OF RUNS TOTAL used. This will typically be either a time series model or a non-linear model
OF LENGTH I OR MORE (with time as the independent variable).

I STAT EXP(STAT) SD(STAT) Z Related Autocorrelation


Techniques Run Sequence Plot
1 120.0 133.0000 5.9358 -2.19 Lag Plot
2 77.0 49.5833 3.9716 6.90
3 2.0 13.1500 3.0602 -3.64
4 0.0 2.7250 1.5820 -1.72 Case Study Heat flow meter data
5 0.0 0.4647 0.6756 -0.69
6 0.0 0.0674 0.2592 -0.26 Software Most general purpose statistical software programs, including Dataplot,
7 0.0 0.0085 0.0923 -0.09 support a runs test.
8 0.0 0.0010 0.0309 -0.03
9 0.0 0.0001 0.0098 -0.01
10 0.0 0.0000 0.0030 0.00

LENGTH OF THE LONGEST RUN UP = 3


LENGTH OF THE LONGEST RUN DOWN = 2
LENGTH OF THE LONGEST RUN UP OR DOWN = 3

NUMBER OF POSITIVE DIFFERENCES = 104


NUMBER OF NEGATIVE DIFFERENCES = 95
NUMBER OF ZERO DIFFERENCES = 0


1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.14. Anderson-Darling Test

Purpose: Test for Distributional Adequacy.

The Anderson-Darling test (Stephens, 1974) is used to test if a sample of data
came from a population with a specific distribution. It is a modification of the
Kolmogorov-Smirnov (K-S) test and gives more weight to the tails than does
the K-S test. The K-S test is distribution free in the sense that the critical values
do not depend on the specific distribution being tested. The Anderson-Darling
test makes use of the specific distribution in calculating critical values. This
has the advantage of allowing a more sensitive test and the disadvantage that
critical values must be calculated for each distribution. Currently, tables of
critical values are available for the normal, lognormal, exponential, Weibull,
extreme value type I, and logistic distributions. We do not provide the tables of
critical values in this Handbook (see Stephens 1974, 1976, 1977, and 1979)
since this test is usually applied with a statistical software program that will
print the relevant critical values.

The Anderson-Darling test is an alternative to the chi-square and
Kolmogorov-Smirnov goodness-of-fit tests.

Definition:  The Anderson-Darling test is defined as:
    H0:  The data follow a specified distribution.
    Ha:  The data do not follow the specified distribution
    Test Statistic:  The Anderson-Darling test statistic is defined as

        \( A^2 = -N - S \)

        where

        \( S = \sum_{i=1}^{N} \frac{2i-1}{N}\left[\ln F(Y_i) + \ln\left(1 - F(Y_{N+1-i})\right)\right] \)

        F is the cumulative distribution function of the specified
        distribution. Note that the Yi are the ordered data.
    Significance Level:  α
    Critical Region:  The critical values for the Anderson-Darling test are dependent
        on the specific distribution that is being tested. Tabulated values
        and formulas have been published (Stephens, 1974, 1976, 1977,
        1979) for a few specific distributions (normal, lognormal,
        exponential, Weibull, logistic, extreme value type 1). The test is
        a one-sided test and the hypothesis that the distribution is of a
        specific form is rejected if the test statistic, A, is greater than the
        critical value.

        Note that for a given distribution, the Anderson-Darling statistic
        may be multiplied by a constant (which usually depends on the
        sample size, n). These constants are given in the various papers
        by Stephens. In the sample output below, this is the "adjusted
        Anderson-Darling" statistic. This is what should be compared
        against the critical values. Also, be aware that different constants
        (and therefore critical values) have been published. You just
        need to be aware of what constant was used for a given set of
        critical values (the needed constant is typically given with the
        critical values).

Sample Output:  Dataplot generated the following output for the Anderson-Darling test. 1,000
random numbers were generated for a normal, double exponential, Cauchy,
and lognormal distribution. In all four cases, the Anderson-Darling test was
applied to test for a normal distribution. When the data were generated using a
normal distribution, the test statistic was small and the hypothesis was
accepted. When the data were generated using the double exponential, Cauchy,
and lognormal distributions, the statistics were significant, and the hypothesis
of an underlying normal distribution was rejected at significance levels of 0.10,
0.05, and 0.01.

The normal random numbers were stored in the variable Y1, the double
exponential random numbers were stored in the variable Y2, the Cauchy
random numbers were stored in the variable Y3, and the lognormal random
numbers were stored in the variable Y4.

***************************************
** anderson darling normal test y1 **
***************************************

ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION

1. STATISTICS:
NUMBER OF OBSERVATIONS = 1000
MEAN = 0.4359940E-02


STANDARD DEVIATION = 1.001816


2. CRITICAL VALUES:
ANDERSON-DARLING TEST STATISTIC VALUE = 0.2565918 90 % POINT = 0.6560000
ADJUSTED TEST STATISTIC VALUE = 0.2576117 95 % POINT = 0.7870000
97.5 % POINT = 0.9180000
2. CRITICAL VALUES: 99 % POINT = 1.092000
90 % POINT = 0.6560000
95 % POINT = 0.7870000 3. CONCLUSION (AT THE 5% LEVEL):
97.5 % POINT = 0.9180000 THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION.
99 % POINT = 1.092000

3. CONCLUSION (AT THE 5% LEVEL): ***************************************


THE DATA DO COME FROM A NORMAL DISTRIBUTION. ** anderson darling normal test y4 **
***************************************

***************************************
** anderson darling normal test y2 ** ANDERSON-DARLING 1-SAMPLE TEST
*************************************** THAT THE DATA CAME FROM A NORMAL DISTRIBUTION

1. STATISTICS:
ANDERSON-DARLING 1-SAMPLE TEST NUMBER OF OBSERVATIONS = 1000
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION MEAN = 1.518372
STANDARD DEVIATION = 1.719969
1. STATISTICS:
NUMBER OF OBSERVATIONS = 1000 ANDERSON-DARLING TEST STATISTIC VALUE = 83.06335
MEAN = 0.2034888E-01 ADJUSTED TEST STATISTIC VALUE = 83.39352
STANDARD DEVIATION = 1.321627
2. CRITICAL VALUES:
ANDERSON-DARLING TEST STATISTIC VALUE = 5.826050 90 % POINT = 0.6560000
ADJUSTED TEST STATISTIC VALUE = 5.849208 95 % POINT = 0.7870000
97.5 % POINT = 0.9180000
2. CRITICAL VALUES: 99 % POINT = 1.092000
90 % POINT = 0.6560000
95 % POINT = 0.7870000 3. CONCLUSION (AT THE 5% LEVEL):
97.5 % POINT = 0.9180000 THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION.
99 % POINT = 1.092000

3. CONCLUSION (AT THE 5% LEVEL):


THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION. Interpretation The output is divided into three sections.
of the Sample
1. The first section prints the number of observations and estimates for the
Output
*************************************** location and scale parameters.
** anderson darling normal test y3 ** 2. The second section prints the upper critical value for the
***************************************
Anderson-Darling test statistic distribution corresponding to various
significance levels. The value in the first column, the confidence level of
ANDERSON-DARLING 1-SAMPLE TEST the test, is equivalent to 100(1- ). We reject the null hypothesis at that
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION significance level if the value of the Anderson-Darling test statistic
1. STATISTICS:
printed in section one is greater than the critical value printed in the last
NUMBER OF OBSERVATIONS = 1000 column.
MEAN = 1.503854 3. The third section prints the conclusion for a 95% test. For a different
STANDARD DEVIATION = 35.13059
significance level, the appropriate conclusion can be drawn from the
ANDERSON-DARLING TEST STATISTIC VALUE = 287.6429 table printed in section two. For example, for = 0.10, we look at the
ADJUSTED TEST STATISTIC VALUE = 288.7863 row for 90% confidence and compare the critical value 1.062 to the


Anderson-Darling test statistic (for the normal data) 0.256. Since the test
statistic is less than the critical value, we do not reject the null
hypothesis at the = 0.10 level.
As we would hope, the Anderson-Darling test accepts the hypothesis of
normality for the normal random numbers and rejects it for the 3 non-normal
cases.
1. Exploratory Data Analysis
The output from other statistical software programs may differ somewhat from 1.3. EDA Techniques
the output above. 1.3.5. Quantitative Techniques

Questions The Anderson-Darling test can be used to answer the following questions: 1.3.5.15. Chi-Square Goodness-of-Fit Test
● Are the data from a normal distribution?

● Are the data from a log-normal distribution? Purpose: The chi-square test (Snedecor and Cochran, 1989) is used to test if a sample of data came
● Are the data from a Weibull distribution?
Test for from a population with a specific distribution.
distributional
● Are the data from an exponential distribution? adequacy An attractive feature of the chi-square goodness-of-fit test is that it can be applied to any
● Are the data from a logistic distribution?
univariate distribution for which you can calculate the cumulative distribution function.
The chi-square goodness-of-fit test is applied to binned data (i.e., data put into classes).
This is actually not a restriction since for non-binned data you can simply calculate a
Importance Many statistical tests and procedures are based on specific distributional
histogram or frequency table before generating the chi-square test. However, the value of
assumptions. The assumption of normality is particularly common in classical the chi-square test statistic are dependent on how the data is binned. Another
statistical tests. Much reliability modeling is based on the assumption that the disadvantage of the chi-square test is that it requires a sufficient sample size in order for
data follow a Weibull distribution. the chi-square approximation to be valid.
There are many non-parametric and robust techniques that do not make strong The chi-square test is an alternative to the Anderson-Darling and Kolmogorov-Smirnov
distributional assumptions. However, techniques based on specific goodness-of-fit tests. The chi-square goodness-of-fit test can be applied to discrete
distributional assumptions are in general more powerful than non-parametric distributions such as the binomial and the Poisson. The Kolmogorov-Smirnov and
and robust techniques. Therefore, if the distributional assumptions can be Anderson-Darling tests are restricted to continuous distributions.
validated, they are generally preferred.
Additional discussion of the chi-square goodness-of-fit test is contained in the product
and process comparisons chapter (chapter 7).
Related Chi-Square goodness-of-fit Test
Techniques Kolmogorov-Smirnov Test
Definition The chi-square test is defined for the hypothesis:
Shapiro-Wilk Normality Test
H0: The data follow a specified distribution.
Probability Plot
Ha: The data do not follow the specified distribution.
Probability Plot Correlation Coefficient Plot

Case Study Airplane glass failure time data.

Software The Anderson-Darling goodness-of-fit test is available in some general purpose


statistical software programs, including Dataplot.
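For example, scipy provides an Anderson-Darling normality test. The sketch below is an
illustration only (it is not the Dataplot run shown above, and scipy applies its own
small-sample adjustment, so the numbers may differ slightly):

    # Hedged sketch: Anderson-Darling normality test with scipy (illustrative data only).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y = rng.normal(size=1000)

    result = stats.anderson(y, dist='norm')
    print(result.statistic)            # compare against the critical values below
    for cv, sl in zip(result.critical_values, result.significance_level):
        print(f"{sl:5.1f}% level: critical value {cv:.3f}")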


Test Statistic:  For the chi-square goodness-of-fit computation, the data are divided
into k bins and the test statistic is defined as

    \( \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \)

where \( O_i \) is the observed frequency for bin i and \( E_i \) is the expected
frequency for bin i. The expected frequency is calculated by

    \( E_i = N \left( F(Y_u) - F(Y_l) \right) \)

where F is the cumulative distribution function for the distribution
being tested, Yu is the upper limit for class i, Yl is the lower limit for
class i, and N is the sample size.

This test is sensitive to the choice of bins. There is no optimal choice
for the bin width (since the optimal bin width depends on the
distribution). Most reasonable choices should produce similar, but
not identical, results. Dataplot uses 0.3*s, where s is the sample
standard deviation, for the class width. The lower and upper bins are
at the sample mean plus and minus 6.0*s, respectively. For the
chi-square approximation to be valid, the expected frequency should
be at least 5. This test is not valid for small samples, and if some of
the counts are less than five, you may need to combine some bins in
the tails.

Significance Level:  α.

Critical Region:  The test statistic follows, approximately, a chi-square distribution
with (k - c) degrees of freedom where k is the number of non-empty
cells and c = the number of estimated parameters (including location
and scale parameters and shape parameters) for the distribution + 1.
For example, for a 3-parameter Weibull distribution, c = 4.

Therefore, the hypothesis that the data are from a population with
the specified distribution is rejected if

    \( \chi^2 > \chi^2_{\alpha,\,k-c} \)

where \( \chi^2_{\alpha,\,k-c} \) is the chi-square percent point function with k - c
degrees of freedom and a significance level of α.

In the above formulas for the critical regions, the Handbook follows
the convention that \( \chi^2_\alpha \) is the upper critical value from the
chi-square distribution and \( \chi^2_{1-\alpha} \) is the lower critical value from the
chi-square distribution. Note that this is the opposite of what is used
in some texts and software programs. In particular, Dataplot uses the
opposite convention.

Sample Output:  Dataplot generated the following output for the chi-square test where 1,000 random
numbers were generated for the normal, double exponential, t with 3 degrees of freedom,
and lognormal distributions. In all cases, the chi-square test was applied to test for a
normal distribution. The test statistics show the characteristics of the test; when the data
are from a normal distribution, the test statistic is small and the hypothesis is accepted;
when the data are from the double exponential, t, and lognormal distributions, the
statistics are significant and the hypothesis of an underlying normal distribution is
rejected at significance levels of 0.10, 0.05, and 0.01.

The normal random numbers were stored in the variable Y1, the double exponential
random numbers were stored in the variable Y2, the t random numbers were stored in the
variable Y3, and the lognormal random numbers were stored in the variable Y4.

*************************************************
** normal chi-square goodness of fit test y1 **
*************************************************

CHI-SQUARED GOODNESS-OF-FIT TEST

NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA
DISTRIBUTION: NORMAL

SAMPLE:
NUMBER OF OBSERVATIONS = 1000
NUMBER OF NON-EMPTY CELLS = 24
NUMBER OF PARAMETERS USED = 0

TEST:
CHI-SQUARED TEST STATISTIC = 17.52155
DEGREES OF FREEDOM = 23
CHI-SQUARED CDF VALUE = 0.217101

ALPHA LEVEL     CUTOFF      CONCLUSION
10%             32.00690    ACCEPT H0
5%              35.17246    ACCEPT H0
1%              41.63840    ACCEPT H0

CELL NUMBER, BIN MIDPOINT, OBSERVED FREQUENCY,
AND EXPECTED FREQUENCY
WRITTEN TO FILE DPST1F.DAT

*************************************************
** normal chi-square goodness of fit test y2 **
*************************************************

CHI-SQUARED GOODNESS-OF-FIT TEST 1% 42.97982 REJECT H0

NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA CELL NUMBER, BIN MIDPOINT, OBSERVED FREQUENCY,
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA AND EXPECTED FREQUENCY
DISTRIBUTION: NORMAL WRITTEN TO FILE DPST1F.DAT

SAMPLE: *************************************************
NUMBER OF OBSERVATIONS = 1000 ** normal chi-square goodness of fit test y4 **
NUMBER OF NON-EMPTY CELLS = 26 *************************************************
NUMBER OF PARAMETERS USED = 0

TEST: CHI-SQUARED GOODNESS-OF-FIT TEST


CHI-SQUARED TEST STATISTIC = 2030.784
DEGREES OF FREEDOM = 25 NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA
CHI-SQUARED CDF VALUE = 1.000000 ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA
DISTRIBUTION: NORMAL
ALPHA LEVEL CUTOFF CONCLUSION
10% 34.38158 REJECT H0 SAMPLE:
5% 37.65248 REJECT H0 NUMBER OF OBSERVATIONS = 1000
1% 44.31411 REJECT H0 NUMBER OF NON-EMPTY CELLS = 10
NUMBER OF PARAMETERS USED = 0
CELL NUMBER, BIN MIDPOINT, OBSERVED FREQUENCY,
AND EXPECTED FREQUENCY TEST:
WRITTEN TO FILE DPST1F.DAT CHI-SQUARED TEST STATISTIC = 1162098.
DEGREES OF FREEDOM = 9
************************************************* CHI-SQUARED CDF VALUE = 1.000000
** normal chi-square goodness of fit test y3 **
************************************************* ALPHA LEVEL CUTOFF CONCLUSION
10% 14.68366 REJECT H0
5% 16.91898 REJECT H0
CHI-SQUARED GOODNESS-OF-FIT TEST 1% 21.66600 REJECT H0

NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA CELL NUMBER, BIN MIDPOINT, OBSERVED FREQUENCY,
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA AND EXPECTED FREQUENCY
DISTRIBUTION: NORMAL WRITTEN TO FILE DPST1F.DAT

SAMPLE: As we would hope, the chi-square test does not reject the normality hypothesis for the
NUMBER OF OBSERVATIONS = 1000 normal distribution data set and rejects it for the three non-normal cases.
NUMBER OF NON-EMPTY CELLS = 25
NUMBER OF PARAMETERS USED = 0 Questions The chi-square test can be used to answer the following types of questions:
● Are the data from a normal distribution?
TEST:
CHI-SQUARED TEST STATISTIC = 103165.4 ● Are the data from a log-normal distribution?
DEGREES OF FREEDOM = 24 ● Are the data from a Weibull distribution?
CHI-SQUARED CDF VALUE = 1.000000 ● Are the data from an exponential distribution?

● Are the data from a logistic distribution?


ALPHA LEVEL CUTOFF CONCLUSION
10% 33.19624 REJECT H0 ● Are the data from a binomial distribution?
5% 36.41503 REJECT H0
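A compact Python sketch of the binned computation defined earlier in this section follows. It is
a simplified illustration, not the Dataplot procedure: the bin edges and the handling of sparse
bins are assumptions of this example, and numpy/scipy are illustrative tools:

    # Hedged sketch of the chi-square goodness-of-fit test against a fitted normal.
    import numpy as np
    from scipy import stats

    def chisq_gof_normal(y, nbins=24, alpha=0.05):
        y = np.asarray(y, float)
        n, mu, s = len(y), y.mean(), y.std(ddof=1)
        edges = np.linspace(mu - 6 * s, mu + 6 * s, nbins + 1)          # bins spanning mean +/- 6s
        observed, _ = np.histogram(y, bins=edges)
        expected = n * np.diff(stats.norm.cdf(edges, loc=mu, scale=s))  # E_i = N(F(Y_u) - F(Y_l))
        keep = expected > 0                                             # drop empty expected cells
        chisq = np.sum((observed[keep] - expected[keep]) ** 2 / expected[keep])
        dof = keep.sum() - 3                    # k - c, with c = 2 estimated parameters + 1
        cutoff = stats.chi2.ppf(1 - alpha, dof)
        return chisq, dof, chisq > cutoff

    rng = np.random.default_rng(0)
    print(chisq_gof_normal(rng.normal(size=1000)))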


Importance:  Many statistical tests and procedures are based on specific distributional assumptions.
The assumption of normality is particularly common in classical statistical tests. Much
reliability modeling is based on the assumption that the distribution of the data follows a
Weibull distribution.

There are many non-parametric and robust techniques that are not based on strong
distributional assumptions. By non-parametric, we mean a technique, such as the sign
test, that is not based on a specific distributional assumption. By robust, we mean a
statistical technique that performs well under a wide range of distributional assumptions.
However, techniques based on specific distributional assumptions are in general more
powerful than these non-parametric and robust techniques. By power, we mean the ability
to detect a difference when that difference actually exists. Therefore, if the distributional
assumption can be confirmed, the parametric techniques are generally preferred.

If you are using a technique that makes a normality (or some other type of distributional)
assumption, it is important to confirm that this assumption is in fact justified. If it is, the
more powerful parametric techniques can be used. If the distributional assumption is not
justified, a non-parametric or robust technique may be required.

Related Techniques:  Anderson-Darling Goodness-of-Fit Test
                     Kolmogorov-Smirnov Test
                     Shapiro-Wilk Normality Test
                     Probability Plots
                     Probability Plot Correlation Coefficient Plot

Case Study:  Airplane glass failure times data.

Software:  Some general purpose statistical software programs, including Dataplot, provide a
chi-square goodness-of-fit test for at least some of the common distributions.


1. Exploratory Data Analysis
1.3. EDA Techniques
1.3.5. Quantitative Techniques

1.3.5.16. Kolmogorov-Smirnov Goodness-of-Fit Test

Purpose: Test for Distributional Adequacy.

The Kolmogorov-Smirnov test (Chakravart, Laha, and Roy, 1967) is used to
decide if a sample comes from a population with a specific distribution.

The Kolmogorov-Smirnov (K-S) test is based on the empirical distribution
function (ECDF). Given N ordered data points Y1, Y2, ..., YN, the ECDF is
defined as

    \( E_N = \frac{n(i)}{N} \)

where n(i) is the number of points less than Yi and the Yi are ordered from
smallest to largest value. This is a step function that increases by 1/N at the value
of each ordered data point.

The graph below is a plot of the empirical distribution function with a normal
cumulative distribution function for 100 normal random numbers. The K-S test is
based on the maximum distance between these two curves.


Characteristics and Limitations of the K-S Test:  An attractive feature of this test is that the
distribution of the K-S test statistic itself does not depend on the underlying cumulative
distribution function being tested. Another advantage is that it is an exact test (the
chi-square goodness-of-fit test depends on an adequate sample size for the approximations
to be valid). Despite these advantages, the K-S test has several important limitations:
   1. It only applies to continuous distributions.
   2. It tends to be more sensitive near the center of the distribution than at the tails.
   3. Perhaps the most serious limitation is that the distribution must be fully
      specified. That is, if location, scale, and shape parameters are estimated
      from the data, the critical region of the K-S test is no longer valid. It
      typically must be determined by simulation.

Due to limitations 2 and 3 above, many analysts prefer to use the
Anderson-Darling goodness-of-fit test. However, the Anderson-Darling test is
only available for a few specific distributions.

Definition:  The Kolmogorov-Smirnov test is defined by:
    H0:  The data follow a specified distribution
    Ha:  The data do not follow the specified distribution
    Test Statistic:  The Kolmogorov-Smirnov test statistic is defined as

        \( D = \max_{1 \le i \le N} \left( F(Y_i) - \frac{i-1}{N},\; \frac{i}{N} - F(Y_i) \right) \)

        where F is the theoretical cumulative distribution of the
        distribution being tested which must be a continuous
        distribution (i.e., no discrete distributions such as the
        binomial or Poisson), and it must be fully specified (i.e., the
        location, scale, and shape parameters cannot be estimated
        from the data).
    Significance Level:  α.
    Critical Values:  The hypothesis regarding the distributional form is rejected if
        the test statistic, D, is greater than the critical value obtained
        from a table. There are several variations of these tables in
        the literature that use somewhat different scalings for the
        K-S test statistic and critical regions. These alternative
        formulations should be equivalent, but it is necessary to
        ensure that the test statistic is calculated in a way that is
        consistent with how the critical values were tabulated.

        We do not provide the K-S tables in the Handbook since
        software programs that perform a K-S test will provide the
        relevant critical values.

Sample Output:  Dataplot generated the following output for the Kolmogorov-Smirnov test where
1,000 random numbers were generated for a normal, double exponential, t with 3
degrees of freedom, and lognormal distributions. In all cases, the
Kolmogorov-Smirnov test was applied to test for a normal distribution. The
Kolmogorov-Smirnov test accepts the normality hypothesis for the case of normal
data and rejects it for the double exponential, t, and lognormal data with the
exception of the double exponential data being significant at the 0.01 significance
level.
The normal random numbers were stored in the variable Y1, the double
exponential random numbers were stored in the variable Y2, the t random
numbers were stored in the variable Y3, and the lognormal random numbers were
stored in the variable Y4.

*********************************************************
** normal Kolmogorov-Smirnov goodness of fit test y1 **
*********************************************************


KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST *********************************************************

NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA


ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST
DISTRIBUTION: NORMAL
NUMBER OF OBSERVATIONS = 1000 NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA
TEST: DISTRIBUTION: NORMAL
KOLMOGOROV-SMIRNOV TEST STATISTIC = 0.2414924E-01 NUMBER OF OBSERVATIONS = 1000

ALPHA LEVEL CUTOFF CONCLUSION TEST:


10% 0.03858 ACCEPT H0 KOLMOGOROV-SMIRNOV TEST STATISTIC = 0.5354889
5% 0.04301 ACCEPT H0
1% 0.05155 ACCEPT H0 ALPHA LEVEL CUTOFF CONCLUSION
10% 0.03858 REJECT H0
********************************************************* 5% 0.04301 REJECT H0
** normal Kolmogorov-Smirnov goodness of fit test y2 ** 1% 0.05155 REJECT H0
*********************************************************

KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST Questions The Kolmogorov-Smirnov test can be used to answer the following types of
questions:
NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA ● Are the data from a normal distribution?
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA
DISTRIBUTION: NORMAL ● Are the data from a log-normal distribution?
NUMBER OF OBSERVATIONS = 1000 ● Are the data from a Weibull distribution?

TEST: ● Are the data from an exponential distribution?


KOLMOGOROV-SMIRNOV TEST STATISTIC = 0.5140864E-01
● Are the data from a logistic distribution?

ALPHA LEVEL CUTOFF CONCLUSION


10% 0.03858 REJECT H0 Importance Many statistical tests and procedures are based on specific distributional
5% 0.04301 REJECT H0 assumptions. The assumption of normality is particularly common in classical
1% 0.05155 ACCEPT H0
statistical tests. Much reliability modeling is based on the assumption that the
********************************************************* data follow a Weibull distribution.
** normal Kolmogorov-Smirnov goodness of fit test y3 **
********************************************************* There are many non-parametric and robust techniques that are not based on strong
distributional assumptions. By non-parametric, we mean a technique, such as the
sign test, that is not based on a specific distributional assumption. By robust, we
KOLMOGOROV-SMIRNOV GOODNESS-OF-FIT TEST
mean a statistical technique that performs well under a wide range of
NULL HYPOTHESIS H0: DISTRIBUTION FITS THE DATA distributional assumptions. However, techniques based on specific distributional
ALTERNATE HYPOTHESIS HA: DISTRIBUTION DOES NOT FIT THE DATA assumptions are in general more powerful than these non-parametric and robust
DISTRIBUTION: NORMAL
NUMBER OF OBSERVATIONS = 1000
techniques. By power, we mean the ability to detect a difference when that
difference actually exists. Therefore, if the distributional assumptions can be
TEST: confirmed, the parametric techniques are generally preferred.
KOLMOGOROV-SMIRNOV TEST STATISTIC = 0.6119353E-01
If you are using a technique that makes a normality (or some other type of
ALPHA LEVEL CUTOFF CONCLUSION distributional) assumption, it is important to confirm that this assumption is in
10% 0.03858 REJECT H0 fact justified. If it is, the more powerful parametric techniques can be used. If the
5% 0.04301 REJECT H0
1% 0.05155 REJECT H0 distributional assumption is not justified, using a non-parametric or robust
technique may be required.
*********************************************************
** normal Kolmogorov-Smirnov goodness of fit test y4 **


Related Anderson-Darling goodness-of-fit Test


Techniques Chi-Square goodness-of-fit Test
Shapiro-Wilk Normality Test
Probability Plots
Probability Plot Correlation Coefficient Plot

Case Study Airplane glass failure times data 1. Exploratory Data Analysis
1.3. EDA Techniques
Software Some general purpose statistical software programs, including Dataplot, support 1.3.5. Quantitative Techniques
the Kolmogorov-Smirnov goodness-of-fit test, at least for some of the more
common distributions.
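As a rough, minimal sketch of the same idea outside of Dataplot (assuming Python with
numpy and scipy is available, and using simulated data rather than the Handbook's y1-y4
series), a Kolmogorov-Smirnov normality test can be run as follows:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    y_normal = rng.normal(loc=0.0, scale=1.0, size=1000)    # normal data
    y_heavy  = rng.standard_t(df=3, size=1000)               # heavier-tailed data

    for name, y in [("normal sample", y_normal), ("t(3) sample", y_heavy)]:
        # Note: estimating the mean and standard deviation from the same data,
        # as done here, makes the nominal p-value only approximate; strictly,
        # the K-S test assumes a fully specified reference distribution.
        d, p = stats.kstest(y, "norm", args=(y.mean(), y.std(ddof=1)))
        print(f"{name}: KS statistic = {d:.5f}, approximate p-value = {p:.4f}")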
1.3.5.17. Grubbs' Test for Outliers
Purpose: Detection of Outliers

Grubbs' test (Grubbs 1969 and Stefansky 1972) is used to detect outliers in a univariate
data set. It is based on the assumption of normality. That is, you should first verify that
your data can be reasonably approximated by a normal distribution before applying the
Grubbs' test.

Grubbs' test detects one outlier at a time. This outlier is expunged from the dataset and
the test is iterated until no outliers are detected. However, multiple iterations change the
probabilities of detection, and the test should not be used for sample sizes of six or less
since it frequently tags most of the points as outliers.

Grubbs' test is also known as the maximum normed residual test.

Definition Grubbs' test is defined for the hypothesis:


H0: There are no outliers in the data set
Ha: There is at least one outlier in the data set
Test Statistic:

The Grubbs' test statistic is defined as:

     G = max |Y_i - Ybar| / s

where Ybar and s are the sample mean and standard deviation. The Grubbs' test statistic
is the largest absolute deviation from the sample mean in units of the sample standard
deviation.

Significance Level:   α


Critical Region:

The hypothesis of no outliers is rejected if

     G > ((N-1)/sqrt(N)) * sqrt( t^2(α/(2N), N-2) / (N - 2 + t^2(α/(2N), N-2)) )

where t(α/(2N), N-2) is the critical value of the t-distribution with (N-2) degrees of
freedom and a significance level of α/(2N).

In the above formulas for the critical regions, the Handbook follows the convention that
t(α) is the upper critical value from the t-distribution and t(1-α) is the lower critical
value from the t-distribution. Note that this is the opposite of what is used in some texts
and software programs. In particular, Dataplot uses the opposite convention.

Sample Output

Dataplot generated the following output for the ZARR13.DAT data set showing that
Grubbs' test finds no outliers in the dataset:

     *********************
     **  grubbs test y  **
     *********************

          GRUBBS TEST FOR OUTLIERS
          (ASSUMPTION: NORMALITY)

     1. STATISTICS:
        NUMBER OF OBSERVATIONS   =      195
        MINIMUM                  =    9.196848
        MEAN                     =    9.261460
        MAXIMUM                  =    9.327973
        STANDARD DEVIATION       =    0.2278881E-01

        GRUBBS TEST STATISTIC    =    2.918673

     2. PERCENT POINTS OF THE REFERENCE DISTRIBUTION
        FOR GRUBBS TEST STATISTIC
           0  % POINT    =    0.
          50  % POINT    =    2.984294
          75  % POINT    =    3.181226
          90  % POINT    =    3.424672
          95  % POINT    =    3.597898
          99  % POINT    =    3.970215

          37.59665 % POINT:    2.918673

     3. CONCLUSION (AT THE 5% LEVEL):
        THERE ARE NO OUTLIERS.

Interpretation of Sample Output

The output is divided into three sections.
     1. The first section prints the sample statistics used in the computation of the
        Grubbs' test and the value of the Grubbs' test statistic.
     2. The second section prints the upper critical value for the Grubbs' test statistic
        distribution corresponding to various significance levels. The value in the first
        column, the confidence level of the test, is equivalent to 100(1-α). We reject the
        null hypothesis at that significance level if the value of the Grubbs' test statistic
        printed in section one is greater than the critical value printed in the last column.
     3. The third section prints the conclusion for a 95% test. For a different significance
        level, the appropriate conclusion can be drawn from the table printed in section
        two. For example, for α = 0.10, we look at the row for 90% confidence and
        compare the critical value 3.42 to the Grubbs' test statistic 2.92. Since the test
        statistic is less than the critical value, we accept the null hypothesis at the
        α = 0.10 level.
Output from other statistical software may look somewhat different from the above
output.

Questions

Grubbs' test can be used to answer the following questions:
     1. Does the data set contain any outliers?
     2. How many outliers does it contain?

Importance

Many statistical techniques are sensitive to the presence of outliers. For example, simple
calculations of the mean and standard deviation may be distorted by a single grossly
inaccurate data point.

Checking for outliers should be a routine part of any data analysis. Potential outliers
should be examined to see if they are possibly erroneous. If the data point is in error, it
should be corrected if possible and deleted if it is not possible. If there is no reason to
believe that the outlying point is in error, it should not be deleted without careful
consideration. However, the use of more robust techniques may be warranted. Robust
techniques will often downweight the effect of outlying points without deleting them.
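As an illustration only (a minimal sketch assuming Python with numpy and scipy, using
simulated data rather than ZARR13.DAT), the Grubbs' statistic and its critical value can
be computed directly from the definitions above:

    import numpy as np
    from scipy import stats

    def grubbs(y, alpha=0.05):
        """Return the Grubbs' statistic G and its critical value at level alpha."""
        y = np.asarray(y, dtype=float)
        n = len(y)
        g = np.max(np.abs(y - y.mean())) / y.std(ddof=1)
        # Upper critical value of the t-distribution with N-2 degrees of freedom
        # at a significance level of alpha/(2N), per the definition above.
        t2 = stats.t.ppf(1.0 - alpha / (2.0 * n), n - 2) ** 2
        g_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t2 / (n - 2 + t2))
        return g, g_crit

    rng = np.random.default_rng(1)
    y = rng.normal(loc=9.26, scale=0.023, size=195)   # hypothetical data
    g, g_crit = grubbs(y)
    print(f"G = {g:.3f}, 5% critical value = {g_crit:.3f}, outlier flagged: {g > g_crit}")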


Related Techniques

Several graphical techniques can, and should, be used to detect outliers. A simple run
sequence plot, a box plot, or a histogram should show any obviously outlying points.
     Run Sequence Plot
     Histogram
     Box Plot
     Normal Probability Plot
     Lag Plot

Case Study

Heat flow meter data.

Software

Some general purpose statistical software programs, including Dataplot, support the
Grubbs' test.

1.3.5.18. Yates Analysis

Purpose: Estimate Factor Effects in a 2-Level Factorial Design

Full factorial and fractional factorial designs are common in designed experiments for
engineering and scientific applications.

In these designs, each factor is assigned two levels. These are typically called the low
and high levels. For computational purposes, the factors are scaled so that the low level
is assigned a value of -1 and the high level is assigned a value of +1. These are also
commonly referred to as "-" and "+".

A full factorial design contains all possible combinations of low/high levels for all the
factors. A fractional factorial design contains a carefully chosen subset of these
combinations. The criterion for choosing the subsets is discussed in detail in the process
improvement chapter.

The Yates analysis exploits the special structure of these designs to generate least
squares estimates for factor effects for all factors and all relevant interactions.

The mathematical details of the Yates analysis are given in chapter 10 of Box, Hunter,
and Hunter (1978).

The Yates analysis is typically complemented by a number of graphical techniques such
as the dex mean plot and the dex contour plot ("dex" represents "design of
experiments"). This is demonstrated in the Eddy current case study.


Yates Order

Before performing a Yates analysis, the data should be arranged in "Yates order". That
is, given k factors, the kth column consists of 2^(k-1) minus signs (i.e., the low level of
the factor) followed by 2^(k-1) plus signs (i.e., the high level of the factor). For
example, for a full factorial design with three factors, the design matrix is

     -   -   -
     +   -   -
     -   +   -
     +   +   -
     -   -   +
     +   -   +
     -   +   +
     +   +   +

Determining the Yates order for fractional factorial designs requires knowledge of the
confounding structure of the fractional factorial design.

Yates Output

A Yates analysis generates the following output.
     1. A factor identifier (from Yates order). The specific identifier will vary depending
        on the program used to generate the Yates analysis. Dataplot, for example, uses
        the following for a 3-factor model.
             1   = factor 1
             2   = factor 2
             3   = factor 3
             12  = interaction of factor 1 and factor 2
             13  = interaction of factor 1 and factor 3
             23  = interaction of factor 2 and factor 3
             123 = interaction of factors 1, 2, and 3
     2. Least squares estimated factor effects ordered from largest in magnitude (most
        significant) to smallest in magnitude (least significant). That is, we obtain a
        ranked list of important factors.
     3. A t-value for the individual factor effect estimates. The t-value is computed as

             t = e / s_e

        where e is the estimated factor effect and s_e is the standard deviation of the
        estimated factor effect.
     4. The residual standard deviation that results from the model with the single term
        only. That is, the residual standard deviation from the model

             response = constant + 0.5 (Xi)

        where Xi is the estimate of the ith factor or interaction effect.
     5. The cumulative residual standard deviation that results from the model using the
        current term plus all terms preceding that term. That is,

             response = constant + 0.5 (all effect estimates down to and including the
             effect of interest)

        This consists of a monotonically decreasing set of residual standard deviations
        (indicating a better fit as the number of terms in the model increases). The first
        cumulative residual standard deviation is for the model

             response = constant

        where the constant is the overall mean of the response variable. The last
        cumulative residual standard deviation is for the model

             response = constant + 0.5*(all factor and interaction estimates)

        This last model will have a residual standard deviation of zero.

Sample Output

Dataplot generated the following Yates analysis output for the Eddy current data set:

     (NOTE--DATA MUST BE IN STANDARD ORDER)
     NUMBER OF OBSERVATIONS           =        8
     NUMBER OF FACTORS                =        3
     NO REPLICATION CASE

     PSEUDO-REPLICATION STAND. DEV.   =   0.20152531564E+00
     PSEUDO-DEGREES OF FREEDOM        =        1
     (THE PSEUDO-REP. STAND. DEV. ASSUMES ALL
     3, 4, 5, ...-TERM INTERACTIONS ARE NOT REAL,
     BUT MANIFESTATIONS OF RANDOM ERROR)

     STANDARD DEVIATION OF A COEF.    =   0.14249992371E+00
     (BASED ON PSEUDO-REP. ST. DEV.)

     GRAND MEAN                       =   0.26587500572E+01
     GRAND STANDARD DEVIATION         =   0.17410624027E+01

     99% CONFIDENCE LIMITS (+-)       =   0.90710897446E+01
     95% CONFIDENCE LIMITS (+-)       =   0.18106349707E+01
     99.5% POINT OF T DISTRIBUTION    =   0.63656803131E+02
     97.5% POINT OF T DISTRIBUTION    =   0.12706216812E+02

     IDENTIFIER   EFFECT       T VALUE      RESSD:      RESSD:
                                            MEAN +      MEAN +
                                            TERM        CUM TERMS
     ----------------------------------------------------------


     MEAN         2.65875                   1.74106     1.74106
     1            3.10250      21.8*        0.57272     0.57272
     2           -0.86750      -6.1         1.81264     0.30429
     23           0.29750       2.1         1.87270     0.26737
     13           0.24750       1.7         1.87513     0.23341
     3            0.21250       1.5         1.87656     0.19121
     123          0.14250       1.0         1.87876     0.18031
     12           0.12750       0.9         1.87912     0.00000

Interpretation of Sample Output

In summary, the Yates analysis provides us with the following ranked list of important
factors along with the estimated effect estimate.
     1. X1:        effect estimate =  3.1025 ohms
     2. X2:        effect estimate = -0.8675 ohms
     3. X2*X3:     effect estimate =  0.2975 ohms
     4. X1*X3:     effect estimate =  0.2475 ohms
     5. X3:        effect estimate =  0.2125 ohms
     6. X1*X2*X3:  effect estimate =  0.1425 ohms
     7. X1*X2:     effect estimate =  0.1275 ohms

Model Selection and Validation

From the above Yates output, we can define the potential models from the Yates
analysis. An important component of a Yates analysis is selecting the best model from
the available potential models.

Once a tentative model has been selected, the error term should follow the assumptions
for a univariate measurement process. That is, the model should be validated by
analyzing the residuals.

Graphical Presentation

Some analysts may prefer a more graphical presentation of the Yates results. In
particular, the following plots may be useful:
     1. Ordered data plot
     2. Ordered absolute effects plot
     3. Cumulative residual standard deviation plot

Questions

The Yates analysis can be used to answer the following questions:
     1. What is the ranked list of factors?
     2. What is the goodness-of-fit (as measured by the residual standard deviation) for
        the various models?

Related Techniques

Multi-factor analysis of variance
Dex mean plot
Block plot
Dex contour plot

Case Study

The Yates analysis is demonstrated in the Eddy current case study.

Software

Many general purpose statistical software programs, including Dataplot, can perform a
Yates analysis.
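A minimal sketch of the effect-estimation idea behind the Yates analysis, assuming
Python with numpy and a hypothetical set of eight responses (this is not the Dataplot
Yates command, and it does not reproduce the full output listing above):

    from itertools import combinations
    import numpy as np

    def yates_design(k):
        """2**k full factorial design matrix of -1/+1 levels in Yates order."""
        n = 2 ** k
        cols = []
        for j in range(1, k + 1):
            run = 2 ** (j - 1)
            block = np.r_[-np.ones(run), np.ones(run)]
            cols.append(np.tile(block, n // (2 * run)))
        return np.column_stack(cols)

    def effect_estimates(y, k):
        """Effect = mean(response at +1) - mean(response at -1) for each term."""
        X = yates_design(k)
        y = np.asarray(y, dtype=float)
        effects = {}
        for order in range(1, k + 1):
            for idx in combinations(range(k), order):
                col = np.prod(X[:, idx], axis=1)
                name = "*".join(f"X{i + 1}" for i in idx)
                effects[name] = y[col > 0].mean() - y[col < 0].mean()
        return y.mean(), effects

    # Hypothetical responses, listed in Yates (standard) order:
    y = [1.1, 4.2, 0.7, 3.5, 1.3, 4.4, 0.9, 3.8]
    grand_mean, effects = effect_estimates(y, k=3)
    print("grand mean:", grand_mean)
    for name, e in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name:9s} effect = {e:+.4f}")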


1.3.5.18.1. Defining Models and Prediction Equations

Parameter Estimates Don't Change as Additional Terms Added

In most cases of least squares fitting, the model coefficients for previously added terms
change depending on what was successively added. For example, the X1 coefficient
might change depending on whether or not an X2 term was included in the model. This
is not the case when the design is orthogonal, as is a 2^3 full factorial design. For
orthogonal designs, the estimates for the previously included terms do not change as
additional terms are added. This means the ranked list of effect estimates
simultaneously serves as the least squares coefficient estimates for progressively more
complicated models.

Yates Table

For convenience, we list the sample Yates output for the Eddy current data set here.

     (NOTE--DATA MUST BE IN STANDARD ORDER)
     NUMBER OF OBSERVATIONS           =        8
     NUMBER OF FACTORS                =        3
     NO REPLICATION CASE

     PSEUDO-REPLICATION STAND. DEV.   =   0.20152531564E+00
     PSEUDO-DEGREES OF FREEDOM        =        1
     (THE PSEUDO-REP. STAND. DEV. ASSUMES ALL
     3, 4, 5, ...-TERM INTERACTIONS ARE NOT REAL,
     BUT MANIFESTATIONS OF RANDOM ERROR)

     STANDARD DEVIATION OF A COEF.    =   0.14249992371E+00
     (BASED ON PSEUDO-REP. ST. DEV.)

     GRAND MEAN                       =   0.26587500572E+01
     GRAND STANDARD DEVIATION         =   0.17410624027E+01

     99% CONFIDENCE LIMITS (+-)       =   0.90710897446E+01
     95% CONFIDENCE LIMITS (+-)       =   0.18106349707E+01
     99.5% POINT OF T DISTRIBUTION    =   0.63656803131E+02
     97.5% POINT OF T DISTRIBUTION    =   0.12706216812E+02

     IDENTIFIER   EFFECT       T VALUE      RESSD:      RESSD:
                                            MEAN +      MEAN +
                                            TERM        CUM TERMS
     ----------------------------------------------------------
     MEAN         2.65875                   1.74106     1.74106
     1            3.10250      21.8*        0.57272     0.57272
     2           -0.86750      -6.1         1.81264     0.30429
     23           0.29750       2.1         1.87270     0.26737
     13           0.24750       1.7         1.87513     0.23341
     3            0.21250       1.5         1.87656     0.19121
     123          0.14250       1.0         1.87876     0.18031
     12           0.12750       0.9         1.87912     0.00000

The last column of the Yates table gives the residual standard deviation for 8 possible
models, each with one more term than the previous model.

Potential Models

For this example, we can summarize the possible prediction equations using the second
and last columns of the Yates table:

     ● Yhat = 2.65875
       has a residual standard deviation of 1.74106 ohms. Note that this is the default
       model. That is, if no factors are important, the model is simply the overall mean.
     ● Yhat = 2.65875 + 0.5 (3.10250 X1)
       has a residual standard deviation of 0.57272 ohms. (Here, X1 is either a +1 or -1,
       and similarly for the other factors and interactions (products).)
     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2)
       has a residual standard deviation of 0.30429 ohms.
     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2 + 0.29750 X2*X3)
       has a residual standard deviation of 0.26737 ohms.
     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2 + 0.29750 X2*X3
              + 0.24750 X1*X3)
       has a residual standard deviation of 0.23341 ohms.
     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2 + 0.29750 X2*X3
              + 0.24750 X1*X3 + 0.21250 X3)
       has a residual standard deviation of 0.19121 ohms.


     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2 + 0.29750 X2*X3
              + 0.24750 X1*X3 + 0.21250 X3 + 0.14250 X1*X2*X3)
       has a residual standard deviation of 0.18031 ohms.
     ● Yhat = 2.65875 + 0.5 (3.10250 X1 - 0.86750 X2 + 0.29750 X2*X3
              + 0.24750 X1*X3 + 0.21250 X3 + 0.14250 X1*X2*X3 + 0.12750 X1*X2)
       has a residual standard deviation of 0.0 ohms. Note that the model with all
       possible terms included will have a zero residual standard deviation. This will
       always occur with an unreplicated two-level factorial design.

Model Selection

The above step lists all the potential models. From this list, we want to select the most
appropriate model. This requires balancing the following two goals.
     1. We want the model to include all important factors.
     2. We want the model to be parsimonious. That is, the model should be as simple as
        possible.
Note that the residual standard deviation alone is insufficient for determining the most
appropriate model as it will always be decreased by adding additional factors. The next
section describes a number of approaches for determining which factors (and
interactions) to include in the model.
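As a small illustrative sketch (Python assumed; the coefficients are taken directly from
the Yates table above), one of the listed prediction equations can be evaluated at the
coded factor settings:

    def predict(x1, x2):
        """Two-term model: Yhat = 2.65875 + 0.5*(3.10250*X1 - 0.86750*X2)."""
        return 2.65875 + 0.5 * (3.10250 * x1 - 0.86750 * x2)

    # Coded settings: -1 = low level, +1 = high level.
    for x1 in (-1, +1):
        for x2 in (-1, +1):
            print(f"X1={x1:+d}, X2={x2:+d} -> predicted response = "
                  f"{predict(x1, x2):.4f} ohms")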
1.3.5.18.2. Important Factors

Identify Important Factors

The Yates analysis generates a large number of potential models. From this list, we want
to select the most appropriate model. This requires balancing the following two goals.
     1. We want the model to include all important factors.
     2. We want the model to be parsimonious. That is, the model should be as simple as
        possible.
In short, we want our model to include all the important factors and interactions and to
omit the unimportant factors and interactions.

Seven criteria are utilized to define important factors. These seven criteria are not all
equally important, nor will they yield identical subsets, in which case a consensus subset
or a weighted consensus subset must be extracted. In practice, some of these criteria may
not apply in all situations.

These criteria will be examined in the context of the Eddy current data set. The Yates
Analysis page gave the sample Yates output for these data and the Defining Models and
Predictions page listed the potential models from the Yates analysis.

In practice, not all of these criteria will be used with every analysis (and some analysts
may have additional criteria). These criteria are given as useful guidelines. Most analysts
will focus on those criteria that they find most useful.

Criteria for Including Terms in the Model

The seven criteria that we can use in determining whether to keep a factor in the model
can be summarized as follows.
     1. Effects: Engineering Significance
     2. Effects: Order of Magnitude
     3. Effects: Statistical Significance
     4. Effects: Probability Plots
     5. Averages: Youden Plot
     6. Residual Standard Deviation: Engineering Significance
     7. Residual Standard Deviation: Statistical Significance
The first four criteria focus on effect estimates with three numeric criteria and one
graphical criterion. The fifth criterion focuses on averages. The last two criteria focus on
the residual standard deviation of the model. We discuss each of these seven criteria in
detail in the following sections. The last section summarizes the conclusions based on all
of the criteria.
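As a small numerical sketch of the three numeric criteria (Python assumed; the 2.5 ohm
production average and the use of the three-factor interaction as a rough standard
deviation follow the worked example in the sections that follow):

    effects = {"X1": 3.10250, "X2": -0.86750, "X2*X3": 0.29750, "X1*X3": 0.24750,
               "X3": 0.21250, "X1*X2*X3": 0.14250, "X1*X2": 0.12750}

    cutoffs = {
        "engineering":        0.10 * 2.5,                                    # 10% of production average
        "order of magnitude": 0.10 * max(abs(e) for e in effects.values()),  # 10% of largest effect
        "statistical":        2.0 * abs(effects["X1*X2*X3"]),                # roughly 2 standard deviations
    }

    for name, e in effects.items():
        kept = [label for label, cut in cutoffs.items() if abs(e) > cut]
        print(f"{name:9s} {e:+.5f}  kept by: {', '.join(kept) if kept else 'none'}")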


3. ignoring 3-term interactions and higher interactions leads to an estimate of based on


Effects: The minimum engineering significant difference is defined as omitting only a single term: the X1*X2*X3 interaction.
Engineering For the current example, if one assumes that the 3-term interaction is nil and hence represents a
Significance single drawing from a population centered at zero, then an estimate of the standard deviation of
where is the absolute value of the parameter estimate (i.e., the effect) and is the minimum an effect is simply the estimate of the 3-factor interaction (0.1425). In the Dataplot output for our
engineering significant difference. example, this is the effect estimate for the X1*X2*X3 interaction term (the EFFECT column for
the row labeled "123"). Two standard deviations is thus 0.2850. For this example, the rule is thus
That is, declare a factor as "important" if the effect is greater than some a priori declared
to keep all > 0.2850.
engineering difference. This implies that the engineering staff have in fact stated what a minimum
effect will be. Oftentimes this is not the case. In the absence of an a priori difference, a good This results in keeping three terms: X1 (3.10250), X2 (-.86750), and X1*X2 (.29750).
rough rule for the minimum engineering significant is to keep only those factors whose effect
is greater than, say, 10% of the current production average. In this case, let's say that the average
Effects: Probability plots can be used in the following manner.
detector has a sensitivity of 2.5 ohms. This would suggest that we would declare all factors whose
Probability 1. Normal Probability Plot: Keep a factor as "important" if it is well off the line through zero
effect is greater than 10% of 2.5 ohms = 0.25 ohm to be significant (from an engineering point of
Plots on a normal probability plot of the effect estimates.
view).
2. Half-Normal Probability Plot: Keep a factor as "important" if it is well off the line near
Based on this minimum engineering significant difference criterion, we conclude that we should
zero on a half-normal probability plot of the absolute value of effect estimates.
keep two terms: X1 and X2.
Both of these methods are based on the fact that the least squares estimates of effects for these
Effects: The order of magnitude criterion is defined as 2-level orthogonal designs are simply the difference of averages and so the central limit theorem,
Order of loosely applied, suggests that (if no factor were important) the effect estimates should have
Magnitude approximately a normal distribution with mean zero and the absolute value of the estimates
That is, exclude any factor that is less than 10% of the maximum effect size. We may or may not should have a half-normal distribution.
keep the other factors. This criterion is neither engineering nor statistical, but it does offer some Since the half-normal probability plot is only concerned with effect magnitudes as opposed to
additional numerical insight. For the current example, the largest effect is from X1 (3.10250 signed effects (which are subject to the vagaries of how the initial factor codings +1 and -1 were
ohms), and so 10% of that is 0.31 ohms, which suggests keeping all factors whose effects exceed assigned), the half-normal probability plot is preferred by some over the normal probability plot.
0.31 ohms.
Based on the order-of-magnitude criterion, we thus conclude that we should keep two terms: X1 Normal The following half-normal plot shows the normal probability plot of the effect estimates and the
and X2. A third term, X2*X3 (.29750), is just slightly under the cutoff level, so we may consider Probablity half-normal probability plot of the absolute value of the estimates for the Eddy current data.
keeping it based on the other criterion. Plot of
Effects and
Effects: Statistical significance is defined as Half-Normal
Statistical Probability
Significance Plot of
Effects
That is, declare a factor as important if its effect is more than 2 standard deviations away from 0
(0, by definition, meaning "no effect").
The "2" comes from normal theory (more specifically, a value of 1.96 yields a 95% confidence
interval). More precise values would come from t-distribution theory.
The difficulty with this is that in order to invoke this criterion we need the standard deviation, ,
of an observation. This is problematic because
1. the engineer may not know ;
2. the experiment might not have replication, and so a model-free estimate of is not
obtainable;
3. obtaining an estimate of by assuming the sometimes- employed assumption of ignoring
3-term interactions and higher may be incorrect from an engineering point of view.
For the Eddy current example:
1. the engineer did not know ;
2. the design (a 23 full factorial) did not have replication;


Youden Plot of Effect Estimates

The following is the Youden plot of the effect estimates for the Eddy current data.

For the example at hand, both probability plots clearly show two factors displaced off the line,
and from the third plot (with factor tags included), we see that those two factors are factor 1 and
factor 2. All of the remaining five effects are behaving like random drawings from a normal
distribution centered at zero, and so are deemed to be statistically non-significant. In conclusion, For the example at hand, the Youden plot clearly shows a cluster of points near the grand average
this rule keeps two factors: X1 (3.10250) and X2 (-.86750). (2.65875) with two displaced points above (factor 1) and below (factor 2). Based on the Youden
plot, we conclude to keep two factors: X1 (3.10250) and X2 (-.86750).
Effects: A Youden plot can be used in the following way. Keep a factor as "important" if it is displaced
Youden Plot away from the central-tendancy "bunch" in a Youden plot of high and low averages. By Residual This criterion is defined as
definition, a factor is important when its average response for the low (-1) setting is significantly Standard
Deviation: Residual Standard Deviation > Cutoff
different from its average response for the high (+1) setting. Conversely, if the low and high
averages are about the same, then what difference does it make which setting to use and so why Engineering That is, declare a factor as "important" if the cumulative model that includes the factor (and all
would such a factor be considered important? This fact in combination with the intrinsic benefits Significance larger factors) has a residual standard deviation smaller than an a priori engineering-specified
of the Youden plot for comparing pairs of items leads to the technique of generating a Youden minimum residual standard deviation.
plot of the low and high averages. This criterion is different from the others in that it is model focused. In practice, this criterion
states that starting with the largest effect, we cumulatively keep adding terms to the model and
monitor how the residual standard deviation for each progressively more complicated model
becomes smaller. At some point, the cumulative model will become complicated enough and
comprehensive enough that the resulting residual standard deviation will drop below the
pre-specified engineering cutoff for the residual standard deviation. At that point, we stop adding
terms and declare all of the model-included terms to be "important" and everything not in the
model to be "unimportant".
This approach implies that the engineer has considered what a minimum residual standard
deviation should be. In effect, this relates to what the engineer can tolerate for the magnitude of
the typical residual (= difference between the raw data and the predicted value from the model).


In other words, how good does the engineer want the prediction equation to be. Unfortunately,
this engineering specification has not always been formulated and so this criterion can become Conclusions In summary, the seven criteria for specifying "important" factors yielded the following for the
moot. Eddy current data:
In the absence of a prior specified cutoff, a good rough rule for the minimum engineering residual 1. Effects, Engineering Significance: X1, X2
standard deviation is to keep adding terms until the residual standard deviation just dips below, 2. Effects, Numerically Significant: X1, X2
say, 5% of the current production average. For the Eddy current data, let's say that the average 3. Effects, Statistically Significant: X1, X2, X2*X3
detector has a sensitivity of 2.5 ohms. Then this would suggest that we would keep adding terms 4. Effects, Probability Plots: X1, X2
to the model until the residual standard deviation falls below 5% of 2.5 ohms = 0.125 ohms. 5. Averages, Youden Plot: X1, X2
6. Residual SD, Engineering Significance: all 7 terms
Based on the minimum residual standard deviation criteria, and by scanning the far right column 7. Residual SD, Statistical Significance: not applicable
of the Yates table, we would conclude to keep the following terms:
Such conflicting results are common. Arguably, the three most important criteria (listed in order
1. X1 (with a cumulative residual standard deviation = 0.57272) of most important) are:
2. X2 (with a cumulative residual standard deviation = 0.30429)
3. X2*X3 (with a cumulative residual standard deviation = 0.26737) 4. Effects, Probability Plots: X1, X2
4. X1*X3 (with a cumulative residual standard deviation = 0.23341) 1. Effects, Engineering Significance: X1, X2
5. X3 (with a cumulative residual standard deviation = 0.19121) 3. Residual SD, Engineering Significance: all 7 terms
6. X1*X2*X3 (with a cumulative residual standard deviation = 0.18031) Scanning all of the above, we thus declare the following consensus for the Eddy current data:
7. X1*X2 (with a cumulative residual standard deviation = 0.00000)
1. Important Factors: X1 and X2
Note that we must include all terms in order to drive the residual standard deviation below 0.125. 2. Parsimonious Prediction Equation:
Again, the 5% rule is a rough-and-ready rule that has no basis in engineering or statistics, but is
simply a "numerics". Ideally, the engineer has a better cutoff for the residual standard deviation
that is based on how well he/she wants the equation to perform in practice. If such a number were
(with a residual standard deviation of .30429 ohms)
available, then for this criterion and data set we would select something less than the entire
collection of terms. Note that this is the initial model selection. We still need to perform model validation with a
residual analysis.
Residual This criterion is defined as
Standard Residual Standard Deviation >
Deviation:
where is the standard deviation of an observation under replicated conditions.
Statistical
Significance That is, declare a term as "important" until the cumulative model that includes the term has a
residual standard deviation smaller than . In essence, we are allowing that we cannot demand a
model fit any better than what we would obtain if we had replicated data; that is, we cannot
demand that the residual standard deviation from any fitted model be any smaller than the
(theoretical or actual) replication standard deviation. We can drive the fitted standard deviation
down (by adding terms) until it achieves a value close to , but to attempt to drive it down further
means that we are, in effect, trying to fit noise.
In practice, this criterion may be difficult to apply because
1. the engineer may not know ;
2. the experiment might not have replication, and so a model-free estimate of is not
obtainable.
For the current case study:
1. the engineer did not know ;
2. the design (a 23 full factorial) did not have replication. The most common way of having
replication in such designs is to have replicated center points at the center of the cube
((X1,X2,X3) = (0,0,0)).
Thus for this current case, this criteria could not be used to yield a subset of "important" factors.



1.3.6. Probability Distributions

Probability Distributions

Probability distributions are a fundamental concept in statistics. They are used both on a
theoretical level and a practical level.

Some practical uses of probability distributions are:
     ● To calculate confidence intervals for parameters and to calculate critical regions
       for hypothesis tests.
     ● For univariate data, it is often useful to determine a reasonable distributional
       model for the data.
     ● Statistical intervals and hypothesis tests are often based on specific distributional
       assumptions. Before computing an interval or test based on a distributional
       assumption, we need to verify that the assumption is justified for the given data
       set. In this case, the distribution does not need to be the best-fitting distribution
       for the data, but an adequate enough model so that the statistical technique yields
       valid conclusions.
     ● Simulation studies with random numbers generated from using a specific
       probability distribution are often needed.

Table of Contents
     1. What is a probability distribution?
     2. Related probability functions
     3. Families of distributions
     4. Location and scale parameters
     5. Estimating the parameters of a distribution
     6. A gallery of common distributions
     7. Tables for probability distributions

1.3.6.1. What is a Probability Distribution

Discrete Distributions

The mathematical definition of a discrete probability function, p(x), is a function that
satisfies the following properties.
     1. The probability that x can take a specific value is p(x). That is

            Pr[X = x] = p(x) = p_x

     2. p(x) is non-negative for all real x.
     3. The sum of p(x) over all possible values of x is 1, that is

            sum over all j of p_j = 1

        where j represents all possible values that x can have and p_j is the probability at
        x_j.
One consequence of properties 2 and 3 is that 0 <= p(x) <= 1.

What does this actually mean? A discrete probability function is a function that can take
a discrete number of values (not necessarily finite). This is most often the non-negative
integers or some subset of the non-negative integers. There is no mathematical
restriction that discrete probability functions only be defined at integers, but in practice
this is usually what makes sense. For example, if you toss a coin 6 times, you can get 2
heads or 3 heads but not 2 1/2 heads. Each of the discrete values has a certain probability
of occurrence that is between zero and one. That is, a discrete function that allows
negative values or values greater than one is not a probability function. The condition
that the probabilities sum to one means that at least one of the values has to occur.



Continuous Distributions

The mathematical definition of a continuous probability function, f(x), is a function that
satisfies the following properties.
     1. The probability that x is between two points a and b is

            Pr(a <= x <= b) = ∫_a^b f(x) dx

     2. It is non-negative for all real x.
     3. The integral of the probability function is one, that is

            ∫_{-inf}^{+inf} f(x) dx = 1

What does this actually mean? Since continuous probability functions are defined for an
infinite number of points over a continuous interval, the probability at a single point is
always zero. Probabilities are measured over intervals, not single points. That is, the area
under the curve between two distinct points defines the probability for that interval. This
means that the height of the probability function can in fact be greater than one. The
property that the integral must equal one is equivalent to the property for discrete
distributions that the sum of all the probabilities must equal one.

Probability Mass Functions Versus Probability Density Functions

Discrete probability functions are referred to as probability mass functions and
continuous probability functions are referred to as probability density functions. The
term probability functions covers both discrete and continuous distributions. When we
are referring to probability functions in generic terms, we may use the term probability
density functions to mean both discrete and continuous probability functions.

1.3.6.2. Related Distributions

Probability distributions are typically defined in terms of the probability density
function. However, there are a number of probability functions used in applications.

Probability Density Function

For a continuous function, the probability density function (pdf) is the probability that
the variate has the value x. Since for continuous distributions the probability at a single
point is zero, this is often expressed in terms of an integral between two points.

     ∫_a^b f(x) dx = Pr[a <= X <= b]

For a discrete distribution, the pdf is the probability that the variate takes the value x.

     f(x) = Pr[X = x]

The following is the plot of the normal probability density function.
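A minimal sketch of these points (Python with numpy and scipy assumed): a pdf can rise
above one, but its integral is still one, and probabilities come from areas under the curve.

    import numpy as np
    from scipy import stats, integrate

    dist = stats.norm(loc=0.0, scale=0.1)          # small scale, so the peak is well above 1
    print(dist.pdf(0.0))                           # about 3.99, which is a legitimate pdf value

    area, _ = integrate.quad(dist.pdf, -np.inf, np.inf)
    print(area)                                    # integrates to 1

    prob, _ = integrate.quad(dist.pdf, -0.1, 0.1)  # Pr[-0.1 <= X <= 0.1] as an area
    print(prob, dist.cdf(0.1) - dist.cdf(-0.1))    # same value obtained from the cdf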



Cumulative Distribution Function

The cumulative distribution function (cdf) is the probability that the variable takes a
value less than or equal to x. That is

     F(x) = Pr[X <= x]

For a continuous distribution, this can be expressed mathematically as

     F(x) = ∫_{-inf}^{x} f(t) dt

For a discrete distribution, the cdf can be expressed as

     F(x) = sum of f(i) over all i <= x

The following is the plot of the normal cumulative distribution function.

The horizontal axis is the allowable domain for the given probability function. Since the
vertical axis is a probability, it must fall between zero and one. It increases from zero to
one as we go from left to right on the horizontal axis.

Percent Point Function

The percent point function (ppf) is the inverse of the cumulative distribution function.
For this reason, the percent point function is also commonly referred to as the inverse
distribution function. That is, for a distribution function we calculate the probability that
the variable is less than or equal to x for a given x. For the percent point function, we
start with the probability and compute the corresponding x for the cumulative
distribution. Mathematically, this can be expressed as

     Pr[X <= G(α)] = α

or alternatively

     x = G(α) = F^(-1)(α)

The following is the plot of the normal percent point function.
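A minimal sketch of the cdf/percent point relationship for the standard normal
distribution (Python with scipy assumed):

    from scipy import stats

    z = stats.norm(0, 1)
    print(z.cdf(1.96))      # F(1.96)  is approximately 0.975
    print(z.ppf(0.975))     # G(0.975) is approximately 1.96, so G(F(x)) = x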



Since the horizontal axis is a probability, it goes from zero to one. The vertical axis goes
from the smallest to the largest value of the cumulative distribution function.

Hazard Function

The hazard function is the ratio of the probability density function to the survival
function, S(x).

     h(x) = f(x) / S(x) = f(x) / (1 - F(x))

The following is the plot of the normal distribution hazard function.

Hazard plots are most commonly used in reliability applications. Note that Johnson,
Kotz, and Balakrishnan refer to this as the conditional failure density function rather
than the hazard function.

Cumulative Hazard Function

The cumulative hazard function is the integral of the hazard function. It can be
interpreted as the probability of failure at time x given survival until time x.

     H(x) = ∫_{-inf}^{x} h(t) dt

This can alternatively be expressed as

     H(x) = -ln(1 - F(x))

The following is the plot of the normal cumulative hazard function.



Cumulative hazard plots are most commonly used in reliability applications. Note that
Johnson, Kotz, and Balakrishnan refer to this as the hazard function rather than the
cumulative hazard function.

Survival Function

Survival functions are most often used in reliability and related fields. The survival
function is the probability that the variate takes a value greater than x.

     S(x) = Pr[X > x] = 1 - F(x)

The following is the plot of the normal distribution survival function.

For a survival function, the y value on the graph starts at 1 and monotonically decreases
to zero. The survival function should be compared to the cumulative distribution
function.

Inverse Survival Function

Just as the percent point function is the inverse of the cumulative distribution function,
the survival function also has an inverse function. The inverse survival function can be
defined in terms of the percent point function.

     Z(α) = G(1 - α)

The following is the plot of the normal distribution inverse survival function.
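A minimal sketch tying these functions together for the normal distribution (Python with
numpy and scipy assumed); each quantity is built from the pdf and cdf as defined above:

    import numpy as np
    from scipy import stats

    z, x = stats.norm(0, 1), 1.0
    print(z.sf(x), 1 - z.cdf(x))       # survival function S(x) = 1 - F(x)
    print(z.isf(0.10), z.ppf(0.90))    # inverse survival Z(alpha) = G(1 - alpha)
    print(z.pdf(x) / z.sf(x))          # hazard h(x) = f(x) / S(x)
    print(-np.log(z.sf(x)))            # cumulative hazard H(x) = -ln(1 - F(x))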



As with the percent point function, the horizontal axis is a probability. Therefore the
horizontal axis goes from 0 to 1 regardless of the particular distribution. The appearance
is similar to the percent point function. However, instead of going from the smallest to
the largest value on the vertical axis, it goes from the largest to the smallest value.

1.3.6.3. Families of Distributions

Shape Parameters

Many probability distributions are not a single distribution, but are in fact a family of
distributions. This is due to the distribution having one or more shape parameters.

Shape parameters allow a distribution to take on a variety of shapes, depending on the
value of the shape parameter. These distributions are particularly useful in modeling
applications since they are flexible enough to model a variety of data sets.

Example: Weibull Distribution

The Weibull distribution is an example of a distribution that has a shape parameter. The
following graph plots the Weibull pdf with the following values for the shape parameter:
0.5, 1.0, 2.0, and 5.0.

The shapes above include an exponential distribution, a right-skewed distribution, and a
relatively symmetric distribution.

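A minimal sketch of the shape-parameter idea (Python with numpy and scipy assumed):
evaluating the Weibull pdf at the four shape values mentioned above shows how different
the resulting curves are.

    import numpy as np
    from scipy import stats

    x = np.linspace(0.1, 3.0, 6)
    for gamma in (0.5, 1.0, 2.0, 5.0):
        print(gamma, np.round(stats.weibull_min.pdf(x, gamma), 3))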


The Weibull distribution has a relatively simple distributional form. However, the shape
parameter allows the Weibull to assume a wide variety of shapes. This combination of
simplicity and flexibility in the shape of the Weibull distribution has made it an effective
distributional model in reliability applications. This ability to model a wide variety of
distributional shapes using a relatively simple distributional form is possible with many
other distributional families as well.

PPCC Plots

The PPCC plot is an effective graphical tool for selecting the member of a distributional
family with a single shape parameter that best fits a given set of data.

1.3.6.4. Location and Scale Parameters

Normal PDF

A probability distribution is characterized by location and scale parameters. Location
and scale parameters are typically used in modeling applications.

For example, the following graph is the probability density function for the standard
normal distribution, which has the location parameter equal to zero and scale parameter
equal to one.



Location Parameter

The next plot shows the probability density function for a normal distribution with a
location parameter of 10 and a scale parameter of 1.

The effect of the location parameter is to translate the graph, relative to the standard
normal distribution, 10 units to the right on the horizontal axis. A location parameter of
-10 would have shifted the graph 10 units to the left on the horizontal axis.

That is, a location parameter simply shifts the graph left or right on the horizontal axis.

Scale Parameter

The next plot has a scale parameter of 3 (and a location parameter of zero). The effect of
the scale parameter is to stretch out the graph. The maximum y value is approximately
0.13 as opposed 0.4 in the previous graphs. The y value, i.e., the vertical axis value,
approaches zero at about (+/-) 9 as opposed to (+/-) 3 with the first graph.

In contrast, the next graph has a scale parameter of 1/3 (=0.333). The effect of this scale
parameter is to squeeze the pdf. That is, the maximum y value is approximately 1.2 as
opposed to 0.4 and the y value is near zero at (+/-) 1 as opposed to (+/-) 3.

The effect of a scale parameter greater than one is to stretch the pdf. The greater the
magnitude, the greater the stretching. The effect of a scale parameter less than one is to
compress the pdf. The compressing approaches a spike as the scale parameter goes to
zero.



A scale parameter of 1 leaves the pdf unchanged (if the scale parameter is 1 to begin
with) and non-positive scale parameters are not allowed.

Location and Scale Together

The following graph shows the effect of both a location and a scale parameter. The plot
has been shifted right 10 units and stretched by a factor of 3.

Relationship to Mean and Standard Deviation

For the normal distribution, the location and scale parameters correspond to the mean
and standard deviation, respectively. However, this is not necessarily true for other
distributions. In fact, it is not true for most distributions.

Standard Form

The standard form of any distribution is the form that has location parameter zero and
scale parameter one.

It is common in statistical software packages to only compute the standard form of the
distribution. There are formulas for converting from the standard form to the form with
other location and scale parameters. These formulas are independent of the particular
probability distribution.

Formulas for Location and Scale Based on the Standard Form

The following are the formulas for computing various probability functions based on
the standard form of the distribution. The parameter a refers to the location parameter
and the parameter b refers to the scale parameter. Shape parameters are not included.

     Cumulative Distribution Function    F(x;a,b) = F((x-a)/b;0,1)
     Probability Density Function        f(x;a,b) = (1/b)f((x-a)/b;0,1)
     Percent Point Function              G(α;a,b) = a + bG(α;0,1)
     Hazard Function                     h(x;a,b) = (1/b)h((x-a)/b;0,1)
     Cumulative Hazard Function          H(x;a,b) = H((x-a)/b;0,1)
     Survival Function                   S(x;a,b) = S((x-a)/b;0,1)
     Inverse Survival Function           Z(α;a,b) = a + bZ(α;0,1)
     Random Numbers                      Y(a,b) = a + bY(0,1)
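A minimal sketch verifying a few of these formulas for the normal distribution (Python
with scipy assumed), with a as the location and b as the scale:

    from scipy import stats

    a, b, x, alpha = 10.0, 3.0, 12.0, 0.9
    print(stats.norm.pdf(x, loc=a, scale=b),
          stats.norm.pdf((x - a) / b) / b)            # f(x;a,b) = (1/b)f((x-a)/b;0,1)
    print(stats.norm.cdf(x, loc=a, scale=b),
          stats.norm.cdf((x - a) / b))                # F(x;a,b) = F((x-a)/b;0,1)
    print(stats.norm.ppf(alpha, loc=a, scale=b),
          a + b * stats.norm.ppf(alpha))              # G(alpha;a,b) = a + bG(alpha;0,1)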



1.3.6.5. Estimating the Parameters of a Distribution

Model a univariate data set with a probability distribution

One common application of probability distributions is modeling univariate data with a
specific probability distribution. This involves the following two steps:
     1. Determination of the "best-fitting" distribution.
     2. Estimation of the parameters (shape, location, and scale parameters) for that
        distribution.

Various Methods

There are various methods, both numerical and graphical, for estimating the parameters
of a probability distribution.
     1. Method of moments
     2. Maximum likelihood
     3. Least squares
     4. PPCC and probability plots

1.3.6.5.1. Method of Moments

Method of Moments

The method of moments equates sample moments to parameter estimates. When
moment methods are available, they have the advantage of simplicity. The disadvantage
is that they are often not available and they do not have the desirable optimality
properties of maximum likelihood and least squares estimators.

The primary use of moment estimates is as starting values for the more precise
maximum likelihood and least squares estimates.

Software

Most general purpose statistical software does not include explicit method of moments
parameter estimation commands. However, when utilized, the method of moment
formulas tend to be straightforward and can be easily implemented in most statistical
software programs.
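As a minimal sketch of moment matching (Python with numpy assumed; the gamma
distribution and the simulated data are an assumed example, not a Handbook-prescribed
procedure): equate the sample mean and variance to the gamma mean k*theta and
variance k*theta^2 and solve for the two parameters.

    import numpy as np

    rng = np.random.default_rng(2)
    x = rng.gamma(shape=2.5, scale=1.5, size=500)    # hypothetical data

    m, v = x.mean(), x.var(ddof=1)
    theta_hat = v / m                                 # scale estimate
    k_hat = m / theta_hat                             # shape estimate
    print(k_hat, theta_hat)   # useful mainly as starting values for ML or least squares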



1.3.6.5.2. Maximum Likelihood

Maximum Likelihood

Maximum likelihood estimation begins with the mathematical expression known as a
likelihood function of the sample data. Loosely speaking, the likelihood of a set of data
is the probability of obtaining that particular set of data given the chosen probability
model. This expression contains the unknown parameters. Those values of the parameter
that maximize the sample likelihood are known as the maximum likelihood estimates.

The reliability chapter contains some examples of the likelihood functions for a few of
the commonly used distributions in reliability analysis.

Advantages

The advantages of this method are:
     ● Maximum likelihood provides a consistent approach to parameter estimation
       problems. This means that maximum likelihood estimates can be developed for a
       large variety of estimation situations. For example, they can be applied in
       reliability analysis to censored data under various censoring models.
     ● Maximum likelihood methods have desirable mathematical and optimality
       properties. Specifically,
          1. They become minimum variance unbiased estimators as the sample size
             increases. By unbiased, we mean that if we take (a very large number of)
             random samples with replacement from a population, the average value of
             the parameter estimates will be theoretically exactly equal to the
             population value. By minimum variance, we mean that the estimator has
             the smallest variance, and thus the narrowest confidence interval, of all
             estimators of that type.
          2. They have approximate normal distributions and approximate sample
             variances that can be used to generate confidence bounds and hypothesis
             tests for the parameters.
     ● Several popular statistical software packages provide excellent algorithms for
       maximum likelihood estimates for many of the commonly used distributions.
       This helps mitigate the computational complexity of maximum likelihood
       estimation.

Disadvantages

The disadvantages of this method are:
     ● The likelihood equations need to be specifically worked out for a given
       distribution and estimation problem. The mathematics is often non-trivial,
       particularly if confidence intervals for the parameters are desired.
     ● The numerical estimation is usually non-trivial. Except for a few cases where the
       maximum likelihood formulas are in fact simple, it is generally best to rely on
       high quality statistical software to obtain maximum likelihood estimates.
       Fortunately, high quality maximum likelihood software is becoming increasingly
       common.
     ● Maximum likelihood estimates can be heavily biased for small samples. The
       optimality properties may not apply for small samples.
     ● Maximum likelihood can be sensitive to the choice of starting values.

Software

Most general purpose statistical software programs support maximum likelihood
estimation (MLE) in some form. MLE estimation can be supported in two ways.
     1. A software program may provide a generic function minimization (or
        equivalently, maximization) capability. This is also referred to as function
        optimization. Maximum likelihood estimation is essentially a function
        optimization problem. This type of capability is particularly common in
        mathematical software programs.
     2. A software program may provide MLE computations for a specific problem. For
        example, it may generate ML estimates for the parameters of a Weibull
        distribution. Statistical software programs will often provide ML estimates for
        many specific problems even when they do not support general function
        optimization.
The advantage of function minimization software is that it can be applied to many
different MLE problems.



The drawback is that you have to specify the maximum likelihood equations to the
software. As the functions can be non-trivial, there is potential for error in entering the
equations.

The advantage of the specific MLE procedures is that greater efficiency and better
numerical stability can often be obtained by taking advantage of the properties of the
specific estimation problem. The specific methods often return explicit confidence
intervals. In addition, you do not have to know or specify the likelihood equations to the
software. The disadvantage is that each MLE problem must be specifically coded.

Dataplot supports MLE for a limited number of distributions.
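As a minimal sketch of the second approach (a library routine that performs the MLE
computations for a specific distribution), scipy's distribution fit method can be used; the
Weibull choice and the simulated data are assumptions for illustration only:

    from scipy import stats

    data = stats.weibull_min.rvs(1.8, scale=2.0, size=200, random_state=3)

    # fit() maximizes the likelihood numerically; the location parameter is
    # held at zero here so only the shape and scale are estimated.
    shape, loc, scale = stats.weibull_min.fit(data, floc=0)
    print(shape, loc, scale)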
1.3.6.5.3. Least Squares

Least Squares

Non-linear least squares provides an alternative to maximum likelihood.

Advantages The advantages of this method are:


● Non-linear least squares software may be available in many
statistical software packages that do not support maximum
likelihood estimates.
● It can be applied more generally than maximum likelihood.
That is, if your software provides non-linear fitting and it has
the ability to specify the probability function you are interested
in, then you can generate least squares estimates for that
distribution. This will allow you to obtain reasonable estimates
for distributions even if the software does not provide
maximum likelihood estimates.

Disadvantages The disadvantages of this method are:


● It is not readily applicable to censored data.

● It is generally considered to have less desirable optimality


properties than maximum likelihood.
● It can be quite sensitive to the choice of starting values.

Software Non-linear least squares fitting is available in many general purpose


statistical software programs. The macro developed for Dataplot can
be adapted to many software programs that provide least squares
estimation.
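As a minimal sketch of the idea (not the Dataplot macro mentioned above; Python with
numpy and scipy assumed), a distribution can be fit by non-linear least squares by
matching its cdf to the empirical cdf:

    import numpy as np
    from scipy import optimize, stats

    data = np.sort(stats.weibull_min.rvs(1.8, scale=2.0, size=200, random_state=4))
    ecdf = (np.arange(1, len(data) + 1) - 0.5) / len(data)   # plotting positions

    def weibull_cdf(x, shape, scale):
        return stats.weibull_min.cdf(x, shape, scale=scale)

    popt, _ = optimize.curve_fit(weibull_cdf, data, ecdf,
                                 p0=[1.0, 1.0], bounds=(0.01, 100.0))
    print(popt)   # least squares estimates of the shape and scale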



1.3.6.5.4. PPCC and Probability Plots 1.3.6.5.4. PPCC and Probability Plots

1.3.6.5.4. PPCC and Probability Plots

PPCC and Probability Plots    The PPCC plot can be used to estimate the shape parameter of a distribution with a single shape parameter. After finding the best value of the shape parameter, the probability plot can be used to estimate the location and scale parameters of a probability distribution.

Advantages    The advantages of this method are:
● It is based on two well-understood concepts.
    1. The linearity (i.e., straightness) of the probability plot is a good measure of the adequacy of the distributional fit.
    2. The correlation coefficient between the points on the probability plot is a good measure of the linearity of the probability plot.
● It is an easy technique to implement for a wide variety of distributions with a single shape parameter. The basic requirement is to be able to compute the percent point function, which is needed in the computation of both the probability plot and the PPCC plot.
● The PPCC plot provides insight into the sensitivity of the shape parameter. That is, if the PPCC plot is relatively flat in the neighborhood of the optimal value of the shape parameter, this is a strong indication that the fitted model will not be sensitive to small deviations, or even large deviations in some cases, in the value of the shape parameter.
● The maximum correlation value provides a method for comparing across distributions as well as identifying the best value of the shape parameter for a given distribution. For example, we could use the PPCC and probability fits for the Weibull, lognormal, and possibly several other distributions. Comparing the maximum correlation coefficient achieved for each distribution can help in selecting which is the best distribution to use.

Disadvantages    The disadvantages of this method are:
● It is limited to distributions with a single shape parameter.
● PPCC plots are not widely available in statistical software packages other than Dataplot (Dataplot provides PPCC plots for 40+ distributions). Probability plots are generally available. However, many statistical software packages only provide them for a limited number of distributions.
● Significance levels for the correlation coefficient (i.e., if the maximum correlation value is above a given value, then the distribution provides an adequate fit for the data with a given confidence level) have only been worked out for a limited number of distributions.

Case Study    The airplane glass failure time case study demonstrates the use of the PPCC and probability plots in finding the best distributional model and the parameter estimation of the distributional model.

Other Graphical Methods    For reliability applications, the hazard plot and the Weibull plot are alternative graphical methods that are commonly used to estimate parameters.
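A minimal sketch of the PPCC idea follows (an assumption for illustration: Python with numpy/scipy, not Dataplot). The shape parameter is scanned, the probability-plot correlation coefficient is recorded at each value, and the location and scale are then read off the probability plot at the best shape value.

    import numpy as np
    from scipy import stats

    x = stats.weibull_min.rvs(2.0, scale=5.0, size=100, random_state=1)

    shapes = np.linspace(0.5, 5.0, 200)
    ppcc = [stats.probplot(x, sparams=(c,), dist="weibull_min")[1][2] for c in shapes]
    best_shape = shapes[int(np.argmax(ppcc))]

    # Probability plot at the best shape: the slope estimates the scale and the
    # intercept estimates the location.
    (osm, osr), (slope, intercept, r) = stats.probplot(x, sparams=(best_shape,),
                                                       dist="weibull_min")
    print(best_shape, slope, intercept, r)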

1.3.6.6. Gallery of Distributions

Gallery of Common Distributions    Detailed information on a few of the most common distributions is available below. There are a large number of distributions used in statistical applications. It is beyond the scope of this Handbook to discuss more than a few of these. Two excellent sources for additional detailed information on a large array of distributions are Johnson, Kotz, and Balakrishnan and Evans, Hastings, and Peacock. Equations for the probability functions are given for the standard form of the distribution. Formulas exist for defining the functions with location and scale parameters in terms of the standard form of the distribution.

The sections on parameter estimation are restricted to the method of moments and maximum likelihood. This is because the least squares and PPCC and probability plot estimation procedures are generic. The maximum likelihood equations are not listed if they involve solving simultaneous equations. This is because these methods require sophisticated computer software to solve. Except where the maximum likelihood estimates are trivial, you should depend on a statistical software program to compute them. References are given for those who are interested.

Be aware that different sources may give formulas that are different from those shown here. In some cases, these are simply mathematically equivalent formulations. In other cases, a different parameterization may be used.

Continuous Distributions    [Thumbnail plots in the original link to:] Normal Distribution, Uniform Distribution, Cauchy Distribution, t Distribution, F Distribution, Chi-Square Distribution, Exponential Distribution, Weibull Distribution, Lognormal Distribution, Fatigue Life Distribution, Gamma Distribution, Double Exponential Distribution, Power Normal Distribution, Power Lognormal Distribution, Tukey-Lambda Distribution, Extreme Value Type I Distribution, Beta Distribution.

Discrete Distributions    [Thumbnail plots in the original link to:] Binomial Distribution, Poisson Distribution.
1.3.6.6.1. Normal Distribution


Probability Density Function    The general formula for the probability density function of the normal distribution is

    f(x) = exp( -((x - μ)/σ)² / 2 ) / ( σ √(2π) )

where μ is the location parameter and σ is the scale parameter. The case where μ = 0 and σ = 1 is called the standard normal distribution. The equation for the standard normal distribution is

    f(x) = exp( -x²/2 ) / √(2π)

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the standard normal probability density function.


Percent The formula for the percent point function of the normal distribution
Point does not exist in a simple closed formula. It is computed numerically.
Function
The following is the plot of the normal percent point function.

Cumulative The formula for the cumulative distribution function of the normal
Distribution distribution does not exist in a simple closed formula. It is computed
Function numerically.
The following is the plot of the normal cumulative distribution function.

Hazard Function    The formula for the hazard function of the normal distribution is

    h(x) = φ(x) / ( 1 - Φ(x) )

where Φ is the cumulative distribution function of the standard normal distribution and φ is the probability density function of the standard normal distribution.

The following is the plot of the normal hazard function.


Survival The normal survival function can be computed from the normal
Function cumulative distribution function.
The following is the plot of the normal survival function.

Cumulative The normal cumulative hazard function can be computed from the
Hazard normal cumulative distribution function.
Function
The following is the plot of the normal cumulative hazard function.

Inverse The normal inverse survival function can be computed from the normal
Survival percent point function.
Function
The following is the plot of the normal inverse survival function.


Theoretical Justification - Central Limit Theorem    The normal distribution is widely used. Part of the appeal is that it is well behaved and mathematically tractable. However, the central limit theorem provides a theoretical basis for why it has wide applicability.

The central limit theorem basically states that as the sample size (N) becomes large, the following occur:
1. The sampling distribution of the mean becomes approximately normal regardless of the distribution of the original variable.
2. The sampling distribution of the mean is centered at the population mean, μ, of the original variable. In addition, the standard deviation of the sampling distribution of the mean approaches σ/√N.

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the normal
distribution.
Common Statistics
    Mean                       The location parameter μ.
    Median                     The location parameter μ.
    Mode                       The location parameter μ.
    Range                      Infinity in both directions.
    Standard Deviation         The scale parameter σ.
    Coefficient of Variation   σ/μ
    Skewness                   0
    Kurtosis                   3

Parameter The location and scale parameters of the normal distribution can be
Estimation estimated with the sample mean and sample standard deviation,
respectively.
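For example (a minimal sketch under the assumption of Python with numpy/scipy; the Handbook itself illustrates computations with Dataplot):

    import numpy as np
    from scipy import stats

    x = stats.norm.rvs(loc=10.0, scale=2.0, size=500, random_state=3)

    mu_hat = np.mean(x)            # estimate of the location parameter
    sigma_hat = np.std(x, ddof=1)  # estimate of the scale parameter

    # The packaged fit returns the maximum likelihood estimates (N divisor).
    loc_mle, scale_mle = stats.norm.fit(x)
    print(mu_hat, sigma_hat, loc_mle, scale_mle)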

Comments For both theoretical and practical reasons, the normal distribution is
probably the most important distribution in statistics. For example,
● Many classical statistical tests are based on the assumption that
the data follow a normal distribution. This assumption should be
tested before applying these tests.
● In modeling applications, such as linear and non-linear regression,
the error term is often assumed to follow a normal distribution
with fixed location and scale.
● The normal distribution is used to find significance levels in many
hypothesis tests and confidence intervals.


1.3.6.6.2. Uniform Distribution


Probability Density Function    The general formula for the probability density function of the uniform distribution is

    f(x) = 1 / (B - A)     for A ≤ x ≤ B

where A is the location parameter and (B - A) is the scale parameter. The case where A = 0 and B = 1 is called the standard uniform distribution. The equation for the standard uniform distribution is

    f(x) = 1     for 0 ≤ x ≤ 1

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the uniform probability density function.

Cumulative Distribution Function    The formula for the cumulative distribution function of the uniform distribution is

    F(x) = x     for 0 ≤ x ≤ 1

The following is the plot of the uniform cumulative distribution function.


Percent The formula for the percent point function of the uniform distribution is
Point
Function
The following is the plot of the uniform percent point function.

Cumulative The formula for the cumulative hazard function of the uniform distribution is
Hazard
Function
The following is the plot of the uniform cumulative hazard function.

Hazard The formula for the hazard function of the uniform distribution is
Function

The following is the plot of the uniform hazard function.


Survival The uniform survival function can be computed from the uniform cumulative
Function distribution function.
The following is the plot of the uniform survival function.

Inverse Survival Function    The uniform inverse survival function can be computed from the uniform percent point function.

The following is the plot of the uniform inverse survival function.

Common Statistics
    Mean                       (A + B)/2
    Median                     (A + B)/2
    Range                      B - A
    Standard Deviation         (B - A)/√12
    Coefficient of Variation   (B - A) / ( √3 (A + B) )
    Skewness                   0
    Kurtosis                   9/5

Parameter Estimation    The method of moments estimators for A and B are

    Â = x̄ - √3 s
    B̂ = x̄ + √3 s

where x̄ and s are the sample mean and standard deviation. The maximum likelihood estimators for A and B are

    Â = min(Xᵢ)    (the smallest data value)
    B̂ = max(Xᵢ)    (the largest data value)
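A minimal sketch of these two sets of estimators (assuming Python with numpy, purely for illustration):

    import numpy as np

    rng = np.random.default_rng(7)
    x = rng.uniform(low=2.0, high=5.0, size=200)

    # Method of moments: match the sample mean and standard deviation.
    xbar, s = np.mean(x), np.std(x, ddof=1)
    A_mom, B_mom = xbar - np.sqrt(3.0) * s, xbar + np.sqrt(3.0) * s

    # Maximum likelihood: the sample extremes.
    A_mle, B_mle = np.min(x), np.max(x)
    print(A_mom, B_mom, A_mle, B_mle)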


Comments The uniform distribution defines equal probability over a given range for a
continuous distribution. For this reason, it is important as a reference
distribution.
One of the most important applications of the uniform distribution is in the generation of random numbers. That is, almost all random number generators generate random numbers on the (0,1) interval. For other distributions, some transformation is applied to the uniform random numbers.

Software    Most general purpose statistical software programs, including Dataplot, support at least some of the probability functions for the uniform distribution.

1.3.6.6.3. Cauchy Distribution

Probability Density Function    The general formula for the probability density function of the Cauchy distribution is

    f(x) = 1 / ( s π ( 1 + ((x - t)/s)² ) )

where t is the location parameter and s is the scale parameter. The case where t = 0 and s = 1 is called the standard Cauchy distribution. The equation for the standard Cauchy distribution reduces to

    f(x) = 1 / ( π (1 + x²) )

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the standard Cauchy probability density function.


Percent The formula for the percent point function of the Cauchy distribution is
Point
Function
The following is the plot of the Cauchy percent point function.

Cumulative The formula for the cumulative distribution function for the Cauchy
Distribution distribution is
Function

The following is the plot of the Cauchy cumulative distribution function.


Hazard The Cauchy hazard function can be computed from the Cauchy
Function probability density and cumulative distribution functions.
The following is the plot of the Cauchy hazard function.


Survival The Cauchy survival function can be computed from the Cauchy
Function cumulative distribution function.
The following is the plot of the Cauchy survival function.

Cumulative The Cauchy cumulative hazard function can be computed from the
Hazard Cauchy cumulative distribution function.
Function
The following is the plot of the Cauchy cumulative hazard function.

Inverse The Cauchy inverse survival function can be computed from the Cauchy
Survival percent point function.
Function
The following is the plot of the Cauchy inverse survival function.


Comments The Cauchy distribution is important as an example of a pathological


case. Cauchy distributions look similar to a normal distribution.
However, they have much heavier tails. When studying hypothesis tests
that assume normality, seeing how the tests perform on data from a
Cauchy distribution is a good indicator of how sensitive the tests are to
heavy-tail departures from normality. Likewise, it is a good check for
robust techniques that are designed to work well under a wide variety of
distributional assumptions.
The mean and standard deviation of the Cauchy distribution are
undefined. The practical meaning of this is that collecting 1,000 data
points gives no more accurate an estimate of the mean and standard
deviation than does a single point.

Software Many general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the Cauchy
distribution.
Common Mean The mean is undefined.
Statistics Median The location parameter t.
Mode The location parameter t.
Range Infinity in both directions.
Standard Deviation The standard deviation is undefined.
Coefficient of The coefficient of variation is undefined.
Variation
Skewness The skewness is undefined.
Kurtosis The kurtosis is undefined.

Parameter The likelihood functions for the Cauchy maximum likelihood estimates
Estimation are given in chapter 16 of Johnson, Kotz, and Balakrishnan. These
equations typically must be solved numerically on a computer.
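A minimal sketch of the numerical solution (assuming Python with scipy for illustration; the likelihood equations themselves are in the reference cited above):

    from scipy import stats

    x = stats.cauchy.rvs(loc=0.5, scale=2.0, size=300, random_state=5)
    t_hat, s_hat = stats.cauchy.fit(x)   # numerical MLE of location t and scale s
    print(t_hat, s_hat)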


1.3.6.6.4. t Distribution
Probability Density Function    The formula for the probability density function of the t distribution is

    f(x) = ( 1 + x²/ν )^( -(ν+1)/2 ) / ( √ν B(1/2, ν/2) )

where B is the beta function and ν is a positive integer shape parameter (the degrees of freedom). The formula for the beta function is

    B(α, β) = ∫₀¹ t^(α-1) (1 - t)^(β-1) dt

In a testing context, the t distribution is treated as a "standardized distribution" (i.e., no location or scale parameters). However, in a distributional modeling context (as with other probability distributions), the t distribution itself can be transformed with a location parameter, μ, and a scale parameter, σ.

The following is the plot of the t probability density function for 4 different values of the shape parameter.

These plots all have a similar shape. The difference is in the heaviness of the tails. In fact, the t distribution with ν equal to 1 is a Cauchy distribution. The t distribution approaches a normal distribution as ν becomes large. The approximation is quite good for values of ν > 30.

Cumulative Distribution Function    The formula for the cumulative distribution function of the t distribution is complicated and is not included here. It is given in the Evans, Hastings, and Peacock book.

The following are the plots of the t cumulative distribution function with the same values of ν as the pdf plots above.


Other Since the t distribution is typically used to develop hypothesis tests and
Probability confidence intervals and rarely for modeling applications, we omit the
Functions formulas and plots for the hazard, cumulative hazard, survival, and
inverse survival probability functions.

Percent Point Function    The formula for the percent point function of the t distribution does not exist in a simple closed form. It is computed numerically.

The following are the plots of the t percent point function with the same values of ν as the pdf plots above.

Common Statistics
    Mean                       0 (It is undefined for ν equal to 1.)
    Median                     0
    Mode                       0
    Range                      Infinity in both directions.
    Standard Deviation         √( ν/(ν - 2) ). It is undefined for ν equal to 1 or 2.
    Coefficient of Variation   Undefined
    Skewness                   0. It is undefined for ν less than or equal to 3. However, the t distribution is symmetric in all cases.
    Kurtosis                   3(ν - 2)/(ν - 4). It is undefined for ν less than or equal to 4.

Parameter Since the t distribution is typically used to develop hypothesis tests and
Estimation confidence intervals and rarely for modeling applications, we omit any
discussion of parameter estimation.

Comments The t distribution is used in many cases for the critical regions for
hypothesis tests and in determining confidence intervals. The most
common example is testing if data are consistent with the assumed
process mean.
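For example (a minimal sketch assuming Python with numpy/scipy for illustration), the two-sided critical value for a test of the process mean comes from the t percent point function:

    import numpy as np
    from scipy import stats

    x = np.array([4.9, 5.1, 5.0, 4.8, 5.3, 5.2, 4.7, 5.0])
    mu0 = 5.0                                   # assumed process mean
    n = len(x)
    t_stat = (np.mean(x) - mu0) / (np.std(x, ddof=1) / np.sqrt(n))
    t_crit = stats.t.ppf(0.975, df=n - 1)       # two-sided 5% critical value
    print(t_stat, t_crit, abs(t_stat) > t_crit)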

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the t distribution.


1.3.6.6.5. F Distribution
Probability Density Function    The F distribution is the ratio of two chi-square distributions with degrees of freedom ν₁ and ν₂, respectively, where each chi-square has first been divided by its degrees of freedom. The formula for the probability density function of the F distribution is

    f(x) = ( Γ((ν₁+ν₂)/2) (ν₁/ν₂)^(ν₁/2) x^(ν₁/2 - 1) ) / ( Γ(ν₁/2) Γ(ν₂/2) ( 1 + (ν₁/ν₂) x )^((ν₁+ν₂)/2) )     for x ≥ 0

where ν₁ and ν₂ are the shape parameters and Γ is the gamma function. The formula for the gamma function is

    Γ(a) = ∫₀^∞ t^(a-1) e^(-t) dt

In a testing context, the F distribution is treated as a "standardized distribution" (i.e., no location or scale parameters). However, in a distributional modeling context (as with other probability distributions), the F distribution itself can be transformed with a location parameter, μ, and a scale parameter, σ.

The following is the plot of the F probability density function for 4 different values of the shape parameters.

Cumulative Distribution Function    The formula for the cumulative distribution function of the F distribution is

    F(x) = 1 - I_k( ν₂/2, ν₁/2 )

where k = ν₂ / ( ν₂ + ν₁ x ) and I_k is the incomplete beta function. The formula for the incomplete beta function is

    I_x(α, β) = ( ∫₀^x t^(α-1) (1 - t)^(β-1) dt ) / B(α, β)

where B is the beta function.

The following is the plot of the F cumulative distribution function with the same values of ν₁ and ν₂ as the pdf plots above.


Other Since the F distribution is typically used to develop hypothesis tests and
Probability confidence intervals and rarely for modeling applications, we omit the
Functions formulas and plots for the hazard, cumulative hazard, survival, and
inverse survival probability functions.

Percent Point Function    The formula for the percent point function of the F distribution does not exist in a simple closed form. It is computed numerically.

The following is the plot of the F percent point function with the same values of ν₁ and ν₂ as the pdf plots above.

Common Statistics    The formulas below are for the case where the location parameter is zero and the scale parameter is one.
    Mean
    Mode
    Range                      0 to positive infinity
    Standard Deviation
    Coefficient of Variation
    Skewness

Parameter Since the F distribution is typically used to develop hypothesis tests and
Estimation confidence intervals and rarely for modeling applications, we omit any
discussion of parameter estimation.

Comments The F distribution is used in many cases for the critical regions for
hypothesis tests and in determining confidence intervals. Two common
examples are the analysis of variance and the F test to determine if the
variances of two populations are equal.

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the F distribution.


1.3.6.6.6. Chi-Square Distribution


Probability Density Function    The chi-square distribution results when ν independent variables with standard normal distributions are squared and summed. The formula for the probability density function of the chi-square distribution is

    f(x) = x^(ν/2 - 1) exp(-x/2) / ( 2^(ν/2) Γ(ν/2) )     for x ≥ 0

where ν is the shape parameter and Γ is the gamma function. The formula for the gamma function is

    Γ(a) = ∫₀^∞ t^(a-1) e^(-t) dt

In a testing context, the chi-square distribution is treated as a "standardized distribution" (i.e., no location or scale parameters). However, in a distributional modeling context (as with other probability distributions), the chi-square distribution itself can be transformed with a location parameter, μ, and a scale parameter, σ.

The following is the plot of the chi-square probability density function for 4 different values of the shape parameter.

Cumulative Distribution Function    The formula for the cumulative distribution function of the chi-square distribution is

    F(x) = γ(ν/2, x/2) / Γ(ν/2)     for x ≥ 0

where Γ is the gamma function defined above and γ(a, x) is the incomplete gamma function. The formula for the incomplete gamma function is

    γ(a, x) = ∫₀^x t^(a-1) e^(-t) dt

The following is the plot of the chi-square cumulative distribution function with the same values of ν as the pdf plots above.


Other Since the chi-square distribution is typically used to develop hypothesis


Probability tests and confidence intervals and rarely for modeling applications, we
Functions omit the formulas and plots for the hazard, cumulative hazard, survival,
and inverse survival probability functions.

Common Statistics
    Mean                       ν
    Median                     approximately ν - 2/3 for large ν
    Mode                       ν - 2 (for ν ≥ 2)
    Range                      0 to positive infinity
    Standard Deviation         √(2ν)
    Coefficient of Variation   √(2/ν)
    Skewness                   √(8/ν)
    Kurtosis                   3 + 12/ν

Percent Point Function    The formula for the percent point function of the chi-square distribution does not exist in a simple closed form. It is computed numerically.

The following is the plot of the chi-square percent point function with the same values of ν as the pdf plots above.

Parameter Estimation    Since the chi-square distribution is typically used to develop hypothesis tests and confidence intervals and rarely for modeling applications, we omit any discussion of parameter estimation.

Comments The chi-square distribution is used in many cases for the critical regions
for hypothesis tests and in determining confidence intervals. Two
common examples are the chi-square test for independence in an RxC
contingency table and the chi-square test to determine if the standard
deviation of a population is equal to a pre-specified value.

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the chi-square
distribution.


1.3.6.6.7. Exponential Distribution


Probability Density Function    The general formula for the probability density function of the exponential distribution is

    f(x) = (1/β) exp( -(x - μ)/β )     for x ≥ μ; β > 0

where μ is the location parameter and β is the scale parameter (the scale parameter is often referred to as λ which equals 1/β). The case where μ = 0 and β = 1 is called the standard exponential distribution. The equation for the standard exponential distribution is

    f(x) = exp(-x)     for x ≥ 0

The general form of probability functions can be expressed in terms of the standard distribution. Subsequent formulas in this section are given for the 1-parameter (i.e., with scale parameter) form of the function.

The following is the plot of the exponential probability density function.

Cumulative Distribution Function    The formula for the cumulative distribution function of the exponential distribution is

    F(x) = 1 - exp(-x/β)     for x ≥ 0; β > 0

The following is the plot of the exponential cumulative distribution function.


Percent The formula for the percent point function of the exponential
Point distribution is
Function

The following is the plot of the exponential percent point function.

Cumulative The formula for the cumulative hazard function of the exponential
Hazard distribution is
Function

The following is the plot of the exponential cumulative hazard function.


Hazard The formula for the hazard function of the exponential distribution is
Function

The following is the plot of the exponential hazard function.


Survival The formula for the survival function of the exponential distribution is
Function

The following is the plot of the exponential survival function.

Common Statistics
    Mean                       β
    Median                     β ln(2)
    Mode                       Zero
    Range                      Zero to plus infinity
    Standard Deviation         β
    Coefficient of Variation   1
    Skewness                   2
    Kurtosis                   9

Inverse Survival Function    The formula for the inverse survival function of the exponential distribution is

    Z(p) = -β ln(p)     for 0 ≤ p < 1

The following is the plot of the exponential inverse survival function.

Parameter Estimation    For the full sample case, the maximum likelihood estimator of the scale parameter is the sample mean. Maximum likelihood estimation for the exponential distribution is discussed in the chapter on reliability (Chapter 8). It is also discussed in chapter 19 of Johnson, Kotz, and Balakrishnan.
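A minimal sketch of this result (assuming Python with numpy/scipy for illustration):

    import numpy as np
    from scipy import stats

    x = stats.expon.rvs(scale=4.0, size=250, random_state=11)

    beta_hat = np.mean(x)                             # MLE of the scale parameter
    loc_fit, scale_fit = stats.expon.fit(x, floc=0)   # same estimate from a packaged fit
    print(beta_hat, scale_fit)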

Comments The exponential distribution is primarily used in reliability applications.


The exponential distribution is used to model data with a constant
failure rate (indicated by the hazard plot which is simply equal to a
constant).

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the exponential
distribution.


1.3.6.6.8. Weibull Distribution


Probability Density Function    The formula for the probability density function of the general Weibull distribution is

    f(x) = (γ/α) ( (x - μ)/α )^(γ-1) exp( -((x - μ)/α)^γ )     for x ≥ μ; γ, α > 0

where γ is the shape parameter, μ is the location parameter and α is the scale parameter. The case where μ = 0 and α = 1 is called the standard Weibull distribution. The case where μ = 0 is called the 2-parameter Weibull distribution. The equation for the standard Weibull distribution reduces to

    f(x) = γ x^(γ-1) exp( -x^γ )     for x ≥ 0; γ > 0

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the Weibull probability density function.


Percent The formula for the percent point function of the Weibull distribution is
Point
Function

The following is the plot of the Weibull percent point function with the same
values of as the pdf plots above.

Cumulative The formula for the cumulative distribution function of the Weibull distribution is
Distribution
Function

The following is the plot of the Weibull cumulative distribution function with the
same values of as the pdf plots above.

Hazard The formula for the hazard function of the Weibull distribution is
Function

The following is the plot of the Weibull hazard function with the same values of
as the pdf plots above.


Survival The formula for the survival function of the Weibull distribution is
Function

The following is the plot of the Weibull survival function with the same values of
as the pdf plots above.

Cumulative The formula for the cumulative hazard function of the Weibull distribution is
Hazard
Function
The following is the plot of the Weibull cumulative hazard function with the same
values of as the pdf plots above.

Inverse The formula for the inverse survival function of the Weibull distribution is
Survival
Function

The following is the plot of the Weibull inverse survival function with the same
values of as the pdf plots above.


Parameter Maximum likelihood estimation for the Weibull distribution is discussed in the
Estimation Reliability chapter (Chapter 8). It is also discussed in Chapter 21 of Johnson, Kotz,
and Balakrishnan.
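A minimal sketch of a numerical maximum likelihood fit of the 2-parameter Weibull model (assuming Python with scipy for illustration, in place of the Dataplot and Chapter 8 material referenced above):

    from scipy import stats

    x = stats.weibull_min.rvs(1.5, scale=10.0, size=200, random_state=21)
    gamma_hat, loc_hat, alpha_hat = stats.weibull_min.fit(x, floc=0)  # location fixed at 0
    print(gamma_hat, alpha_hat)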

Comments The Weibull distribution is used extensively in reliability applications to model


failure times.

Software Most general purpose statistical software programs, including Dataplot, support at
least some of the probability functions for the Weibull distribution.

Common The formulas below are with the location parameter equal to zero and the scale
Statistics parameter equal to one.
Mean

where is the gamma function

Median
Mode

Range Zero to positive infinity.


Standard Deviation

Coefficient of Variation


1.3.6.6.9. Lognormal Distribution


Probability Density Function    A variable X is lognormally distributed if Y = LN(X) is normally distributed with "LN" denoting the natural logarithm. The general formula for the probability density function of the lognormal distribution is

    f(x) = exp( -( ln((x - θ)/m) )² / (2σ²) ) / ( (x - θ) σ √(2π) )     for x > θ; m, σ > 0

where σ is the shape parameter, θ is the location parameter and m is the scale parameter. The case where θ = 0 and m = 1 is called the standard lognormal distribution. The case where θ equals zero is called the 2-parameter lognormal distribution.

The equation for the standard lognormal distribution is

    f(x) = exp( -( ln(x) )² / (2σ²) ) / ( x σ √(2π) )     for x > 0; σ > 0

There are several common parameterizations of the lognormal distribution. The form given here is from Evans, Hastings, and Peacock.

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the lognormal probability density function for four values of σ.

Cumulative Distribution Function    The formula for the cumulative distribution function of the lognormal distribution is

    F(x) = Φ( ln(x) / σ )     for x ≥ 0; σ > 0

where Φ is the cumulative distribution function of the normal distribution.

The following is the plot of the lognormal cumulative distribution function with the same values of σ as the pdf plots above.


Hazard The formula for the hazard function of the lognormal distribution is
Function

where is the probability density function of the normal distribution


and is the cumulative distribution function of the normal distribution.
The following is the plot of the lognormal hazard function with the same
values of as the pdf plots above.

Percent The formula for the percent point function of the lognormal distribution
Point is
Function

where is the percent point function of the normal distribution.


The following is the plot of the lognormal percent point function with
the same values of as the pdf plots above.

Cumulative The formula for the cumulative hazard function of the lognormal
Hazard distribution is
Function

where is the cumulative distribution function of the normal


distribution.
The following is the plot of the lognormal cumulative hazard function
with the same values of as the pdf plots above.


Survival The formula for the survival function of the lognormal distribution is Inverse The formula for the inverse survival function of the lognormal
Function Survival distribution is
Function

where is the cumulative distribution function of the normal where is the percent point function of the normal distribution.
distribution.
The following is the plot of the lognormal inverse survival function with
The following is the plot of the lognormal survival function with the the same values of as the pdf plots above.
same values of as the pdf plots above.


Common The formulas below are with the location parameter equal to zero and
Statistics the scale parameter equal to one.
Mean
Median Scale parameter m (= 1 if scale parameter not
specified).
Mode

Range Zero to positive infinity


Standard Deviation

Skewness

Kurtosis
Coefficient of
Variation

Parameter Estimation    The maximum likelihood estimates for the scale parameter, m, and the shape parameter, σ, are

    m̂ = exp(μ̂)

and

    σ̂ = √( Σᵢ ( ln(Xᵢ) - μ̂ )² / N )

where

    μ̂ = ( Σᵢ ln(Xᵢ) ) / N

If the location parameter is known, it can be subtracted from the original data points before computing the maximum likelihood estimates of the shape and scale parameters.
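A minimal sketch of these estimates (assuming Python with numpy/scipy for illustration):

    import numpy as np
    from scipy import stats

    x = stats.lognorm.rvs(0.5, scale=3.0, size=400, random_state=9)

    logx = np.log(x)
    m_hat = np.exp(np.mean(logx))                              # scale estimate
    sigma_hat = np.sqrt(np.mean((logx - np.mean(logx)) ** 2))  # shape estimate (N divisor)

    # Equivalent packaged fit, with the location parameter fixed at zero.
    s_fit, loc_fit, scale_fit = stats.lognorm.fit(x, floc=0)
    print(m_hat, sigma_hat, scale_fit, s_fit)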

Comments The lognormal distribution is used extensively in reliability applications


to model failure times. The lognormal and Weibull distributions are
probably the most commonly used distributions in reliability
applications.

Software Most general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the lognormal
distribution.


1.3.6.6.10. Fatigue Life Distribution


Probability Density Function    The fatigue life distribution is also commonly known as the Birnbaum-Saunders distribution. There are several alternative formulations of the fatigue life distribution in the literature.

The general formula for the probability density function of the fatigue life distribution is

    f(x) = ( (√z + √(1/z)) / (2 γ β z) ) φ( (√z - √(1/z)) / γ ),   with z = (x - μ)/β,   for x > μ; γ, β > 0

where γ is the shape parameter, μ is the location parameter, β is the scale parameter, φ is the probability density function of the standard normal distribution, and Φ is the cumulative distribution function of the standard normal distribution. The case where μ = 0 and β = 1 is called the standard fatigue life distribution. The equation for the standard fatigue life distribution reduces to

    f(x) = ( (√x + √(1/x)) / (2 γ x) ) φ( (√x - √(1/x)) / γ )     for x > 0; γ > 0

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the fatigue life probability density function.

Cumulative Distribution Function    The formula for the cumulative distribution function of the fatigue life distribution is

    F(x) = Φ( (√x - √(1/x)) / γ )     for x > 0; γ > 0

where Φ is the cumulative distribution function of the standard normal distribution. The following is the plot of the fatigue life cumulative distribution function with the same values of γ as the pdf plots above.


Hazard The fatigue life hazard function can be computed from the fatigue life probability
Function density and cumulative distribution functions.
The following is the plot of the fatigue life hazard function with the same values
of as the pdf plots above.

Percent The formula for the percent point function of the fatigue life distribution is
Point
Function

where is the percent point function of the standard normal distribution. The
following is the plot of the fatigue life percent point function with the same Cumulative The fatigue life cumulative hazard function can be computed from the fatigue life
values of as the pdf plots above. Hazard cumulative distribution function.
Function
The following is the plot of the fatigue cumulative hazard function with the same
values of as the pdf plots above.


Inverse The fatigue life inverse survival function can be computed from the fatigue life
Survival percent point function.
Function
The following is the plot of the fatigue life inverse survival function with the same values of γ as the pdf plots above.

Survival The fatigue life survival function can be computed from the fatigue life
Function cumulative distribution function.
The following is the plot of the fatigue survival function with the same values of
as the pdf plots above.

Common The formulas below are with the location parameter equal to zero and the scale
Statistics parameter equal to one.
Mean

Range Zero to positive infinity.


Standard Deviation

Coefficient of Variation

Parameter Maximum likelihood estimation for the fatigue life distribution is discussed in the
Estimation Reliability chapter.
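A minimal sketch of such a fit (assuming Python with scipy for illustration; scipy calls this distribution "fatiguelife"):

    from scipy import stats

    x = stats.fatiguelife.rvs(1.2, scale=100.0, size=150, random_state=13)
    gamma_hat, loc_hat, beta_hat = stats.fatiguelife.fit(x, floc=0)  # 2-parameter fit
    print(gamma_hat, beta_hat)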

Comments The fatigue life distribution is used extensively in reliability applications to model
failure times.

Software    Some general purpose statistical software programs, including Dataplot, support at least some of the probability functions for the fatigue life distribution. Support for this distribution is likely to be available for statistical programs that emphasize reliability applications.

1.3.6.6.11. Gamma Distribution


Probability Density Function    The general formula for the probability density function of the gamma distribution is

    f(x) = ( ((x - μ)/β)^(γ-1) exp( -(x - μ)/β ) ) / ( β Γ(γ) )     for x ≥ μ; γ, β > 0

where γ is the shape parameter, μ is the location parameter, β is the scale parameter, and Γ is the gamma function which has the formula

    Γ(a) = ∫₀^∞ t^(a-1) e^(-t) dt

The case where μ = 0 and β = 1 is called the standard gamma distribution. The equation for the standard gamma distribution reduces to

    f(x) = x^(γ-1) e^(-x) / Γ(γ)     for x ≥ 0; γ > 0

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the gamma probability density function.


Cumulative The formula for the cumulative distribution function of the gamma Percent The formula for the percent point function of the gamma distribution
Distribution distribution is Point does not exist in a simple closed form. It is computed numerically.
Function Function
The following is the plot of the gamma percent point function with the
same values of as the pdf plots above.

where is the gamma function defined above and is the


incomplete gamma function. The incomplete gamma function has the
formula

The following is the plot of the gamma cumulative distribution function


with the same values of as the pdf plots above.


Hazard The formula for the hazard function of the gamma distribution is
Function

The following is the plot of the gamma hazard function with the same
values of as the pdf plots above.

Survival The formula for the survival function of the gamma distribution is
Function

where is the gamma function defined above and is the


incomplete gamma function defined above.
The following is the plot of the gamma survival function with the same
Cumulative The formula for the cumulative hazard function of the gamma values of as the pdf plots above.
Hazard distribution is
Function

where is the gamma function defined above and is the


incomplete gamma function defined above.
The following is the plot of the gamma cumulative hazard function with
the same values of as the pdf plots above.


Common Statistics    The formulas below are with the location parameter equal to zero and the scale parameter equal to one.
    Mean                       γ
    Mode                       γ - 1 (for γ ≥ 1)
    Range                      Zero to positive infinity.
    Standard Deviation         √γ
    Skewness                   2/√γ
    Kurtosis                   3 + 6/γ
    Coefficient of Variation   1/√γ

Inverse Survival Function    The gamma inverse survival function does not exist in simple closed form. It is computed numerically.

The following is the plot of the gamma inverse survival function with the same values of γ as the pdf plots above.

Parameter Estimation    The method of moments estimators of the gamma distribution are

    γ̂ = ( x̄ / s )²
    β̂ = s² / x̄

where x̄ and s are the sample mean and standard deviation, respectively. The equations for the maximum likelihood estimation of the shape and scale parameters are given in Chapter 18 of Evans, Hastings, and Peacock and Chapter 17 of Johnson, Kotz, and Balakrishnan. These equations need to be solved numerically; this is typically accomplished by using statistical software packages.
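A minimal sketch comparing the method of moments estimates above with a packaged numerical maximum likelihood fit (assuming Python with numpy/scipy for illustration):

    import numpy as np
    from scipy import stats

    x = stats.gamma.rvs(3.0, scale=2.0, size=300, random_state=17)

    xbar, s = np.mean(x), np.std(x, ddof=1)
    gamma_mom = (xbar / s) ** 2          # shape estimate
    beta_mom = s ** 2 / xbar             # scale estimate

    gamma_mle, loc_mle, beta_mle = stats.gamma.fit(x, floc=0)
    print(gamma_mom, beta_mom, gamma_mle, beta_mle)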

Software Some general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the gamma
distribution.


1.3.6.6.12. Double Exponential Distribution


Probability Density Function    The general formula for the probability density function of the double exponential distribution is

    f(x) = exp( -|x - μ|/β ) / (2β)

where μ is the location parameter and β is the scale parameter. The case where μ = 0 and β = 1 is called the standard double exponential distribution. The equation for the standard double exponential distribution is

    f(x) = exp( -|x| ) / 2

Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the double exponential probability density function.

Cumulative Distribution Function    The formula for the cumulative distribution function of the double exponential distribution is

    F(x) = exp(x)/2            for x < 0
    F(x) = 1 - exp(-x)/2       for x ≥ 0

The following is the plot of the double exponential cumulative distribution function.


Percent The formula for the percent point function of the double exponential
Point distribution is
Function

The following is the plot of the double exponential percent point


function.

Cumulative The formula for the cumulative hazard function of the double
Hazard exponential distribution is
Function

The following is the plot of the double exponential cumulative hazard


function.

Hazard The formula for the hazard function of the double exponential
Function distribution is

The following is the plot of the double exponential hazard function.


Survival The double exponential survival function can be computed from the
Function cumulative distribution function of the double exponential distribution.
The following is the plot of the double exponential survival function.

Inverse Survival Function    The formula for the inverse survival function of the double exponential distribution is

    Z(p) = -ln(2p)             for p ≤ 0.5
    Z(p) = ln( 2(1 - p) )      for p > 0.5

The following is the plot of the double exponential inverse survival function.

Common Statistics
    Mean                       The location parameter μ.
    Median                     The location parameter μ.
    Mode                       The location parameter μ.
    Range                      Negative infinity to positive infinity
    Standard Deviation         √2 β
    Skewness                   0
    Kurtosis                   6
    Coefficient of Variation   √2 β / μ

Parameter Estimation    The maximum likelihood estimators of the location and scale parameters of the double exponential distribution are

    μ̂ = x̃
    β̂ = ( Σᵢ |Xᵢ - x̃| ) / N

where x̃ is the sample median.
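A minimal sketch of these estimators (assuming Python with numpy/scipy for illustration):

    import numpy as np
    from scipy import stats

    x = stats.laplace.rvs(loc=2.0, scale=1.5, size=300, random_state=19)

    mu_hat = np.median(x)                      # location: sample median
    beta_hat = np.mean(np.abs(x - mu_hat))     # scale: mean absolute deviation from the median

    loc_fit, scale_fit = stats.laplace.fit(x)  # packaged MLE for comparison
    print(mu_hat, beta_hat, loc_fit, scale_fit)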

Software Some general purpose statistical software programs, including Dataplot,


support at least some of the probability functions for the double
exponential distribution.


1.3.6.6.13. Power Normal Distribution


Probability Density Function    The formula for the probability density function of the standard form of the power normal distribution is

    f(x; p) = p φ(x) ( Φ(-x) )^(p-1)

where p is the shape parameter (also referred to as the power parameter), Φ is the cumulative distribution function of the standard normal distribution, and φ is the probability density function of the standard normal distribution.

As with other probability distributions, the power normal distribution can be transformed with a location parameter, μ, and a scale parameter, σ. We omit the equation for the general form of the power normal distribution. Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the power normal probability density function with four values of p.
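Because built-in support for this distribution is uncommon, the standard-form functions can be evaluated directly from the standard normal pdf and cdf. A minimal sketch follows (assuming Python with numpy/scipy for illustration; the cdf used here, F(x; p) = 1 - (Φ(-x))^p, follows from the density above):

    import numpy as np
    from scipy import stats

    def power_normal_pdf(x, p):
        return p * stats.norm.pdf(x) * stats.norm.cdf(-x) ** (p - 1)

    def power_normal_cdf(x, p):
        return 1.0 - stats.norm.cdf(-x) ** p

    x = np.linspace(-4, 4, 9)
    print(power_normal_pdf(x, 2.0))
    print(power_normal_cdf(x, 2.0))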


Cumulative The formula for the cumulative distribution function of the power Percent The formula for the percent point function of the power normal
Distribution normal distribution is Point distribution is
Function Function

where is the cumulative distribution function of the standard normal where is the percent point function of the standard normal
distribution. distribution.
The following is the plot of the power normal cumulative distribution The following is the plot of the power normal percent point function
function with the same values of p as the pdf plots above. with the same values of p as the pdf plots above.


Cumulative The formula for the cumulative hazard function of the power normal
Hazard distribution is
Function

The following is the plot of the power normal cumulative hazard


function with the same values of p as the pdf plots above.

Hazard The formula for the hazard function of the power normal distribution is
Function

The following is the plot of the power normal hazard function with the
same values of p as the pdf plots above.

Survival The formula for the survival function of the power normal distribution is
Function

The following is the plot of the power normal survival function with the
same values of p as the pdf plots above.


Common The statistics for the power normal distribution are complicated and
Statistics require tables. Nelson discusses the mean, median, mode, and standard
deviation of the power normal distribution and provides references to
the appropriate tables.

Software Most general purpose statistical software programs do not support the
probability functions for the power normal distribution. Dataplot does
support them.

Inverse The formula for the inverse survival function of the power normal
Survival distribution is
Function

The following is the plot of the power normal inverse survival function
with the same values of p as the pdf plots above.


1.3.6.6.14. Power Lognormal Distribution


Probability Density Function    The formula for the probability density function of the standard form of the power lognormal distribution is

    f(x; p, σ) = ( p / (x σ) ) φ( ln(x)/σ ) ( Φ( -ln(x)/σ ) )^(p-1)     for x > 0; p, σ > 0

where p (also referred to as the power parameter) and σ are the shape parameters, Φ is the cumulative distribution function of the standard normal distribution, and φ is the probability density function of the standard normal distribution.

As with other probability distributions, the power lognormal distribution can be transformed with a location parameter, θ, and a scale parameter, B. We omit the equation for the general form of the power lognormal distribution. Since the general form of probability functions can be expressed in terms of the standard distribution, all subsequent formulas in this section are given for the standard form of the function.

The following is the plot of the power lognormal probability density function with four values of p and σ set to 1.

Cumulative Distribution Function    The formula for the cumulative distribution function of the power lognormal distribution is

    F(x; p, σ) = 1 - ( Φ( -ln(x)/σ ) )^p     for x > 0; p, σ > 0

where Φ is the cumulative distribution function of the standard normal distribution. The following is the plot of the power lognormal cumulative distribution function with the same values of p as the pdf plots above.


Hazard The formula for the hazard function of the power lognormal distribution is
Function
    h(x; p, σ) = (p/(xσ)) φ(ln(x)/σ) / Φ(-ln(x)/σ),    x > 0; p, σ > 0
where Φ is the cumulative distribution function of the standard normal distribution,
and φ is the probability density function of the standard normal distribution.
Note that this is simply a multiple (p) of the lognormal hazard function.
The following is the plot of the power lognormal hazard function with the same
values of p as the pdf plots above.

Percent The formula for the percent point function of the power lognormal distribution is
Point
Function
    G(f; p, σ) = exp(-σ Φ⁻¹((1 - f)^(1/p))),    0 < f < 1
where Φ⁻¹ is the percent point function of the standard normal distribution.
The following is the plot of the power lognormal percent point function with the
same values of p as the pdf plots above.

Cumulative The formula for the cumulative hazard function of the power lognormal
Hazard distribution is
Function
    H(x; p, σ) = -p ln(Φ(-ln(x)/σ)),    x > 0; p, σ > 0
The following is the plot of the power lognormal cumulative hazard function with
the same values of p as the pdf plots above.


Survival The formula for the survival function of the power lognormal distribution is
Function
    S(x; p, σ) = (Φ(-ln(x)/σ))^p,    x > 0; p, σ > 0
The following is the plot of the power lognormal survival function with the same
values of p as the pdf plots above.

Inverse The formula for the inverse survival function of the power lognormal distribution is
Survival
Function
    Z(f; p, σ) = exp(-σ Φ⁻¹(f^(1/p))),    0 < f < 1
The following is the plot of the power lognormal inverse survival function with the
same values of p as the pdf plots above.

Common The statistics for the power lognormal distribution are complicated and require
Statistics tables. Nelson discusses the mean, median, mode, and standard deviation of the
power lognormal distribution and provides references to the appropriate tables.

Parameter Nelson discusses maximum likelihood estimation for the power lognormal
Estimation distribution. These estimates need to be performed with computer software.
Software for maximum likelihood estimation of the parameters of the power
lognormal distribution is not as readily available as for other reliability
distributions such as the exponential, Weibull, and lognormal.

Software Most general purpose statistical software programs do not support the probability
functions for the power lognormal distribution. Dataplot does support them.
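As a minimal sketch (assuming Python with scipy rather than Dataplot), scipy.stats.powerlognorm provides the power lognormal probability functions; its shape parameters c and s correspond to p and σ above.

    # Sketch: evaluate the power lognormal probability functions with scipy.
    import numpy as np
    from scipy import stats
    from scipy.stats import norm

    p, sigma = 2.0, 1.0
    x = np.linspace(0.1, 5.0, 6)
    dist = stats.powerlognorm(p, sigma)     # shape parameters c = p, s = sigma

    print(dist.pdf(x))                      # probability density function
    print(dist.cdf(x))                      # cumulative distribution function

    # Cross-check the CDF against F(x) = 1 - (Phi(-ln(x)/sigma))^p
    print(np.allclose(dist.cdf(x), 1.0 - norm.cdf(-np.log(x) / sigma) ** p))   # True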


1.3.6.6.15. Tukey-Lambda Distribution

Probability The Tukey-Lambda density function does not have a simple, closed
Density form. It is computed numerically.
Function
The Tukey-Lambda distribution has the shape parameter λ. As with
other probability distributions, the Tukey-Lambda distribution can be
transformed with a location parameter and a scale parameter.
Since the general form of probability functions can be expressed in
terms of the standard distribution, all subsequent formulas in this section
are given for the standard form of the function.
The following is the plot of the Tukey-Lambda probability density
function for four values of λ.

Cumulative The Tukey-Lambda distribution does not have a simple, closed form. It
Distribution is computed numerically.
Function
The following is the plot of the Tukey-Lambda cumulative distribution
function with the same values of λ as the pdf plots above.

Percent The formula for the percent point function of the standard form of the
Point Tukey-Lambda distribution is
Function
    G(p; λ) = (p^λ - (1 - p)^λ) / λ
The following is the plot of the Tukey-Lambda percent point function
with the same values of λ as the pdf plots above.


As the Tukey-Lambda distribution is a symmetric distribution, the use
of the Tukey-Lambda PPCC plot to determine a reasonable distribution
to model the data only applies to symmetric distributions. A histogram
of the data should provide evidence as to whether the data can be
reasonably modeled with a symmetric distribution.

Software Most general purpose statistical software programs do not support the
probability functions for the Tukey-Lambda distribution. Dataplot does
support them.

Other The Tukey-Lambda distribution is typically used to identify an
Probability appropriate distribution (see the comments below) and not used in
Functions statistical models directly. For this reason, we omit the formulas and
plots for the hazard, cumulative hazard, survival, and inverse survival
functions. We also omit the common statistics and parameter estimation
sections.

Comments The Tukey-Lambda distribution is actually a family of distributions that
can approximate a number of common distributions. For example,
    λ = -1      approximately Cauchy
    λ = 0       exactly logistic
    λ = 0.14    approximately normal
    λ = 0.5     U-shaped
    λ = 1       exactly uniform (from -1 to +1)
The most common use of this distribution is to generate a
Tukey-Lambda PPCC plot of a data set. Based on the PPCC plot, an
appropriate model for the data is suggested. For example, if the
maximum correlation occurs for a value of λ at or near 0.14, then the
data can be modeled with a normal distribution. Values of λ less than
this imply a heavy-tailed distribution (with -1 approximating a Cauchy).
That is, as the optimal value of λ goes from 0.14 to -1, increasingly
heavy tails are implied. Similarly, as the optimal value of λ becomes
greater than 0.14, shorter tails are implied.
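A minimal sketch of the PPCC-plot idea just described, assuming Python with scipy rather than Dataplot: scipy.stats.ppcc_max searches for the λ value that maximizes the probability plot correlation coefficient, and scipy.stats.ppcc_plot returns the whole PPCC curve.

    # Sketch: suggest a distribution for a data set via the Tukey-Lambda PPCC plot.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(12345)
    data = rng.normal(size=500)            # hypothetical data set

    # lambda maximizing the correlation on a Tukey-Lambda probability plot
    lam_best = stats.ppcc_max(data, brack=(-1.0, 1.0))
    print(lam_best)                        # near 0.14 suggests a normal distribution

    # full PPCC curve over a grid of lambda values
    lams, ppcc = stats.ppcc_plot(data, -1.0, 1.0, N=80)
    print(lams[np.argmax(ppcc)])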


1.3.6.6.16. Extreme Value Type I Distribution
Probability The extreme value type I distribution has two forms. One is based on the
Density smallest extreme and the other is based on the largest extreme. We call
Function these the minimum and maximum cases, respectively. Formulas and
plots for both cases are given. The extreme value type I distribution is
also referred to as the Gumbel distribution.
The general formula for the probability density function of the Gumbel
(minimum) distribution is
    f(x) = (1/β) e^((x - μ)/β) exp(-e^((x - μ)/β))
where μ is the location parameter and β is the scale parameter. The
case where μ = 0 and β = 1 is called the standard Gumbel (minimum)
distribution. The equation for the standard Gumbel distribution
(minimum) reduces to
    f(x) = e^x exp(-e^x)
The following is the plot of the Gumbel probability density function for
the minimum case.
The general formula for the probability density function of the Gumbel
(maximum) distribution is
    f(x) = (1/β) e^(-(x - μ)/β) exp(-e^(-(x - μ)/β))
where μ is the location parameter and β is the scale parameter. The
case where μ = 0 and β = 1 is called the standard Gumbel (maximum)
distribution. The equation for the standard Gumbel distribution
(maximum) reduces to
    f(x) = e^(-x) exp(-e^(-x))
The following is the plot of the Gumbel probability density function for
the maximum case.


Since the general form of probability functions can be expressed in
terms of the standard distribution, all subsequent formulas in this section
are given for the standard form of the function.

Cumulative The formula for the cumulative distribution function of the Gumbel
Distribution distribution (minimum) is
Function
    F(x) = 1 - exp(-e^x)
The following is the plot of the Gumbel cumulative distribution function
for the minimum case.
The formula for the cumulative distribution function of the Gumbel
distribution (maximum) is
    F(x) = exp(-e^(-x))
The following is the plot of the Gumbel cumulative distribution function
for the maximum case.


Percent The formula for the percent point function of the Gumbel distribution
Point (minimum) is
Function
    G(p) = ln(-ln(1 - p))
The following is the plot of the Gumbel percent point function for the
minimum case.
The formula for the percent point function of the Gumbel distribution
(maximum) is
    G(p) = -ln(-ln(p))
The following is the plot of the Gumbel percent point function for the
maximum case.

Hazard The formula for the hazard function of the Gumbel distribution
Function (minimum) is
    h(x) = e^x
The following is the plot of the Gumbel hazard function for the
minimum case.
The formula for the hazard function of the Gumbel distribution
(maximum) is
    h(x) = e^(-x) exp(-e^(-x)) / (1 - exp(-e^(-x)))
The following is the plot of the Gumbel hazard function for the
maximum case.

Cumulative The formula for the cumulative hazard function of the Gumbel
Hazard distribution (minimum) is
Function
    H(x) = e^x
The following is the plot of the Gumbel cumulative hazard function for
the minimum case.
The formula for the cumulative hazard function of the Gumbel
distribution (maximum) is
    H(x) = -ln(1 - exp(-e^(-x)))
The following is the plot of the Gumbel cumulative hazard function for
the maximum case.


Survival The formula for the survival function of the Gumbel distribution
Function (minimum) is
    S(x) = exp(-e^x)
The following is the plot of the Gumbel survival function for the
minimum case.
The formula for the survival function of the Gumbel distribution
(maximum) is
    S(x) = 1 - exp(-e^(-x))
The following is the plot of the Gumbel survival function for the
maximum case.

Inverse The formula for the inverse survival function of the Gumbel distribution
Survival (minimum) is
Function
    Z(p) = ln(-ln(p))
The following is the plot of the Gumbel inverse survival function for the
minimum case.


The formula for the inverse survival function of the Gumbel distribution
(maximum) is
    Z(p) = -ln(-ln(1 - p))
The following is the plot of the Gumbel inverse survival function for the
maximum case.

Parameter The method of moments estimators of the Gumbel (maximum)
Estimation distribution are
    β̂ = (s √6) / π
    μ̂ = x̄ - 0.5772 β̂
where x̄ and s are the sample mean and standard deviation,
respectively.
The equations for the maximum likelihood estimation of the location and
scale parameters are discussed in Chapter 15 of Evans, Hastings, and
Peacock and Chapter 22 of Johnson, Kotz, and Balakrishnan. These
equations need to be solved numerically and this is typically
accomplished by using statistical software packages.
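A minimal sketch of both estimation approaches for the maximum case, assuming Python with scipy/numpy rather than the packages referred to above; the data are simulated for illustration.

    # Sketch: method of moments and maximum likelihood fits for the Gumbel (maximum).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    x = stats.gumbel_r.rvs(loc=10.0, scale=2.0, size=1000, random_state=rng)

    # Method of moments: beta-hat = s*sqrt(6)/pi, mu-hat = xbar - 0.5772*beta-hat
    s = x.std(ddof=1)
    beta_mom = s * np.sqrt(6.0) / np.pi
    mu_mom = x.mean() - 0.5772 * beta_mom
    print(mu_mom, beta_mom)

    # Maximum likelihood: the numerical solution is handled by gumbel_r.fit
    mu_mle, beta_mle = stats.gumbel_r.fit(x)
    print(mu_mle, beta_mle)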

Software Some general purpose statistical software programs, including Dataplot,
support at least some of the probability functions for the extreme value
type I distribution.

Common The formulas below are for the maximum order statistic case.
Statistics
    Mean                        μ + 0.5772β
                                (The constant 0.5772 is the Euler-Mascheroni constant.)
    Median                      μ - β ln(ln(2))
    Mode                        μ
    Range                       Negative infinity to positive infinity.
    Standard Deviation          βπ/√6
    Skewness                    1.13955
    Kurtosis                    5.4
    Coefficient of Variation    (βπ/√6) / (μ + 0.5772β)


1.3.6.6.17. Beta Distribution


Probability The general formula for the probability density function of the beta distribution is
Density
Function
    f(x) = ((x - a)^(p-1) (b - x)^(q-1)) / (B(p,q) (b - a)^(p+q-1)),    a ≤ x ≤ b; p, q > 0
where p and q are the shape parameters, a and b are the lower and upper bounds,
respectively, of the distribution, and B(p,q) is the beta function. The beta function has
the formula
    B(α, β) = ∫ from 0 to 1 of t^(α-1) (1 - t)^(β-1) dt
The case where a = 0 and b = 1 is called the standard beta distribution. The equation
for the standard beta distribution is
    f(x) = (x^(p-1) (1 - x)^(q-1)) / B(p,q),    0 ≤ x ≤ 1; p, q > 0
Typically we define the general form of a distribution in terms of location and scale
parameters. The beta is different in that we define the general distribution in terms of
the lower and upper bounds. However, the location and scale parameters can be
defined in terms of the lower and upper limits as follows:
    location = a
    scale = b - a
Since the general form of probability functions can be expressed in terms of the
standard distribution, all subsequent formulas in this section are given for the standard
form of the function.
The following is the plot of the beta probability density function for four different
values of the shape parameters.

Cumulative The formula for the cumulative distribution function of the beta distribution is also
Distribution called the incomplete beta function ratio (commonly denoted by Ix) and is defined as
Function
    F(x) = Ix(p, q) = (∫ from 0 to x of t^(p-1) (1 - t)^(q-1) dt) / B(p, q),    0 ≤ x ≤ 1; p, q > 0
where B is the beta function defined above.
The following is the plot of the beta cumulative distribution function with the same
values of the shape parameters as the pdf plots above.


Percent The formula for the percent point function of the beta distribution does not exist in a
Point simple closed form. It is computed numerically.
Function
The following is the plot of the beta percent point function with the same values of the
shape parameters as the pdf plots above.

Other Since the beta distribution is not typically used for reliability applications, we omit the
Probability formulas and plots for the hazard, cumulative hazard, survival, and inverse survival
Functions probability functions.

Common The formulas below are for the case where the lower limit is zero and the upper limit is
Statistics one.
    Mean                        p / (p + q)
    Mode                        (p - 1) / (p + q - 2)
    Range                       0 to 1
    Standard Deviation          √(pq / ((p + q)²(p + q + 1)))
    Coefficient of Variation    √(q / (p(p + q + 1)))
    Skewness                    2(q - p)√(p + q + 1) / ((p + q + 2)√(pq))

Parameter First consider the case where a and b are assumed to be known. For this case, the
Estimation method of moments estimates are
    p̂ = x̄ (x̄(1 - x̄)/s² - 1)
    q̂ = (1 - x̄)(x̄(1 - x̄)/s² - 1)
where x̄ is the sample mean and s² is the sample variance. If a and b are not 0 and 1,
respectively, then replace x̄ with (x̄ - a)/(b - a) and s² with s²/(b - a)² in the above
equations.
For the case when a and b are known, the maximum likelihood estimates can be
obtained by solving the following set of equations
    ψ(p̂) - ψ(p̂ + q̂) = (1/n) Σ ln(xᵢ)
    ψ(q̂) - ψ(p̂ + q̂) = (1/n) Σ ln(1 - xᵢ)
where ψ is the digamma function.
The maximum likelihood equations for the case when a and b are not known are given
in pages 221-235 of Volume II of Johnson, Kotz, and Balakrishnan.

Software Most general purpose statistical software programs, including Dataplot, support at
least some of the probability functions for the beta distribution.
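A minimal sketch of the method of moments estimates above for the standard beta (a = 0, b = 1), assuming Python with scipy/numpy; the data are simulated for illustration.

    # Sketch: method of moments estimates for the standard beta distribution.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(7)
    x = stats.beta.rvs(2.0, 5.0, size=2000, random_state=rng)   # simulated data

    xbar, s2 = x.mean(), x.var(ddof=1)
    common = xbar * (1.0 - xbar) / s2 - 1.0
    p_hat = xbar * common
    q_hat = (1.0 - xbar) * common
    print(p_hat, q_hat)              # should be near the true shapes 2 and 5

    # The probability functions themselves (standard form):
    print(stats.beta.pdf(0.3, p_hat, q_hat))
    print(stats.beta.cdf(0.3, p_hat, q_hat))   # the incomplete beta function ratio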


1.3.6.6.18. Binomial Distribution


Probability The binomial distribution is used when there are exactly two mutually
Mass exclusive outcomes of a trial. These outcomes are appropriately labeled
Function "success" and "failure". The binomial distribution is used to obtain the
probability of observing x successes in N trials, with the probability of success
on a single trial denoted by p. The binomial distribution assumes that p is fixed
for all trials.
The formula for the binomial probability mass function is
    P(x; p, n) = C(n, x) p^x (1 - p)^(n - x)    for x = 0, 1, 2, ..., n
where
    C(n, x) = n! / (x!(n - x)!)
The following is the plot of the binomial probability mass function for four
values of p and n = 100.

Cumulative The formula for the binomial cumulative probability function is
Distribution
Function
    F(x; p, n) = Σ from i = 0 to x of C(n, i) p^i (1 - p)^(n - i)
The following is the plot of the binomial cumulative distribution function with
the same values of p as the pdf plots above.


Percent The binomial percent point function does not exist in simple closed form. It is
Point computed numerically. Note that because this is a discrete distribution that is
Function only defined for integer values of x, the percent point function is not smooth in
the way the percent point function typically is for a continuous distribution.
The following is the plot of the binomial percent point function with the same
values of p as the pdf plots above.

Common Mean                        np
Statistics Mode                    p(n + 1), rounded down to the nearest integer
    Range                          0 to N
    Standard Deviation             √(np(1 - p))
    Coefficient of Variation       √((1 - p)/(np))
    Skewness                       (1 - 2p) / √(np(1 - p))
    Kurtosis                       3 - 6/n + 1/(np(1 - p))

Comments The binomial distribution is probably the most commonly used discrete
distribution.

Parameter The maximum likelihood estimator of p (n is fixed) is the observed proportion
Estimation of successes,
    p̂ = x/n

Software Most general purpose statistical software programs, including Dataplot, support
at least some of the probability functions for the binomial distribution.
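A minimal sketch of the binomial probability functions and the estimate of p described above, assuming Python with scipy (the observed count is hypothetical):

    # Sketch: binomial probability functions and the estimate of p.
    from scipy import stats

    n, p = 100, 0.25
    dist = stats.binom(n, p)

    print(dist.pmf(20))       # probability of exactly 20 successes in 100 trials
    print(dist.cdf(20))       # probability of 20 or fewer successes
    print(dist.ppf(0.95))     # percent point function (a step function)

    # Estimating p from an observed count (n fixed): the sample proportion
    x_observed = 31
    print(x_observed / n)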


1.3.6.6.19. Poisson Distribution

Probability The Poisson distribution is used to model the number of events
Mass occurring within a given time interval.
Function
The formula for the Poisson probability mass function is
    P(x; λ) = e^(-λ) λ^x / x!    for x = 0, 1, 2, ...
λ is the shape parameter which indicates the average number of events
in the given time interval.
The following is the plot of the Poisson probability mass function for
four values of λ.

Cumulative The formula for the Poisson cumulative probability function is
Distribution
Function
    F(x; λ) = Σ from i = 0 to x of e^(-λ) λ^i / i!
The following is the plot of the Poisson cumulative distribution function
with the same values of λ as the pdf plots above.

Percent The Poisson percent point function does not exist in simple closed form.
Point It is computed numerically. Note that because this is a discrete
Function distribution that is only defined for integer values of x, the percent point
function is not smooth in the way the percent point function typically is
for a continuous distribution.
The following is the plot of the Poisson percent point function with the
same values of λ as the pdf plots above.


Common Mean                        λ
Statistics Mode                    For non-integer λ, it is the largest integer less
                                   than λ. For integer λ, x = λ and x = λ - 1 are
                                   both the mode.
    Range                          0 to positive infinity
    Standard Deviation             √λ
    Coefficient of Variation       1/√λ
    Skewness                       1/√λ
    Kurtosis                       3 + 1/λ

Parameter The maximum likelihood estimator of λ is
Estimation
    λ̂ = x̄
where x̄ is the sample mean.

Software Most general purpose statistical software programs, including Dataplot,
support at least some of the probability functions for the Poisson
distribution.
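A minimal sketch of the Poisson probability functions and the maximum likelihood estimate of λ (the sample mean), assuming Python with scipy/numpy; the counts are simulated for illustration.

    # Sketch: Poisson probability functions and the MLE of lambda.
    import numpy as np
    from scipy import stats

    lam = 4.0
    dist = stats.poisson(lam)
    print(dist.pmf(2))        # probability of exactly 2 events
    print(dist.cdf(6))        # probability of 6 or fewer events
    print(dist.ppf(0.95))     # percent point function (a step function)

    rng = np.random.default_rng(3)
    counts = dist.rvs(size=500, random_state=rng)
    print(counts.mean())      # lambda-hat, the sample mean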


1.3.6.7. Tables for Probability Distributions

Tables Several commonly used tables for probability distributions can be
referenced below.
The values from these tables can also be obtained from most general
purpose statistical software programs. Most introductory statistics
textbooks (e.g., Snedecor and Cochran) contain more extensive tables
than are included here. These tables are included for convenience.
1. Cumulative distribution function for the standard normal
distribution
2. Upper critical values of Student's t-distribution with ν degrees of
freedom
3. Upper critical values of the F-distribution with ν1 and ν2 degrees
of freedom
4. Upper critical values of the chi-square distribution with ν degrees
of freedom
5. Critical values of t* distribution for testing the output of a linear
calibration line at 3 points
6. Upper critical values of the normal PPCC distribution

1.3.6.7.1. Cumulative Distribution Function of the Standard Normal
Distribution

How to Use The table below contains the area under the standard normal curve from
This Table 0 to z. This can be used to compute the cumulative distribution function
values for the standard normal distribution.
The table utilizes the symmetry of the normal distribution, so what in
fact is given is
    P[0 ≤ x ≤ |a|]
where a is the value of interest. This is demonstrated in the graph below
for a = 0.5. The shaded area of the curve represents the probability that x
is between 0 and a.

This can be clarified by a few simple examples.


1. What is the probability that x is less than or equal to 1.53? Look
for 1.5 in the X column, go right to the 0.03 column to find the
value 0.43699. Now add 0.5 (for the probability less than zero) to
obtain the final result of 0.93699.
2. What is the probability that x is less than or equal to -1.53? For
negative values, use the relationship
    P[x ≤ -a] = 1 - P[x ≤ a]
From the first example, this gives 1 - 0.93699 = 0.06301.
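The same results can be obtained directly from the cumulative distribution function rather than from the printed table; a minimal sketch, assuming Python with scipy:

    # Sketch: reproduce the worked examples with the standard normal CDF.
    from scipy.stats import norm

    print(norm.cdf(1.53))          # P[x <= 1.53]  = 0.93699...
    print(norm.cdf(-1.53))         # P[x <= -1.53] = 1 - 0.93699 = 0.06301...
    print(norm.cdf(1.53) - 0.5)    # table entry: area from 0 to 1.53 = 0.43699...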

3. What is the probability that x is between -1 and 0.5? Look up the 1.5 0.43319 0.43448 0.43574 0.43699 0.43822 0.43943 0.44062 0.44179 0.44295
values for 0.5 (0.5 + 0.19146 = 0.69146) and -1 (1 - (0.5 + 0.44408
0.34134) = 0.15866). Then subtract the results (0.69146 - 1.6 0.44520 0.44630 0.44738 0.44845 0.44950 0.45053 0.45154 0.45254 0.45352
0.15866) to obtain the result 0.5328. 0.45449
To use this table with a non-standard normal distribution (either the 1.7 0.45543 0.45637 0.45728 0.45818 0.45907 0.45994 0.46080 0.46164 0.46246
location parameter is not 0 or the scale parameter is not 1), standardize 0.46327
your value by subtracting the mean and dividing the result by the 1.8 0.46407 0.46485 0.46562 0.46638 0.46712 0.46784 0.46856 0.46926 0.46995
standard deviation. Then look up the value for this standardized value. 0.47062
1.9 0.47128 0.47193 0.47257 0.47320 0.47381 0.47441 0.47500 0.47558 0.47615
A few particularly important numbers derived from the table below, 0.47670
specifically numbers that are commonly used in significance tests, are 2.0 0.47725 0.47778 0.47831 0.47882 0.47932 0.47982 0.48030 0.48077 0.48124
summarized in the following table: 0.48169
p 0.001 0.005 0.010 0.025 0.050 0.100 2.1 0.48214 0.48257 0.48300 0.48341 0.48382 0.48422 0.48461 0.48500 0.48537
0.48574
Zp -3.090 -2.576 -2.326 -1.960 -1.645 -1.282
2.2 0.48610 0.48645 0.48679 0.48713 0.48745 0.48778 0.48809 0.48840 0.48870
p 0.999 0.995 0.990 0.975 0.950 0.900 0.48899
Zp +3.090 +2.576 +2.326 +1.960 +1.645 +1.282 2.3 0.48928 0.48956 0.48983 0.49010 0.49036 0.49061 0.49086 0.49111 0.49134
0.49158
These are critical values for the normal distribution. 2.4 0.49180 0.49202 0.49224 0.49245 0.49266 0.49286 0.49305 0.49324 0.49343
0.49361
2.5 0.49379 0.49396 0.49413 0.49430 0.49446 0.49461 0.49477 0.49492 0.49506
0.49520
2.6 0.49534 0.49547 0.49560 0.49573 0.49585 0.49598 0.49609 0.49621 0.49632
Area under the Normal Curve from 0 to X 0.49643
2.7 0.49653 0.49664 0.49674 0.49683 0.49693 0.49702 0.49711 0.49720 0.49728
X 0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.49736
2.8 0.49744 0.49752 0.49760 0.49767 0.49774 0.49781 0.49788 0.49795 0.49801
0.49807
0.0 0.00000 0.00399 0.00798 0.01197 0.01595 0.01994 0.02392 0.02790 0.03188 2.9 0.49813 0.49819 0.49825 0.49831 0.49836 0.49841 0.49846 0.49851 0.49856
0.03586 0.49861
0.1 0.03983 0.04380 0.04776 0.05172 0.05567 0.05962 0.06356 0.06749 0.07142 3.0 0.49865 0.49869 0.49874 0.49878 0.49882 0.49886 0.49889 0.49893 0.49896
0.07535 0.49900
0.2 0.07926 0.08317 0.08706 0.09095 0.09483 0.09871 0.10257 0.10642 0.11026 3.1 0.49903 0.49906 0.49910 0.49913 0.49916 0.49918 0.49921 0.49924 0.49926
0.11409 0.49929
0.3 0.11791 0.12172 0.12552 0.12930 0.13307 0.13683 0.14058 0.14431 0.14803 3.2 0.49931 0.49934 0.49936 0.49938 0.49940 0.49942 0.49944 0.49946 0.49948
0.15173 0.49950
0.4 0.15542 0.15910 0.16276 0.16640 0.17003 0.17364 0.17724 0.18082 0.18439 3.3 0.49952 0.49953 0.49955 0.49957 0.49958 0.49960 0.49961 0.49962 0.49964
0.18793 0.49965
0.5 0.19146 0.19497 0.19847 0.20194 0.20540 0.20884 0.21226 0.21566 0.21904 3.4 0.49966 0.49968 0.49969 0.49970 0.49971 0.49972 0.49973 0.49974 0.49975
0.22240 0.49976
0.6 0.22575 0.22907 0.23237 0.23565 0.23891 0.24215 0.24537 0.24857 0.25175 3.5 0.49977 0.49978 0.49978 0.49979 0.49980 0.49981 0.49981 0.49982 0.49983
0.25490 0.49983
0.7 0.25804 0.26115 0.26424 0.26730 0.27035 0.27337 0.27637 0.27935 0.28230 3.6 0.49984 0.49985 0.49985 0.49986 0.49986 0.49987 0.49987 0.49988 0.49988
0.28524 0.49989
0.8 0.28814 0.29103 0.29389 0.29673 0.29955 0.30234 0.30511 0.30785 0.31057 3.7 0.49989 0.49990 0.49990 0.49990 0.49991 0.49991 0.49992 0.49992 0.49992
0.31327 0.49992
0.9 0.31594 0.31859 0.32121 0.32381 0.32639 0.32894 0.33147 0.33398 0.33646 3.8 0.49993 0.49993 0.49993 0.49994 0.49994 0.49994 0.49994 0.49995 0.49995
0.33891 0.49995
1.0 0.34134 0.34375 0.34614 0.34849 0.35083 0.35314 0.35543 0.35769 0.35993 3.9 0.49995 0.49995 0.49996 0.49996 0.49996 0.49996 0.49996 0.49996 0.49997
0.36214 0.49997
1.1 0.36433 0.36650 0.36864 0.37076 0.37286 0.37493 0.37698 0.37900 0.38100 4.0 0.49997 0.49997 0.49997 0.49997 0.49997 0.49997 0.49998 0.49998 0.49998
0.38298 0.49998
1.2 0.38493 0.38686 0.38877 0.39065 0.39251 0.39435 0.39617 0.39796 0.39973
0.40147
1.3 0.40320 0.40490 0.40658 0.40824 0.40988 0.41149 0.41308 0.41466 0.41621
0.41774
1.4 0.41924 0.42073 0.42220 0.42364 0.42507 0.42647 0.42785 0.42922 0.43056
0.43189


Upper critical values of Student's t distribution with ν degrees of freedom
Probability of exceeding the critical value

1.3.6.7.2. Upper Critical Values of the Student's-t 0.10 0.05 0.025 0.01 0.005 0.001
Distribution
1. 3.078 6.314 12.706 31.821 63.657 318.313
How to This table contains the upper critical values of the Student's t-distribution. The upper critical 2. 1.886 2.920 4.303 6.965 9.925 22.327
Use This values are computed using the percent point function. Due to the symmetry of the t-distribution,
Table
3. 1.638 2.353 3.182 4.541 5.841 10.215
this table can be used for both 1-sided (lower and upper) and 2-sided tests using the appropriate
value of .
4. 1.533 2.132 2.776 3.747 4.604 7.173
5. 1.476 2.015 2.571 3.365 4.032 5.893
The significance level, , is demonstrated with the graph below which plots a t distribution with 6. 1.440 1.943 2.447 3.143 3.707 5.208
10 degrees of freedom. The most commonly used significance level is = 0.05. For a two-sided
test, we compute the percent point function at /2 (0.025). If the absolute value of the test 7. 1.415 1.895 2.365 2.998 3.499 4.782
statistic is greater than the upper critical value (0.025), then we reject the null hypothesis. Due to 8. 1.397 1.860 2.306 2.896 3.355 4.499
the symmetry of the t-distribution, we only tabulate the upper critical values in the table below. 9. 1.383 1.833 2.262 2.821 3.250 4.296
10. 1.372 1.812 2.228 2.764 3.169 4.143
11. 1.363 1.796 2.201 2.718 3.106 4.024
12. 1.356 1.782 2.179 2.681 3.055 3.929
13. 1.350 1.771 2.160 2.650 3.012 3.852
14. 1.345 1.761 2.145 2.624 2.977 3.787
15. 1.341 1.753 2.131 2.602 2.947 3.733
16. 1.337 1.746 2.120 2.583 2.921 3.686
17. 1.333 1.740 2.110 2.567 2.898 3.646
18. 1.330 1.734 2.101 2.552 2.878 3.610
19. 1.328 1.729 2.093 2.539 2.861 3.579
20. 1.325 1.725 2.086 2.528 2.845 3.552
21. 1.323 1.721 2.080 2.518 2.831 3.527
22. 1.321 1.717 2.074 2.508 2.819 3.505
23. 1.319 1.714 2.069 2.500 2.807 3.485
24. 1.318 1.711 2.064 2.492 2.797 3.467
Given a specified value for α:
1. For a two-sided test, find the column corresponding to α/2 and reject the null hypothesis if
the absolute value of the test statistic is greater than the value in the table below.
2. For an upper one-sided test, find the column corresponding to α and reject the null
hypothesis if the test statistic is greater than the tabled value.
3. For a lower one-sided test, find the column corresponding to α and reject the null
hypothesis if the test statistic is less than the negative of the tabled value.
25. 1.316 1.708 2.060 2.485 2.787 3.450
26. 1.315 1.706 2.056 2.479 2.779 3.435
27. 1.314 1.703 2.052 2.473 2.771 3.421
28. 1.313 1.701 2.048 2.467 2.763 3.408
29. 1.311 1.699 2.045 2.462 2.756 3.396
30. 1.310 1.697 2.042 2.457 2.750 3.385


31. 1.309 1.696 2.040 2.453 2.744 3.375 72. 1.293 1.666 1.993 2.379 2.646 3.207
32. 1.309 1.694 2.037 2.449 2.738 3.365 73. 1.293 1.666 1.993 2.379 2.645 3.206
33. 1.308 1.692 2.035 2.445 2.733 3.356 74. 1.293 1.666 1.993 2.378 2.644 3.204
34. 1.307 1.691 2.032 2.441 2.728 3.348 75. 1.293 1.665 1.992 2.377 2.643 3.202
35. 1.306 1.690 2.030 2.438 2.724 3.340 76. 1.293 1.665 1.992 2.376 2.642 3.201
36. 1.306 1.688 2.028 2.434 2.719 3.333 77. 1.293 1.665 1.991 2.376 2.641 3.199
37. 1.305 1.687 2.026 2.431 2.715 3.326 78. 1.292 1.665 1.991 2.375 2.640 3.198
38. 1.304 1.686 2.024 2.429 2.712 3.319 79. 1.292 1.664 1.990 2.374 2.640 3.197
39. 1.304 1.685 2.023 2.426 2.708 3.313 80. 1.292 1.664 1.990 2.374 2.639 3.195
40. 1.303 1.684 2.021 2.423 2.704 3.307 81. 1.292 1.664 1.990 2.373 2.638 3.194
41. 1.303 1.683 2.020 2.421 2.701 3.301 82. 1.292 1.664 1.989 2.373 2.637 3.193
42. 1.302 1.682 2.018 2.418 2.698 3.296 83. 1.292 1.663 1.989 2.372 2.636 3.191
43. 1.302 1.681 2.017 2.416 2.695 3.291 84. 1.292 1.663 1.989 2.372 2.636 3.190
44. 1.301 1.680 2.015 2.414 2.692 3.286 85. 1.292 1.663 1.988 2.371 2.635 3.189
45. 1.301 1.679 2.014 2.412 2.690 3.281 86. 1.291 1.663 1.988 2.370 2.634 3.188
46. 1.300 1.679 2.013 2.410 2.687 3.277 87. 1.291 1.663 1.988 2.370 2.634 3.187
47. 1.300 1.678 2.012 2.408 2.685 3.273 88. 1.291 1.662 1.987 2.369 2.633 3.185
48. 1.299 1.677 2.011 2.407 2.682 3.269 89. 1.291 1.662 1.987 2.369 2.632 3.184
49. 1.299 1.677 2.010 2.405 2.680 3.265 90. 1.291 1.662 1.987 2.368 2.632 3.183
50. 1.299 1.676 2.009 2.403 2.678 3.261 91. 1.291 1.662 1.986 2.368 2.631 3.182
51. 1.298 1.675 2.008 2.402 2.676 3.258 92. 1.291 1.662 1.986 2.368 2.630 3.181
52. 1.298 1.675 2.007 2.400 2.674 3.255 93. 1.291 1.661 1.986 2.367 2.630 3.180
53. 1.298 1.674 2.006 2.399 2.672 3.251 94. 1.291 1.661 1.986 2.367 2.629 3.179
54. 1.297 1.674 2.005 2.397 2.670 3.248 95. 1.291 1.661 1.985 2.366 2.629 3.178
55. 1.297 1.673 2.004 2.396 2.668 3.245 96. 1.290 1.661 1.985 2.366 2.628 3.177
56. 1.297 1.673 2.003 2.395 2.667 3.242 97. 1.290 1.661 1.985 2.365 2.627 3.176
57. 1.297 1.672 2.002 2.394 2.665 3.239 98. 1.290 1.661 1.984 2.365 2.627 3.175
58. 1.296 1.672 2.002 2.392 2.663 3.237 99. 1.290 1.660 1.984 2.365 2.626 3.175
59. 1.296 1.671 2.001 2.391 2.662 3.234 100. 1.290 1.660 1.984 2.364 2.626 3.174
60. 1.296 1.671 2.000 2.390 2.660 3.232 1.282 1.645 1.960 2.326 2.576 3.090
61. 1.296 1.670 2.000 2.389 2.659 3.229
62. 1.295 1.670 1.999 2.388 2.657 3.227
63. 1.295 1.669 1.998 2.387 2.656 3.225
64. 1.295 1.669 1.998 2.386 2.655 3.223
65. 1.295 1.669 1.997 2.385 2.654 3.220
66. 1.295 1.668 1.997 2.384 2.652 3.218
67. 1.294 1.668 1.996 2.383 2.651 3.216
68. 1.294 1.668 1.995 2.382 2.650 3.214
69. 1.294 1.667 1.995 2.382 2.649 3.213
70. 1.294 1.667 1.994 2.381 2.648 3.211
71. 1.294 1.667 1.994 2.380 2.647 3.209
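The tabled values can be reproduced from the percent point function; a minimal sketch, assuming Python with scipy:

    # Sketch: upper critical values of the t distribution from the percent point function.
    from scipy.stats import t, norm

    alpha, nu = 0.05, 10
    print(t.ppf(1.0 - alpha, nu))        # one-sided upper critical value: 1.812
    print(t.ppf(1.0 - alpha / 2, nu))    # two-sided test at alpha = 0.05: 2.228
    print(norm.ppf(1.0 - alpha))         # the infinite-degrees-of-freedom row: 1.645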


Upper critical values of the F distribution


for ν1 numerator degrees of freedom and ν2 denominator degrees of freedom
5% significance level
1.3.6.7.3. Upper Critical Values of the F
Distribution
How to Use This table contains the upper critical values of the F distribution. This
This Table table is used for one-sided F tests at the α = 0.05, 0.10, and 0.01 levels.
More specifically, a test statistic is computed with ν1 and ν2 degrees of
freedom, and the result is compared to this table. For a one-sided test,
the null hypothesis is rejected when the test statistic is greater than the
tabled value. This is demonstrated with the graph of an F distribution
with ν1 = 10 and ν2 = 10. The shaded area of the graph indicates the
rejection region at the α significance level. Since this is a one-sided test,
we have probability α in the upper tail of exceeding the critical value
and zero in the lower tail. Because the F distribution is asymmetric, a
two-sided test requires a set of tables (not included here) that contain
the rejection regions for both the lower and upper tails.

ν2 \ ν1    1       2       3       4       5       6       7       8       9       10

1 161.448 199.500 215.707 224.583 230.162 233.986 236.768
238.882 240.543 241.882
2 18.513 19.000 19.164 19.247 19.296 19.330 19.353
19.371 19.385 19.396
3 10.128 9.552 9.277 9.117 9.013 8.941 8.887
8.845 8.812 8.786
4 7.709 6.944 6.591 6.388 6.256 6.163 6.094
6.041 5.999 5.964
5 6.608 5.786 5.409 5.192 5.050 4.950 4.876
4.818 4.772 4.735
6 5.987 5.143 4.757 4.534 4.387 4.284 4.207
4.147 4.099 4.060
7 5.591 4.737 4.347 4.120 3.972 3.866 3.787
3.726 3.677 3.637
8 5.318 4.459 4.066 3.838 3.687 3.581 3.500
3.438 3.388 3.347
9 5.117 4.256 3.863 3.633 3.482 3.374 3.293
3.230 3.179 3.137
10 4.965 4.103 3.708 3.478 3.326 3.217 3.135
3.072 3.020 2.978
11 4.844 3.982 3.587 3.357 3.204 3.095 3.012
Contents The following tables for from 1 to 100 are included: 2.948 2.896 2.854
1. One sided, 5% significance level, = 1 - 10 12 4.747 3.885 3.490 3.259 3.106 2.996 2.913
2. One sided, 5% significance level, = 11 - 20 2.849 2.796 2.753
3. One sided, 10% significance level, = 1 - 10 13 4.667 3.806 3.411 3.179 3.025 2.915 2.832
4. One sided, 10% significance level, = 11 - 20
2.767 2.714 2.671
14 4.600 3.739 3.344 3.112 2.958 2.848 2.764
5. One sided, 1% significance level, = 1 - 10
2.699 2.646 2.602
6. One sided, 1% significance level, = 11 - 20 15 4.543 3.682 3.287 3.056 2.901 2.790 2.707
2.641 2.588 2.544


16 4.494 3.634 3.239 3.007 2.852 2.741 2.657 39 4.091 3.238 2.845 2.612 2.456 2.342 2.255
2.591 2.538 2.494 2.187 2.131 2.084
17 4.451 3.592 3.197 2.965 2.810 2.699 2.614 40 4.085 3.232 2.839 2.606 2.449 2.336 2.249
2.548 2.494 2.450 2.180 2.124 2.077
18 4.414 3.555 3.160 2.928 2.773 2.661 2.577 41 4.079 3.226 2.833 2.600 2.443 2.330 2.243
2.510 2.456 2.412 2.174 2.118 2.071
19 4.381 3.522 3.127 2.895 2.740 2.628 2.544 42 4.073 3.220 2.827 2.594 2.438 2.324 2.237
2.477 2.423 2.378 2.168 2.112 2.065
20 4.351 3.493 3.098 2.866 2.711 2.599 2.514 43 4.067 3.214 2.822 2.589 2.432 2.318 2.232
2.447 2.393 2.348 2.163 2.106 2.059
21 4.325 3.467 3.072 2.840 2.685 2.573 2.488 44 4.062 3.209 2.816 2.584 2.427 2.313 2.226
2.420 2.366 2.321 2.157 2.101 2.054
22 4.301 3.443 3.049 2.817 2.661 2.549 2.464 45 4.057 3.204 2.812 2.579 2.422 2.308 2.221
2.397 2.342 2.297 2.152 2.096 2.049
23 4.279 3.422 3.028 2.796 2.640 2.528 2.442 46 4.052 3.200 2.807 2.574 2.417 2.304 2.216
2.375 2.320 2.275 2.147 2.091 2.044
24 4.260 3.403 3.009 2.776 2.621 2.508 2.423 47 4.047 3.195 2.802 2.570 2.413 2.299 2.212
2.355 2.300 2.255 2.143 2.086 2.039
25 4.242 3.385 2.991 2.759 2.603 2.490 2.405 48 4.043 3.191 2.798 2.565 2.409 2.295 2.207
2.337 2.282 2.236 2.138 2.082 2.035
26 4.225 3.369 2.975 2.743 2.587 2.474 2.388 49 4.038 3.187 2.794 2.561 2.404 2.290 2.203
2.321 2.265 2.220 2.134 2.077 2.030
27 4.210 3.354 2.960 2.728 2.572 2.459 2.373 50 4.034 3.183 2.790 2.557 2.400 2.286 2.199
2.305 2.250 2.204 2.130 2.073 2.026
28 4.196 3.340 2.947 2.714 2.558 2.445 2.359 51 4.030 3.179 2.786 2.553 2.397 2.283 2.195
2.291 2.236 2.190 2.126 2.069 2.022
29 4.183 3.328 2.934 2.701 2.545 2.432 2.346 52 4.027 3.175 2.783 2.550 2.393 2.279 2.192
2.278 2.223 2.177 2.122 2.066 2.018
30 4.171 3.316 2.922 2.690 2.534 2.421 2.334 53 4.023 3.172 2.779 2.546 2.389 2.275 2.188
2.266 2.211 2.165 2.119 2.062 2.015
31 4.160 3.305 2.911 2.679 2.523 2.409 2.323 54 4.020 3.168 2.776 2.543 2.386 2.272 2.185
2.255 2.199 2.153 2.115 2.059 2.011
32 4.149 3.295 2.901 2.668 2.512 2.399 2.313 55 4.016 3.165 2.773 2.540 2.383 2.269 2.181
2.244 2.189 2.142 2.112 2.055 2.008
33 4.139 3.285 2.892 2.659 2.503 2.389 2.303 56 4.013 3.162 2.769 2.537 2.380 2.266 2.178
2.235 2.179 2.133 2.109 2.052 2.005
34 4.130 3.276 2.883 2.650 2.494 2.380 2.294 57 4.010 3.159 2.766 2.534 2.377 2.263 2.175
2.225 2.170 2.123 2.106 2.049 2.001
35 4.121 3.267 2.874 2.641 2.485 2.372 2.285 58 4.007 3.156 2.764 2.531 2.374 2.260 2.172
2.217 2.161 2.114 2.103 2.046 1.998
36 4.113 3.259 2.866 2.634 2.477 2.364 2.277 59 4.004 3.153 2.761 2.528 2.371 2.257 2.169
2.209 2.153 2.106 2.100 2.043 1.995
37 4.105 3.252 2.859 2.626 2.470 2.356 2.270 60 4.001 3.150 2.758 2.525 2.368 2.254 2.167
2.201 2.145 2.098 2.097 2.040 1.993
38 4.098 3.245 2.852 2.619 2.463 2.349 2.262 61 3.998 3.148 2.755 2.523 2.366 2.251 2.164
2.194 2.138 2.091 2.094 2.037 1.990


62 3.996 3.145 2.753 2.520 2.363 2.249 2.161 85 3.953 3.104 2.712 2.479 2.322 2.207 2.119
2.092 2.035 1.987 2.049 1.992 1.944
63 3.993 3.143 2.751 2.518 2.361 2.246 2.159 86 3.952 3.103 2.711 2.478 2.321 2.206 2.118
2.089 2.032 1.985 2.048 1.991 1.943
64 3.991 3.140 2.748 2.515 2.358 2.244 2.156 87 3.951 3.101 2.709 2.476 2.319 2.205 2.117
2.087 2.030 1.982 2.047 1.989 1.941
65 3.989 3.138 2.746 2.513 2.356 2.242 2.154 88 3.949 3.100 2.708 2.475 2.318 2.203 2.115
2.084 2.027 1.980 2.045 1.988 1.940
66 3.986 3.136 2.744 2.511 2.354 2.239 2.152 89 3.948 3.099 2.707 2.474 2.317 2.202 2.114
2.082 2.025 1.977 2.044 1.987 1.939
67 3.984 3.134 2.742 2.509 2.352 2.237 2.150 90 3.947 3.098 2.706 2.473 2.316 2.201 2.113
2.080 2.023 1.975 2.043 1.986 1.938
68 3.982 3.132 2.740 2.507 2.350 2.235 2.148 91 3.946 3.097 2.705 2.472 2.315 2.200 2.112
2.078 2.021 1.973 2.042 1.984 1.936
69 3.980 3.130 2.737 2.505 2.348 2.233 2.145 92 3.945 3.095 2.704 2.471 2.313 2.199 2.111
2.076 2.019 1.971 2.041 1.983 1.935
70 3.978 3.128 2.736 2.503 2.346 2.231 2.143 93 3.943 3.094 2.703 2.470 2.312 2.198 2.110
2.074 2.017 1.969 2.040 1.982 1.934
71 3.976 3.126 2.734 2.501 2.344 2.229 2.142 94 3.942 3.093 2.701 2.469 2.311 2.197 2.109
2.072 2.015 1.967 2.038 1.981 1.933
72 3.974 3.124 2.732 2.499 2.342 2.227 2.140 95 3.941 3.092 2.700 2.467 2.310 2.196 2.108
2.070 2.013 1.965 2.037 1.980 1.932
73 3.972 3.122 2.730 2.497 2.340 2.226 2.138 96 3.940 3.091 2.699 2.466 2.309 2.195 2.106
2.068 2.011 1.963 2.036 1.979 1.931
74 3.970 3.120 2.728 2.495 2.338 2.224 2.136 97 3.939 3.090 2.698 2.465 2.308 2.194 2.105
2.066 2.009 1.961 2.035 1.978 1.930
75 3.968 3.119 2.727 2.494 2.337 2.222 2.134 98 3.938 3.089 2.697 2.465 2.307 2.193 2.104
2.064 2.007 1.959 2.034 1.977 1.929
76 3.967 3.117 2.725 2.492 2.335 2.220 2.133 99 3.937 3.088 2.696 2.464 2.306 2.192 2.103
2.063 2.006 1.958 2.033 1.976 1.928
77 3.965 3.115 2.723 2.490 2.333 2.219 2.131 100 3.936 3.087 2.696 2.463 2.305 2.191 2.103
2.061 2.004 1.956 2.032 1.975 1.927
78 3.963 3.114 2.722 2.489 2.332 2.217 2.129
2.059 2.002 1.954
79 3.962 3.112 2.720 2.487 2.330 2.216 2.128 \ 11 12 13 14 15 16 17 18
2.058 2.001 1.953 19 20
80 3.960 3.111 2.719 2.486 2.329 2.214 2.126
2.056 1.999 1.951
81 3.959 3.109 2.717 2.484 2.327 2.213 2.125
2.055 1.998 1.950
82 3.957 3.108 2.716 2.483 2.326 2.211 2.123 1 242.983 243.906 244.690 245.364 245.950 246.464 246.918
2.053 1.996 1.948 247.323 247.686 248.013
83 3.956 3.107 2.715 2.482 2.324 2.210 2.122 2 19.405 19.413 19.419 19.424 19.429 19.433 19.437
2.052 1.995 1.947 19.440 19.443 19.446
84 3.955 3.105 2.713 2.480 2.323 2.209 2.121 3 8.763 8.745 8.729 8.715 8.703 8.692 8.683
2.051 1.993 1.945 8.675 8.667 8.660


4 5.936 5.912 5.891 5.873 5.858 5.844 5.832 27 2.166 2.132 2.103 2.078 2.056 2.036 2.018
5.821 5.811 5.803 2.002 1.987 1.974
5 4.704 4.678 4.655 4.636 4.619 4.604 4.590 28 2.151 2.118 2.089 2.064 2.041 2.021 2.003
4.579 4.568 4.558 1.987 1.972 1.959
6 4.027 4.000 3.976 3.956 3.938 3.922 3.908 29 2.138 2.104 2.075 2.050 2.027 2.007 1.989
3.896 3.884 3.874 1.973 1.958 1.945
7 3.603 3.575 3.550 3.529 3.511 3.494 3.480 30 2.126 2.092 2.063 2.037 2.015 1.995 1.976
3.467 3.455 3.445 1.960 1.945 1.932
8 3.313 3.284 3.259 3.237 3.218 3.202 3.187 31 2.114 2.080 2.051 2.026 2.003 1.983 1.965
3.173 3.161 3.150 1.948 1.933 1.920
9 3.102 3.073 3.048 3.025 3.006 2.989 2.974 32 2.103 2.070 2.040 2.015 1.992 1.972 1.953
2.960 2.948 2.936 1.937 1.922 1.908
10 2.943 2.913 2.887 2.865 2.845 2.828 2.812 33 2.093 2.060 2.030 2.004 1.982 1.961 1.943
2.798 2.785 2.774 1.926 1.911 1.898
11 2.818 2.788 2.761 2.739 2.719 2.701 2.685 34 2.084 2.050 2.021 1.995 1.972 1.952 1.933
2.671 2.658 2.646 1.917 1.902 1.888
12 2.717 2.687 2.660 2.637 2.617 2.599 2.583 35 2.075 2.041 2.012 1.986 1.963 1.942 1.924
2.568 2.555 2.544 1.907 1.892 1.878
13 2.635 2.604 2.577 2.554 2.533 2.515 2.499 36 2.067 2.033 2.003 1.977 1.954 1.934 1.915
2.484 2.471 2.459 1.899 1.883 1.870
14 2.565 2.534 2.507 2.484 2.463 2.445 2.428 37 2.059 2.025 1.995 1.969 1.946 1.926 1.907
2.413 2.400 2.388 1.890 1.875 1.861
15 2.507 2.475 2.448 2.424 2.403 2.385 2.368 38 2.051 2.017 1.988 1.962 1.939 1.918 1.899
2.353 2.340 2.328 1.883 1.867 1.853
16 2.456 2.425 2.397 2.373 2.352 2.333 2.317 39 2.044 2.010 1.981 1.954 1.931 1.911 1.892
2.302 2.288 2.276 1.875 1.860 1.846
17 2.413 2.381 2.353 2.329 2.308 2.289 2.272 40 2.038 2.003 1.974 1.948 1.924 1.904 1.885
2.257 2.243 2.230 1.868 1.853 1.839
18 2.374 2.342 2.314 2.290 2.269 2.250 2.233 41 2.031 1.997 1.967 1.941 1.918 1.897 1.879
2.217 2.203 2.191 1.862 1.846 1.832
19 2.340 2.308 2.280 2.256 2.234 2.215 2.198 42 2.025 1.991 1.961 1.935 1.912 1.891 1.872
2.182 2.168 2.155 1.855 1.840 1.826
20 2.310 2.278 2.250 2.225 2.203 2.184 2.167 43 2.020 1.985 1.955 1.929 1.906 1.885 1.866
2.151 2.137 2.124 1.849 1.834 1.820
21 2.283 2.250 2.222 2.197 2.176 2.156 2.139 44 2.014 1.980 1.950 1.924 1.900 1.879 1.861
2.123 2.109 2.096 1.844 1.828 1.814
22 2.259 2.226 2.198 2.173 2.151 2.131 2.114 45 2.009 1.974 1.945 1.918 1.895 1.874 1.855
2.098 2.084 2.071 1.838 1.823 1.808
23 2.236 2.204 2.175 2.150 2.128 2.109 2.091 46 2.004 1.969 1.940 1.913 1.890 1.869 1.850
2.075 2.061 2.048 1.833 1.817 1.803
24 2.216 2.183 2.155 2.130 2.108 2.088 2.070 47 1.999 1.965 1.935 1.908 1.885 1.864 1.845
2.054 2.040 2.027 1.828 1.812 1.798
25 2.198 2.165 2.136 2.111 2.089 2.069 2.051 48 1.995 1.960 1.930 1.904 1.880 1.859 1.840
2.035 2.021 2.007 1.823 1.807 1.793
26 2.181 2.148 2.119 2.094 2.072 2.052 2.034 49 1.990 1.956 1.926 1.899 1.876 1.855 1.836
2.018 2.003 1.990 1.819 1.803 1.789


50 1.986 1.952 1.921 1.895 1.871 1.850 1.831 73 1.922 1.887 1.857 1.830 1.806 1.784 1.765
1.814 1.798 1.784 1.747 1.731 1.716
51 1.982 1.947 1.917 1.891 1.867 1.846 1.827 74 1.921 1.885 1.855 1.828 1.804 1.782 1.763
1.810 1.794 1.780 1.745 1.729 1.714
52 1.978 1.944 1.913 1.887 1.863 1.842 1.823 75 1.919 1.884 1.853 1.826 1.802 1.780 1.761
1.806 1.790 1.776 1.743 1.727 1.712
53 1.975 1.940 1.910 1.883 1.859 1.838 1.819 76 1.917 1.882 1.851 1.824 1.800 1.778 1.759
1.802 1.786 1.772 1.741 1.725 1.710
54 1.971 1.936 1.906 1.879 1.856 1.835 1.816 77 1.915 1.880 1.849 1.822 1.798 1.777 1.757
1.798 1.782 1.768 1.739 1.723 1.708
55 1.968 1.933 1.903 1.876 1.852 1.831 1.812 78 1.914 1.878 1.848 1.821 1.797 1.775 1.755
1.795 1.779 1.764 1.738 1.721 1.707
56 1.964 1.930 1.899 1.873 1.849 1.828 1.809 79 1.912 1.877 1.846 1.819 1.795 1.773 1.754
1.791 1.775 1.761 1.736 1.720 1.705
57 1.961 1.926 1.896 1.869 1.846 1.824 1.805 80 1.910 1.875 1.845 1.817 1.793 1.772 1.752
1.788 1.772 1.757 1.734 1.718 1.703
58 1.958 1.923 1.893 1.866 1.842 1.821 1.802 81 1.909 1.874 1.843 1.816 1.792 1.770 1.750
1.785 1.769 1.754 1.733 1.716 1.702
59 1.955 1.920 1.890 1.863 1.839 1.818 1.799 82 1.907 1.872 1.841 1.814 1.790 1.768 1.749
1.781 1.766 1.751 1.731 1.715 1.700
60 1.952 1.917 1.887 1.860 1.836 1.815 1.796 83 1.906 1.871 1.840 1.813 1.789 1.767 1.747
1.778 1.763 1.748 1.729 1.713 1.698
61 1.949 1.915 1.884 1.857 1.834 1.812 1.793 84 1.905 1.869 1.838 1.811 1.787 1.765 1.746
1.776 1.760 1.745 1.728 1.712 1.697
62 1.947 1.912 1.882 1.855 1.831 1.809 1.790 85 1.903 1.868 1.837 1.810 1.786 1.764 1.744
1.773 1.757 1.742 1.726 1.710 1.695
63 1.944 1.909 1.879 1.852 1.828 1.807 1.787 86 1.902 1.867 1.836 1.808 1.784 1.762 1.743
1.770 1.754 1.739 1.725 1.709 1.694
64 1.942 1.907 1.876 1.849 1.826 1.804 1.785 87 1.900 1.865 1.834 1.807 1.783 1.761 1.741
1.767 1.751 1.737 1.724 1.707 1.692
65 1.939 1.904 1.874 1.847 1.823 1.802 1.782 88 1.899 1.864 1.833 1.806 1.782 1.760 1.740
1.765 1.749 1.734 1.722 1.706 1.691
66 1.937 1.902 1.871 1.845 1.821 1.799 1.780 89 1.898 1.863 1.832 1.804 1.780 1.758 1.739
1.762 1.746 1.732 1.721 1.705 1.690
67 1.935 1.900 1.869 1.842 1.818 1.797 1.777 90 1.897 1.861 1.830 1.803 1.779 1.757 1.737
1.760 1.744 1.729 1.720 1.703 1.688
68 1.932 1.897 1.867 1.840 1.816 1.795 1.775 91 1.895 1.860 1.829 1.802 1.778 1.756 1.736
1.758 1.742 1.727 1.718 1.702 1.687
69 1.930 1.895 1.865 1.838 1.814 1.792 1.773 92 1.894 1.859 1.828 1.801 1.776 1.755 1.735
1.755 1.739 1.725 1.717 1.701 1.686
70 1.928 1.893 1.863 1.836 1.812 1.790 1.771 93 1.893 1.858 1.827 1.800 1.775 1.753 1.734
1.753 1.737 1.722 1.716 1.699 1.684
71 1.926 1.891 1.861 1.834 1.810 1.788 1.769 94 1.892 1.857 1.826 1.798 1.774 1.752 1.733
1.751 1.735 1.720 1.715 1.698 1.683
72 1.924 1.889 1.859 1.832 1.808 1.786 1.767 95 1.891 1.856 1.825 1.797 1.773 1.751 1.731
1.749 1.733 1.718 1.713 1.697 1.682


96 1.890 1.854 1.823 1.796 1.772 1.750 1.730 10 3.285 2.924 2.728 2.605 2.522 2.461 2.414
1.712 1.696 1.681 2.377 2.347 2.323
97 1.889 1.853 1.822 1.795 1.771 1.749 1.729 11 3.225 2.860 2.660 2.536 2.451 2.389 2.342
1.711 1.695 1.680 2.304 2.274 2.248
98 1.888 1.852 1.821 1.794 1.770 1.748 1.728 12 3.177 2.807 2.606 2.480 2.394 2.331 2.283
1.710 1.694 1.679 2.245 2.214 2.188
99 1.887 1.851 1.820 1.793 1.769 1.747 1.727 13 3.136 2.763 2.560 2.434 2.347 2.283 2.234
1.709 1.693 1.678 2.195 2.164 2.138
100 1.886 1.850 1.819 1.792 1.768 1.746 1.726 14 3.102 2.726 2.522 2.395 2.307 2.243 2.193
1.708 1.691 1.676 2.154 2.122 2.095
15 3.073 2.695 2.490 2.361 2.273 2.208 2.158
2.119 2.086 2.059
16 3.048 2.668 2.462 2.333 2.244 2.178 2.128
2.088 2.055 2.028
17 3.026 2.645 2.437 2.308 2.218 2.152 2.102
Upper critical values of the F distribution 2.061 2.028 2.001
18 3.007 2.624 2.416 2.286 2.196 2.130 2.079
for ν1 numerator degrees of freedom and ν2 denominator degrees of freedom
19 2.990 2.606 2.397 2.266 2.176 2.109 2.058
10% significance level 2.017 1.984 1.956
20 2.975 2.589 2.380 2.249 2.158 2.091 2.040
1.999 1.965 1.937
21 2.961 2.575 2.365 2.233 2.142 2.075 2.023
1.982 1.948 1.920
\ 1 2 3 4 5 6 7 8 22 2.949 2.561 2.351 2.219 2.128 2.060 2.008
9 10 1.967 1.933 1.904
23 2.937 2.549 2.339 2.207 2.115 2.047 1.995
1.953 1.919 1.890
1 39.863 49.500 53.593 55.833 57.240 58.204 58.906 24 2.927 2.538 2.327 2.195 2.103 2.035 1.983
59.439 59.858 60.195 1.941 1.906 1.877
2 8.526 9.000 9.162 9.243 9.293 9.326 9.349 25 2.918 2.528 2.317 2.184 2.092 2.024 1.971
9.367 9.381 9.392 1.929 1.895 1.866
3 5.538 5.462 5.391 5.343 5.309 5.285 5.266 26 2.909 2.519 2.307 2.174 2.082 2.014 1.961
5.252 5.240 5.230 1.919 1.884 1.855
4 4.545 4.325 4.191 4.107 4.051 4.010 3.979 27 2.901 2.511 2.299 2.165 2.073 2.005 1.952
3.955 3.936 3.920 1.909 1.874 1.845
5 4.060 3.780 3.619 3.520 3.453 3.405 3.368 28 2.894 2.503 2.291 2.157 2.064 1.996 1.943
3.339 3.316 3.297 1.900 1.865 1.836
6 3.776 3.463 3.289 3.181 3.108 3.055 3.014 29 2.887 2.495 2.283 2.149 2.057 1.988 1.935
2.983 2.958 2.937 1.892 1.857 1.827
7 3.589 3.257 3.074 2.961 2.883 2.827 2.785 30 2.881 2.489 2.276 2.142 2.049 1.980 1.927
2.752 2.725 2.703 1.884 1.849 1.819
8 3.458 3.113 2.924 2.806 2.726 2.668 2.624 31 2.875 2.482 2.270 2.136 2.042 1.973 1.920
2.589 2.561 2.538 1.877 1.842 1.812
9 3.360 3.006 2.813 2.693 2.611 2.551 2.505 32 2.869 2.477 2.263 2.129 2.036 1.967 1.913
2.469 2.440 2.416 1.870 1.835 1.805


33 2.864 2.471 2.258 2.123 2.030 1.961 1.907 56 2.797 2.400 2.184 2.048 1.953 1.882 1.827
1.864 1.828 1.799 1.782 1.746 1.715
34 2.859 2.466 2.252 2.118 2.024 1.955 1.901 57 2.796 2.398 2.182 2.046 1.951 1.880 1.825
1.858 1.822 1.793 1.780 1.744 1.713
35 2.855 2.461 2.247 2.113 2.019 1.950 1.896 58 2.794 2.396 2.181 2.044 1.949 1.878 1.823
1.852 1.817 1.787 1.779 1.742 1.711
36 2.850 2.456 2.243 2.108 2.014 1.945 1.891 59 2.793 2.395 2.179 2.043 1.947 1.876 1.821
1.847 1.811 1.781 1.777 1.740 1.709
37 2.846 2.452 2.238 2.103 2.009 1.940 1.886 60 2.791 2.393 2.177 2.041 1.946 1.875 1.819
1.842 1.806 1.776 1.775 1.738 1.707
38 2.842 2.448 2.234 2.099 2.005 1.935 1.881 61 2.790 2.392 2.176 2.039 1.944 1.873 1.818
1.838 1.802 1.772 1.773 1.736 1.705
39 2.839 2.444 2.230 2.095 2.001 1.931 1.877 62 2.788 2.390 2.174 2.038 1.942 1.871 1.816
1.833 1.797 1.767 1.771 1.735 1.703
40 2.835 2.440 2.226 2.091 1.997 1.927 1.873 63 2.787 2.389 2.173 2.036 1.941 1.870 1.814
1.829 1.793 1.763 1.770 1.733 1.702
41 2.832 2.437 2.222 2.087 1.993 1.923 1.869 64 2.786 2.387 2.171 2.035 1.939 1.868 1.813
1.825 1.789 1.759 1.768 1.731 1.700
42 2.829 2.434 2.219 2.084 1.989 1.919 1.865 65 2.784 2.386 2.170 2.033 1.938 1.867 1.811
1.821 1.785 1.755 1.767 1.730 1.699
43 2.826 2.430 2.216 2.080 1.986 1.916 1.861 66 2.783 2.385 2.169 2.032 1.937 1.865 1.810
1.817 1.781 1.751 1.765 1.728 1.697
44 2.823 2.427 2.213 2.077 1.983 1.913 1.858 67 2.782 2.384 2.167 2.031 1.935 1.864 1.808
1.814 1.778 1.747 1.764 1.727 1.696
45 2.820 2.425 2.210 2.074 1.980 1.909 1.855 68 2.781 2.382 2.166 2.029 1.934 1.863 1.807
1.811 1.774 1.744 1.762 1.725 1.694
46 2.818 2.422 2.207 2.071 1.977 1.906 1.852 69 2.780 2.381 2.165 2.028 1.933 1.861 1.806
1.808 1.771 1.741 1.761 1.724 1.693
47 2.815 2.419 2.204 2.068 1.974 1.903 1.849 70 2.779 2.380 2.164 2.027 1.931 1.860 1.804
1.805 1.768 1.738 1.760 1.723 1.691
48 2.813 2.417 2.202 2.066 1.971 1.901 1.846 71 2.778 2.379 2.163 2.026 1.930 1.859 1.803
1.802 1.765 1.735 1.758 1.721 1.690
49 2.811 2.414 2.199 2.063 1.968 1.898 1.843 72 2.777 2.378 2.161 2.025 1.929 1.858 1.802
1.799 1.763 1.732 1.757 1.720 1.689
50 2.809 2.412 2.197 2.061 1.966 1.895 1.840 73 2.776 2.377 2.160 2.024 1.928 1.856 1.801
1.796 1.760 1.729 1.756 1.719 1.687
51 2.807 2.410 2.194 2.058 1.964 1.893 1.838 74 2.775 2.376 2.159 2.022 1.927 1.855 1.800
1.794 1.757 1.727 1.755 1.718 1.686
52 2.805 2.408 2.192 2.056 1.961 1.891 1.836 75 2.774 2.375 2.158 2.021 1.926 1.854 1.798
1.791 1.755 1.724 1.754 1.716 1.685
53 2.803 2.406 2.190 2.054 1.959 1.888 1.833 76 2.773 2.374 2.157 2.020 1.925 1.853 1.797
1.789 1.752 1.722 1.752 1.715 1.684
54 2.801 2.404 2.188 2.052 1.957 1.886 1.831 77 2.772 2.373 2.156 2.019 1.924 1.852 1.796
1.787 1.750 1.719 1.751 1.714 1.683
55 2.799 2.402 2.186 2.050 1.955 1.884 1.829 78 2.771 2.372 2.155 2.018 1.923 1.851 1.795
1.785 1.748 1.717 1.750 1.713 1.682


79 2.770 2.371 2.154 2.017 1.922 1.850 1.794
1.749 1.712 1.681

\ 11 12 13 14 15 16 17 18 19 20
80 2.769 2.370 2.154 2.016 1.921 1.849 1.793
1.748 1.711 1.680
81 2.769 2.369 2.153 2.016 1.920 1.848 1.792
1.747 1.710 1.679
82 2.768 2.368 2.152 2.015 1.919 1.847 1.791 1 60.473 60.705 60.903 61.073 61.220 61.350 61.464
1.746 1.709 1.678 61.566 61.658 61.740
83 2.767 2.368 2.151 2.014 1.918 1.846 1.790 2 9.401 9.408 9.415 9.420 9.425 9.429 9.433
1.745 1.708 1.677 9.436 9.439 9.441
84 2.766 2.367 2.150 2.013 1.917 1.845 1.790 3 5.222 5.216 5.210 5.205 5.200 5.196 5.193
1.744 1.707 1.676 5.190 5.187 5.184
85 2.765 2.366 2.149 2.012 1.916 1.845 1.789 4 3.907 3.896 3.886 3.878 3.870 3.864 3.858
1.744 1.706 1.675 3.853 3.849 3.844
86 2.765 2.365 2.149 2.011 1.915 1.844 1.788 5 3.282 3.268 3.257 3.247 3.238 3.230 3.223
1.743 1.705 1.674 3.217 3.212 3.207
87 2.764 2.365 2.148 2.011 1.915 1.843 1.787 6 2.920 2.905 2.892 2.881 2.871 2.863 2.855
1.742 1.705 1.673 2.848 2.842 2.836
88 2.763 2.364 2.147 2.010 1.914 1.842 1.786 7 2.684 2.668 2.654 2.643 2.632 2.623 2.615
1.741 1.704 1.672 2.607 2.601 2.595
89 2.763 2.363 2.146 2.009 1.913 1.841 1.785 8 2.519 2.502 2.488 2.475 2.464 2.455 2.446
1.740 1.703 1.671 2.438 2.431 2.425
90 2.762 2.363 2.146 2.008 1.912 1.841 1.785 9 2.396 2.379 2.364 2.351 2.340 2.329 2.320
1.739 1.702 1.670 2.312 2.305 2.298
91 2.761 2.362 2.145 2.008 1.912 1.840 1.784 10 2.302 2.284 2.269 2.255 2.244 2.233 2.224
1.739 1.701 1.670 2.215 2.208 2.201
92 2.761 2.361 2.144 2.007 1.911 1.839 1.783 11 2.227 2.209 2.193 2.179 2.167 2.156 2.147
1.738 1.701 1.669 2.138 2.130 2.123
93 2.760 2.361 2.144 2.006 1.910 1.838 1.782 12 2.166 2.147 2.131 2.117 2.105 2.094 2.084
1.737 1.700 1.668 2.075 2.067 2.060
94 2.760 2.360 2.143 2.006 1.910 1.838 1.782 13 2.116 2.097 2.080 2.066 2.053 2.042 2.032
1.736 1.699 1.667 2.023 2.014 2.007
95 2.759 2.359 2.142 2.005 1.909 1.837 1.781 14 2.073 2.054 2.037 2.022 2.010 1.998 1.988
1.736 1.698 1.667 1.978 1.970 1.962
96 2.759 2.359 2.142 2.004 1.908 1.836 1.780 15 2.037 2.017 2.000 1.985 1.972 1.961 1.950
1.735 1.698 1.666 1.941 1.932 1.924
97 2.758 2.358 2.141 2.004 1.908 1.836 1.780 16 2.005 1.985 1.968 1.953 1.940 1.928 1.917
1.734 1.697 1.665 1.908 1.899 1.891
98 2.757 2.358 2.141 2.003 1.907 1.835 1.779 17 1.978 1.958 1.940 1.925 1.912 1.900 1.889
1.734 1.696 1.665 1.879 1.870 1.862
99 2.757 2.357 2.140 2.003 1.906 1.835 1.778 18 1.954 1.933 1.916 1.900 1.887 1.875 1.864
1.733 1.696 1.664 1.854 1.845 1.837
100 2.756 2.356 2.139 2.002 1.906 1.834 1.778 19 1.932 1.912 1.894 1.878 1.865 1.852 1.841
1.732 1.695 1.663 1.831 1.822 1.814
20 1.913 1.892 1.875 1.859 1.845 1.833 1.821
1.811 1.802 1.794


21 1.896 1.875 1.857 1.841 1.827 1.815 1.803 44 1.721 1.699 1.679 1.662 1.646 1.632 1.620
1.793 1.784 1.776 1.608 1.598 1.588
22 1.880 1.859 1.841 1.825 1.811 1.798 1.787 45 1.718 1.695 1.676 1.658 1.643 1.629 1.616
1.777 1.768 1.759 1.605 1.594 1.585
23 1.866 1.845 1.827 1.811 1.796 1.784 1.772 46 1.715 1.692 1.672 1.655 1.639 1.625 1.613
1.762 1.753 1.744 1.601 1.591 1.581
24 1.853 1.832 1.814 1.797 1.783 1.770 1.759 47 1.712 1.689 1.669 1.652 1.636 1.622 1.609
1.748 1.739 1.730 1.598 1.587 1.578
25 1.841 1.820 1.802 1.785 1.771 1.758 1.746 48 1.709 1.686 1.666 1.648 1.633 1.619 1.606
1.736 1.726 1.718 1.594 1.584 1.574
26 1.830 1.809 1.790 1.774 1.760 1.747 1.735 49 1.706 1.683 1.663 1.645 1.630 1.616 1.603
1.724 1.715 1.706 1.591 1.581 1.571
27 1.820 1.799 1.780 1.764 1.749 1.736 1.724 50 1.703 1.680 1.660 1.643 1.627 1.613 1.600
1.714 1.704 1.695 1.588 1.578 1.568
28 1.811 1.790 1.771 1.754 1.740 1.726 1.715 51 1.700 1.677 1.658 1.640 1.624 1.610 1.597
1.704 1.694 1.685 1.586 1.575 1.565
29 1.802 1.781 1.762 1.745 1.731 1.717 1.705 52 1.698 1.675 1.655 1.637 1.621 1.607 1.594
1.695 1.685 1.676 1.583 1.572 1.562
30 1.794 1.773 1.754 1.737 1.722 1.709 1.697 53 1.695 1.672 1.652 1.635 1.619 1.605 1.592
1.686 1.676 1.667 1.580 1.570 1.560
31 1.787 1.765 1.746 1.729 1.714 1.701 1.689 54 1.693 1.670 1.650 1.632 1.616 1.602 1.589
1.678 1.668 1.659 1.578 1.567 1.557
32 1.780 1.758 1.739 1.722 1.707 1.694 1.682 55 1.691 1.668 1.648 1.630 1.614 1.600 1.587
1.671 1.661 1.652 1.575 1.564 1.555
33 1.773 1.751 1.732 1.715 1.700 1.687 1.675 56 1.688 1.666 1.645 1.628 1.612 1.597 1.585
1.664 1.654 1.645 1.573 1.562 1.552
34 1.767 1.745 1.726 1.709 1.694 1.680 1.668 57 1.686 1.663 1.643 1.625 1.610 1.595 1.582
1.657 1.647 1.638 1.571 1.560 1.550
35 1.761 1.739 1.720 1.703 1.688 1.674 1.662 58 1.684 1.661 1.641 1.623 1.607 1.593 1.580
1.651 1.641 1.632 1.568 1.558 1.548
36 1.756 1.734 1.715 1.697 1.682 1.669 1.656 59 1.682 1.659 1.639 1.621 1.605 1.591 1.578
1.645 1.635 1.626 1.566 1.555 1.546
37 1.751 1.729 1.709 1.692 1.677 1.663 1.651 60 1.680 1.657 1.637 1.619 1.603 1.589 1.576
1.640 1.630 1.620 1.564 1.553 1.543
38 1.746 1.724 1.704 1.687 1.672 1.658 1.646 61 1.679 1.656 1.635 1.617 1.601 1.587 1.574
1.635 1.624 1.615 1.562 1.551 1.541
39 1.741 1.719 1.700 1.682 1.667 1.653 1.641 62 1.677 1.654 1.634 1.616 1.600 1.585 1.572
1.630 1.619 1.610 1.560 1.549 1.540
40 1.737 1.715 1.695 1.678 1.662 1.649 1.636 63 1.675 1.652 1.632 1.614 1.598 1.583 1.570
1.625 1.615 1.605 1.558 1.548 1.538
41 1.733 1.710 1.691 1.673 1.658 1.644 1.632 64 1.673 1.650 1.630 1.612 1.596 1.582 1.569
1.620 1.610 1.601 1.557 1.546 1.536
42 1.729 1.706 1.687 1.669 1.654 1.640 1.628 65 1.672 1.649 1.628 1.610 1.594 1.580 1.567
1.616 1.606 1.596 1.555 1.544 1.534
43 1.725 1.703 1.683 1.665 1.650 1.636 1.624 66 1.670 1.647 1.627 1.609 1.593 1.578 1.565
1.612 1.602 1.592 1.553 1.542 1.532


67 1.669 1.646 1.625 1.607 1.591 1.577 1.564 90 1.643 1.620 1.599 1.581 1.564 1.550 1.536
1.552 1.541 1.531 1.524 1.513 1.503
68 1.667 1.644 1.624 1.606 1.590 1.575 1.562 91 1.643 1.619 1.598 1.580 1.564 1.549 1.535
1.550 1.539 1.529 1.523 1.512 1.502
69 1.666 1.643 1.622 1.604 1.588 1.574 1.560 92 1.642 1.618 1.598 1.579 1.563 1.548 1.534
1.548 1.538 1.527 1.522 1.511 1.501
70 1.665 1.641 1.621 1.603 1.587 1.572 1.559 93 1.641 1.617 1.597 1.578 1.562 1.547 1.534
1.547 1.536 1.526 1.521 1.510 1.500
71 1.663 1.640 1.619 1.601 1.585 1.571 1.557 94 1.640 1.617 1.596 1.578 1.561 1.546 1.533
1.545 1.535 1.524 1.521 1.509 1.499
72 1.662 1.639 1.618 1.600 1.584 1.569 1.556 95 1.640 1.616 1.595 1.577 1.560 1.545 1.532
1.544 1.533 1.523 1.520 1.509 1.498
73 1.661 1.637 1.617 1.599 1.583 1.568 1.555 96 1.639 1.615 1.594 1.576 1.560 1.545 1.531
1.543 1.532 1.522 1.519 1.508 1.497
74 1.659 1.636 1.616 1.597 1.581 1.567 1.553 97 1.638 1.614 1.594 1.575 1.559 1.544 1.530
1.541 1.530 1.520 1.518 1.507 1.497
75 1.658 1.635 1.614 1.596 1.580 1.565 1.552 98 1.637 1.614 1.593 1.575 1.558 1.543 1.530
1.540 1.529 1.519 1.517 1.506 1.496
76 1.657 1.634 1.613 1.595 1.579 1.564 1.551 99 1.637 1.613 1.592 1.574 1.557 1.542 1.529
1.539 1.528 1.518 1.517 1.505 1.495
77 1.656 1.632 1.612 1.594 1.578 1.563 1.550 100 1.636 1.612 1.592 1.573 1.557 1.542 1.528
1.538 1.527 1.516 1.516 1.505 1.494
78 1.655 1.631 1.611 1.593 1.576 1.562 1.548
1.536 1.525 1.515
79 1.654 1.630 1.610 1.592 1.575 1.561 1.547
1.535 1.524 1.514
80 1.653 1.629 1.609 1.590 1.574 1.559 1.546
1.534 1.523 1.513
81 1.652 1.628 1.608 1.589 1.573 1.558 1.545
1.533 1.522 1.512
82 1.651 1.627 1.607 1.588 1.572 1.557 1.544
1.532 1.521 1.511
83 1.650 1.626 1.606 1.587 1.571 1.556 1.543
1.531 1.520 1.509
84 1.649 1.625 1.605 1.586 1.570 1.555 1.542
1.530 1.519 1.508
85 1.648 1.624 1.604 1.585 1.569 1.554 1.541
1.529 1.518 1.507

Upper critical values of the F distribution
for numerator degrees of freedom and denominator degrees of freedom
1% significance level

\ 1 2 3 4 5 6 7 8 9 10
86 1.647 1.623 1.603 1.584 1.568 1.553 1.540
1.528 1.517 1.506
87 1.646 1.622 1.602 1.583 1.567 1.552 1.539
1.527 1.516 1.505 1 4052.19 4999.52 5403.34 5624.62 5763.65 5858.97 5928.33
88 1.645 1.622 1.601 1.583 1.566 1.551 1.538 5981.10 6022.50 6055.85
1.526 1.515 1.504 2 98.502 99.000 99.166 99.249 99.300 99.333 99.356
89 1.644 1.621 1.600 1.582 1.565 1.550 1.537 99.374 99.388 99.399
1.525 1.514 1.503 3 34.116 30.816 29.457 28.710 28.237 27.911 27.672


27.489 27.345 27.229 3.288 3.182 3.094


4 21.198 18.000 16.694 15.977 15.522 15.207 14.976 27 7.677 5.488 4.601 4.106 3.785 3.558 3.388
14.799 14.659 14.546 3.256 3.149 3.062
5 16.258 13.274 12.060 11.392 10.967 10.672 10.456 28 7.636 5.453 4.568 4.074 3.754 3.528 3.358
10.289 10.158 10.051 3.226 3.120 3.032
6 13.745 10.925 9.780 9.148 8.746 8.466 8.260 29 7.598 5.420 4.538 4.045 3.725 3.499 3.330
8.102 7.976 7.874 3.198 3.092 3.005
7 12.246 9.547 8.451 7.847 7.460 7.191 6.993 30 7.562 5.390 4.510 4.018 3.699 3.473 3.305
6.840 6.719 6.620 3.173 3.067 2.979
8 11.259 8.649 7.591 7.006 6.632 6.371 6.178 31 7.530 5.362 4.484 3.993 3.675 3.449 3.281
6.029 5.911 5.814 3.149 3.043 2.955
9 10.561 8.022 6.992 6.422 6.057 5.802 5.613 32 7.499 5.336 4.459 3.969 3.652 3.427 3.258
5.467 5.351 5.257 3.127 3.021 2.934
10 10.044 7.559 6.552 5.994 5.636 5.386 5.200 33 7.471 5.312 4.437 3.948 3.630 3.406 3.238
5.057 4.942 4.849 3.106 3.000 2.913
11 9.646 7.206 6.217 5.668 5.316 5.069 4.886 34 7.444 5.289 4.416 3.927 3.611 3.386 3.218
4.744 4.632 4.539 3.087 2.981 2.894
12 9.330 6.927 5.953 5.412 5.064 4.821 4.640 35 7.419 5.268 4.396 3.908 3.592 3.368 3.200
4.499 4.388 4.296 3.069 2.963 2.876
13 9.074 6.701 5.739 5.205 4.862 4.620 4.441 36 7.396 5.248 4.377 3.890 3.574 3.351 3.183
4.302 4.191 4.100 3.052 2.946 2.859
14 8.862 6.515 5.564 5.035 4.695 4.456 4.278 37 7.373 5.229 4.360 3.873 3.558 3.334 3.167
4.140 4.030 3.939 3.036 2.930 2.843
15 8.683 6.359 5.417 4.893 4.556 4.318 4.142 38 7.353 5.211 4.343 3.858 3.542 3.319 3.152
4.004 3.895 3.805 3.021 2.915 2.828
16 8.531 6.226 5.292 4.773 4.437 4.202 4.026 39 7.333 5.194 4.327 3.843 3.528 3.305 3.137
3.890 3.780 3.691 3.006 2.901 2.814
17 8.400 6.112 5.185 4.669 4.336 4.102 3.927 40 7.314 5.179 4.313 3.828 3.514 3.291 3.124
3.791 3.682 3.593 2.993 2.888 2.801
18 8.285 6.013 5.092 4.579 4.248 4.015 3.841 41 7.296 5.163 4.299 3.815 3.501 3.278 3.111
3.705 3.597 3.508 2.980 2.875 2.788
19 8.185 5.926 5.010 4.500 4.171 3.939 3.765 42 7.280 5.149 4.285 3.802 3.488 3.266 3.099
3.631 3.523 3.434 2.968 2.863 2.776
20 8.096 5.849 4.938 4.431 4.103 3.871 3.699 43 7.264 5.136 4.273 3.790 3.476 3.254 3.087
3.564 3.457 3.368 2.957 2.851 2.764
21 8.017 5.780 4.874 4.369 4.042 3.812 3.640 44 7.248 5.123 4.261 3.778 3.465 3.243 3.076
3.506 3.398 3.310 2.946 2.840 2.754
22 7.945 5.719 4.817 4.313 3.988 3.758 3.587 45 7.234 5.110 4.249 3.767 3.454 3.232 3.066
3.453 3.346 3.258 2.935 2.830 2.743
23 7.881 5.664 4.765 4.264 3.939 3.710 3.539 46 7.220 5.099 4.238 3.757 3.444 3.222 3.056
3.406 3.299 3.211 2.925 2.820 2.733
24 7.823 5.614 4.718 4.218 3.895 3.667 3.496 47 7.207 5.087 4.228 3.747 3.434 3.213 3.046
3.363 3.256 3.168 2.916 2.811 2.724
25 7.770 5.568 4.675 4.177 3.855 3.627 3.457 48 7.194 5.077 4.218 3.737 3.425 3.204 3.037
3.324 3.217 3.129 2.907 2.802 2.715
26 7.721 5.526 4.637 4.140 3.818 3.591 3.421 49 7.182 5.066 4.208 3.728 3.416 3.195 3.028


2.898 2.793 2.706 2.769 2.664 2.578


50 7.171 5.057 4.199 3.720 3.408 3.186 3.020 73 6.995 4.908 4.062 3.588 3.279 3.060 2.895
2.890 2.785 2.698 2.765 2.660 2.574
51 7.159 5.047 4.191 3.711 3.400 3.178 3.012 74 6.990 4.904 4.058 3.584 3.275 3.056 2.891
2.882 2.777 2.690 2.762 2.657 2.570
52 7.149 5.038 4.182 3.703 3.392 3.171 3.005 75 6.985 4.900 4.054 3.580 3.272 3.052 2.887
2.874 2.769 2.683 2.758 2.653 2.567
53 7.139 5.030 4.174 3.695 3.384 3.163 2.997 76 6.981 4.896 4.050 3.577 3.268 3.049 2.884
2.867 2.762 2.675 2.755 2.650 2.563
54 7.129 5.021 4.167 3.688 3.377 3.156 2.990 77 6.976 4.892 4.047 3.573 3.265 3.046 2.881
2.860 2.755 2.668 2.751 2.647 2.560
55 7.119 5.013 4.159 3.681 3.370 3.149 2.983 78 6.971 4.888 4.043 3.570 3.261 3.042 2.877
2.853 2.748 2.662 2.748 2.644 2.557
56 7.110 5.006 4.152 3.674 3.363 3.143 2.977 79 6.967 4.884 4.040 3.566 3.258 3.039 2.874
2.847 2.742 2.655 2.745 2.640 2.554
57 7.102 4.998 4.145 3.667 3.357 3.136 2.971 80 6.963 4.881 4.036 3.563 3.255 3.036 2.871
2.841 2.736 2.649 2.742 2.637 2.551
58 7.093 4.991 4.138 3.661 3.351 3.130 2.965 81 6.958 4.877 4.033 3.560 3.252 3.033 2.868
2.835 2.730 2.643 2.739 2.634 2.548
59 7.085 4.984 4.132 3.655 3.345 3.124 2.959 82 6.954 4.874 4.030 3.557 3.249 3.030 2.865
2.829 2.724 2.637 2.736 2.632 2.545
60 7.077 4.977 4.126 3.649 3.339 3.119 2.953 83 6.950 4.870 4.027 3.554 3.246 3.027 2.863
2.823 2.718 2.632 2.733 2.629 2.542
61 7.070 4.971 4.120 3.643 3.333 3.113 2.948 84 6.947 4.867 4.024 3.551 3.243 3.025 2.860
2.818 2.713 2.626 2.731 2.626 2.539
62 7.062 4.965 4.114 3.638 3.328 3.108 2.942 85 6.943 4.864 4.021 3.548 3.240 3.022 2.857
2.813 2.708 2.621 2.728 2.623 2.537
63 7.055 4.959 4.109 3.632 3.323 3.103 2.937 86 6.939 4.861 4.018 3.545 3.238 3.019 2.854
2.808 2.703 2.616 2.725 2.621 2.534
64 7.048 4.953 4.103 3.627 3.318 3.098 2.932 87 6.935 4.858 4.015 3.543 3.235 3.017 2.852
2.803 2.698 2.611 2.723 2.618 2.532
65 7.042 4.947 4.098 3.622 3.313 3.093 2.928 88 6.932 4.855 4.012 3.540 3.233 3.014 2.849
2.798 2.693 2.607 2.720 2.616 2.529
66 7.035 4.942 4.093 3.618 3.308 3.088 2.923 89 6.928 4.852 4.010 3.538 3.230 3.012 2.847
2.793 2.689 2.602 2.718 2.613 2.527
67 7.029 4.937 4.088 3.613 3.304 3.084 2.919 90 6.925 4.849 4.007 3.535 3.228 3.009 2.845
2.789 2.684 2.598 2.715 2.611 2.524
68 7.023 4.932 4.083 3.608 3.299 3.080 2.914 91 6.922 4.846 4.004 3.533 3.225 3.007 2.842
2.785 2.680 2.593 2.713 2.609 2.522
69 7.017 4.927 4.079 3.604 3.295 3.075 2.910 92 6.919 4.844 4.002 3.530 3.223 3.004 2.840
2.781 2.676 2.589 2.711 2.606 2.520
70 7.011 4.922 4.074 3.600 3.291 3.071 2.906 93 6.915 4.841 3.999 3.528 3.221 3.002 2.838
2.777 2.672 2.585 2.709 2.604 2.518
71 7.006 4.917 4.070 3.596 3.287 3.067 2.902 94 6.912 4.838 3.997 3.525 3.218 3.000 2.835
2.773 2.668 2.581 2.706 2.602 2.515
72 7.001 4.913 4.066 3.591 3.283 3.063 2.898 95 6.909 4.836 3.995 3.523 3.216 2.998 2.833


2.704 2.600 2.513 3.556 3.529 3.505


96 6.906 4.833 3.992 3.521 3.214 2.996 2.831 15. 3.730 3.666 3.612 3.564 3.522 3.485 3.452
2.702 2.598 2.511 3.423 3.396 3.372
97 6.904 4.831 3.990 3.519 3.212 2.994 2.829 16. 3.616 3.553 3.498 3.451 3.409 3.372 3.339
2.700 2.596 2.509 3.310 3.283 3.259
98 6.901 4.829 3.988 3.517 3.210 2.992 2.827 17. 3.519 3.455 3.401 3.353 3.312 3.275 3.242
2.698 2.594 2.507 3.212 3.186 3.162
99 6.898 4.826 3.986 3.515 3.208 2.990 2.825 18. 3.434 3.371 3.316 3.269 3.227 3.190 3.158
2.696 2.592 2.505 3.128 3.101 3.077
100 6.895 4.824 3.984 3.513 3.206 2.988 2.823 19. 3.360 3.297 3.242 3.195 3.153 3.116 3.084
2.694 2.590 2.503 3.054 3.027 3.003
20. 3.294 3.231 3.177 3.130 3.088 3.051 3.018
2.989 2.962 2.938
\ 11 12 13 14 15 16 17 18 21. 3.236 3.173 3.119 3.072 3.030 2.993 2.960
19 20 2.931 2.904 2.880
22. 3.184 3.121 3.067 3.019 2.978 2.941 2.908
2.879 2.852 2.827
23. 3.137 3.074 3.020 2.973 2.931 2.894 2.861
2.832 2.805 2.781
1. 6083.35 6106.35 6125.86 6142.70 6157.28 6170.12 6181.42 24. 3.094 3.032 2.977 2.930 2.889 2.852 2.819
6191.52 6200.58 6208.74 2.789 2.762 2.738
2. 99.408 99.416 99.422 99.428 99.432 99.437 99.440 25. 3.056 2.993 2.939 2.892 2.850 2.813 2.780
99.444 99.447 99.449 2.751 2.724 2.699
3. 27.133 27.052 26.983 26.924 26.872 26.827 26.787 26. 3.021 2.958 2.904 2.857 2.815 2.778 2.745
26.751 26.719 26.690 2.715 2.688 2.664
4. 14.452 14.374 14.307 14.249 14.198 14.154 14.115 27. 2.988 2.926 2.871 2.824 2.783 2.746 2.713
14.080 14.048 14.020 2.683 2.656 2.632
5. 9.963 9.888 9.825 9.770 9.722 9.680 9.643 28. 2.959 2.896 2.842 2.795 2.753 2.716 2.683
9.610 9.580 9.553 2.653 2.626 2.602
6. 7.790 7.718 7.657 7.605 7.559 7.519 7.483 29. 2.931 2.868 2.814 2.767 2.726 2.689 2.656
7.451 7.422 7.396 2.626 2.599 2.574
7. 6.538 6.469 6.410 6.359 6.314 6.275 6.240 30. 2.906 2.843 2.789 2.742 2.700 2.663 2.630
6.209 6.181 6.155 2.600 2.573 2.549
8. 5.734 5.667 5.609 5.559 5.515 5.477 5.442 31. 2.882 2.820 2.765 2.718 2.677 2.640 2.606
5.412 5.384 5.359 2.577 2.550 2.525
9. 5.178 5.111 5.055 5.005 4.962 4.924 4.890 32. 2.860 2.798 2.744 2.696 2.655 2.618 2.584
4.860 4.833 4.808 2.555 2.527 2.503
10. 4.772 4.706 4.650 4.601 4.558 4.520 4.487 33. 2.840 2.777 2.723 2.676 2.634 2.597 2.564
4.457 4.430 4.405 2.534 2.507 2.482
11. 4.462 4.397 4.342 4.293 4.251 4.213 4.180 34. 2.821 2.758 2.704 2.657 2.615 2.578 2.545
4.150 4.123 4.099 2.515 2.488 2.463
12. 4.220 4.155 4.100 4.052 4.010 3.972 3.939 35. 2.803 2.740 2.686 2.639 2.597 2.560 2.527
3.909 3.883 3.858 2.497 2.470 2.445
13. 4.025 3.960 3.905 3.857 3.815 3.778 3.745 36. 2.786 2.723 2.669 2.622 2.580 2.543 2.510
3.716 3.689 3.665 2.480 2.453 2.428
14. 3.864 3.800 3.745 3.698 3.656 3.619 3.586 37. 2.770 2.707 2.653 2.606 2.564 2.527 2.494


2.464 2.437 2.412 2.251 2.223 2.198


38. 2.755 2.692 2.638 2.591 2.549 2.512 2.479 61. 2.553 2.491 2.436 2.389 2.347 2.309 2.276
2.449 2.421 2.397 2.245 2.218 2.192
39. 2.741 2.678 2.624 2.577 2.535 2.498 2.465 62. 2.548 2.486 2.431 2.384 2.342 2.304 2.270
2.434 2.407 2.382 2.240 2.212 2.187
40. 2.727 2.665 2.611 2.563 2.522 2.484 2.451 63. 2.543 2.481 2.426 2.379 2.337 2.299 2.265
2.421 2.394 2.369 2.235 2.207 2.182
41. 2.715 2.652 2.598 2.551 2.509 2.472 2.438 64. 2.538 2.476 2.421 2.374 2.332 2.294 2.260
2.408 2.381 2.356 2.230 2.202 2.177
42. 2.703 2.640 2.586 2.539 2.497 2.460 2.426 65. 2.534 2.471 2.417 2.369 2.327 2.289 2.256
2.396 2.369 2.344 2.225 2.198 2.172
43. 2.691 2.629 2.575 2.527 2.485 2.448 2.415 66. 2.529 2.466 2.412 2.365 2.322 2.285 2.251
2.385 2.357 2.332 2.221 2.193 2.168
44. 2.680 2.618 2.564 2.516 2.475 2.437 2.404 67. 2.525 2.462 2.408 2.360 2.318 2.280 2.247
2.374 2.346 2.321 2.216 2.188 2.163
45. 2.670 2.608 2.553 2.506 2.464 2.427 2.393 68. 2.520 2.458 2.403 2.356 2.314 2.276 2.242
2.363 2.336 2.311 2.212 2.184 2.159
46. 2.660 2.598 2.544 2.496 2.454 2.417 2.384 69. 2.516 2.454 2.399 2.352 2.310 2.272 2.238
2.353 2.326 2.301 2.208 2.180 2.155
47. 2.651 2.588 2.534 2.487 2.445 2.408 2.374 70. 2.512 2.450 2.395 2.348 2.306 2.268 2.234
2.344 2.316 2.291 2.204 2.176 2.150
48. 2.642 2.579 2.525 2.478 2.436 2.399 2.365 71. 2.508 2.446 2.391 2.344 2.302 2.264 2.230
2.335 2.307 2.282 2.200 2.172 2.146
49. 2.633 2.571 2.517 2.469 2.427 2.390 2.356 72. 2.504 2.442 2.388 2.340 2.298 2.260 2.226
2.326 2.299 2.274 2.196 2.168 2.143
50. 2.625 2.562 2.508 2.461 2.419 2.382 2.348 73. 2.501 2.438 2.384 2.336 2.294 2.256 2.223
2.318 2.290 2.265 2.192 2.164 2.139
51. 2.617 2.555 2.500 2.453 2.411 2.374 2.340 74. 2.497 2.435 2.380 2.333 2.290 2.253 2.219
2.310 2.282 2.257 2.188 2.161 2.135
52. 2.610 2.547 2.493 2.445 2.403 2.366 2.333 75. 2.494 2.431 2.377 2.329 2.287 2.249 2.215
2.302 2.275 2.250 2.185 2.157 2.132
53. 2.602 2.540 2.486 2.438 2.396 2.359 2.325 76. 2.490 2.428 2.373 2.326 2.284 2.246 2.212
2.295 2.267 2.242 2.181 2.154 2.128
54. 2.595 2.533 2.479 2.431 2.389 2.352 2.318 77. 2.487 2.424 2.370 2.322 2.280 2.243 2.209
2.288 2.260 2.235 2.178 2.150 2.125
55. 2.589 2.526 2.472 2.424 2.382 2.345 2.311 78. 2.484 2.421 2.367 2.319 2.277 2.239 2.206
2.281 2.253 2.228 2.175 2.147 2.122
56. 2.582 2.520 2.465 2.418 2.376 2.339 2.305 79. 2.481 2.418 2.364 2.316 2.274 2.236 2.202
2.275 2.247 2.222 2.172 2.144 2.118
57. 2.576 2.513 2.459 2.412 2.370 2.332 2.299 80. 2.478 2.415 2.361 2.313 2.271 2.233 2.199
2.268 2.241 2.215 2.169 2.141 2.115
58. 2.570 2.507 2.453 2.406 2.364 2.326 2.293 81. 2.475 2.412 2.358 2.310 2.268 2.230 2.196
2.262 2.235 2.209 2.166 2.138 2.112
59. 2.564 2.502 2.447 2.400 2.358 2.320 2.287 82. 2.472 2.409 2.355 2.307 2.265 2.227 2.193
2.256 2.229 2.203 2.163 2.135 2.109
60. 2.559 2.496 2.442 2.394 2.352 2.315 2.281 83. 2.469 2.406 2.352 2.304 2.262 2.224 2.191


2.160 2.132 2.106
84. 2.466 2.404 2.349 2.302 2.259 2.222 2.188
2.157 2.129 2.104
85. 2.464 2.401 2.347 2.299 2.257 2.219 2.185
2.154 2.126 2.101
86. 2.461 2.398 2.344 2.296 2.254 2.216 2.182
2.152 2.124 2.098
87. 2.459 2.396 2.342 2.294 2.252 2.214 2.180
2.149 2.121 2.096
88. 2.456 2.393 2.339 2.291 2.249 2.211 2.177
2.147 2.119 2.093
89. 2.454 2.391 2.337 2.289 2.247 2.209 2.175
2.144 2.116 2.091
90. 2.451 2.389 2.334 2.286 2.244 2.206 2.172
2.142 2.114 2.088
91. 2.449 2.386 2.332 2.284 2.242 2.204 2.170
2.139 2.111 2.086
92. 2.447 2.384 2.330 2.282 2.240 2.202 2.168
2.137 2.109 2.083
93. 2.444 2.382 2.327 2.280 2.237 2.200 2.166
2.135 2.107 2.081
94. 2.442 2.380 2.325 2.277 2.235 2.197 2.163
2.133 2.105 2.079
95. 2.440 2.378 2.323 2.275 2.233 2.195 2.161
2.130 2.102 2.077
96. 2.438 2.375 2.321 2.273 2.231 2.193 2.159
2.128 2.100 2.075
97. 2.436 2.373 2.319 2.271 2.229 2.191 2.157
2.126 2.098 2.073
98. 2.434 2.371 2.317 2.269 2.227 2.189 2.155
2.124 2.096 2.071
99. 2.432 2.369 2.315 2.267 2.225 2.187 2.153
2.122 2.094 2.069
100. 2.430 2.368 2.313 2.265 2.223 2.185 2.151
2.120 2.092 2.067


1.3.6.7.4. Critical Values of the Chi-Square Distribution

How to Use This Table: This table contains the critical values of the chi-square distribution.
Because of the lack of symmetry of the chi-square distribution, separate
tables are provided for the upper and lower tails of the distribution.
A test statistic with ν degrees of freedom is computed from the data. For
upper one-sided tests, the test statistic is compared with a value from the
table of upper critical values. For two-sided tests, the test statistic is
compared with values from both the table for the upper critical value
and the table for the lower critical value.
The significance level, α, is illustrated with a graph (not reproduced in this
text version) showing a chi-square distribution with 3 degrees of freedom for a
two-sided test at significance level α = 0.05. If the test statistic is
greater than the upper critical value or less than the lower critical value,
we reject the null hypothesis. Specific instructions are given below.

Given a specified value for α:

1. For a two-sided test, find the column corresponding to α/2 in the
table for upper critical values and reject the null hypothesis if the
test statistic is greater than the tabled value. Similarly, find the
column corresponding to 1 - α/2 in the table for lower critical
values and reject the null hypothesis if the test statistic is less than
the tabled value.
2. For an upper one-sided test, find the column corresponding to α
in the upper critical values table and reject the null hypothesis if
the test statistic is greater than the tabled value.
3. For a lower one-sided test, find the column corresponding to 1 - α
in the lower critical values table and reject the null hypothesis
if the computed test statistic is less than the tabled value.

22 30.813 33.924 36.781 40.289 48.268
23 32.007 35.172 38.076 41.638 49.728
24 33.196 36.415 39.364 42.980 51.179
25 34.382 37.652 40.646 44.314 52.620
26 35.563 38.885 41.923 45.642 54.052
27 36.741 40.113 43.195 46.963 55.476
28 37.916 41.337 44.461 48.278 56.892
29 39.087 42.557 45.722 49.588 58.301
30 40.256 43.773 46.979 50.892 59.703
31 41.422 44.985 48.232 52.191 61.098
32 42.585 46.194 49.480 53.486 62.487
33 43.745 47.400 50.725 54.776 63.870
34 44.903 48.602 51.966 56.061 65.247
35 46.059 49.802 53.203 57.342 66.619
36 47.212 50.998 54.437 58.619 67.985
37 48.363 52.192 55.668 59.893 69.347
38 49.513 53.384 56.896 61.162 70.703
39 50.660 54.572 58.120 62.428 72.055
40 51.805 55.758 59.342 63.691 73.402
41 52.949 56.942 60.561 64.950 74.745

Upper critical values of chi-square distribution with ν degrees of freedom

Probability of exceeding the critical value
0.10 0.05 0.025 0.01 0.001

1 2.706 3.841 5.024 6.635 10.828 42 54.090 58.124 61.777 66.206 76.084
2 4.605 5.991 7.378 9.210 13.816 43 55.230 59.304 62.990 67.459 77.419
3 6.251 7.815 9.348 11.345 16.266 44 56.369 60.481 64.201 68.710 78.750
4 7.779 9.488 11.143 13.277 18.467 45 57.505 61.656 65.410 69.957 80.077
5 9.236 11.070 12.833 15.086 20.515 46 58.641 62.830 66.617 71.201 81.400
6 10.645 12.592 14.449 16.812 22.458 47 59.774 64.001 67.821 72.443 82.720
7 12.017 14.067 16.013 18.475 24.322 48 60.907 65.171 69.023 73.683 84.037
8 13.362 15.507 17.535 20.090 26.125 49 62.038 66.339 70.222 74.919 85.351
9 14.684 16.919 19.023 21.666 27.877 50 63.167 67.505 71.420 76.154 86.661
10 15.987 18.307 20.483 23.209 29.588 51 64.295 68.669 72.616 77.386 87.968
11 17.275 19.675 21.920 24.725 31.264 52 65.422 69.832 73.810 78.616 89.272
12 18.549 21.026 23.337 26.217 32.910 53 66.548 70.993 75.002 79.843 90.573
13 19.812 22.362 24.736 27.688 34.528 54 67.673 72.153 76.192 81.069 91.872
14 21.064 23.685 26.119 29.141 36.123 55 68.796 73.311 77.380 82.292 93.168
15 22.307 24.996 27.488 30.578 37.697 56 69.919 74.468 78.567 83.513 94.461
16 23.542 26.296 28.845 32.000 39.252 57 71.040 75.624 79.752 84.733 95.751
17 24.769 27.587 30.191 33.409 40.790 58 72.160 76.778 80.936 85.950 97.039
18 25.989 28.869 31.526 34.805 42.312 59 73.279 77.931 82.117 87.166 98.324
19 27.204 30.144 32.852 36.191 43.820 60 74.397 79.082 83.298 88.379 99.607
20 28.412 31.410 34.170 37.566 45.315 61 75.514 80.232 84.476 89.591 100.888
21 29.615 32.671 35.479 38.932 46.797 62 76.630 81.381 85.654 90.802 102.166


63 77.745 82.529 86.830 92.010 103.442


64 78.860 83.675 88.004 93.217 104.716
65 79.973 84.821 89.177 94.422 105.988
Lower critical values of chi-square distribution with ν degrees of freedom
66 81.085 85.965 90.349 95.626 107.258
67 82.197 87.108 91.519 96.828 108.526
68 83.308 88.250 92.689 98.028 109.791
69 84.418 89.391 93.856 99.228 111.055
70 85.527 90.531 95.023 100.425 112.317

Probability of exceeding the critical value
0.90 0.95 0.975 0.99 0.999
71 86.635 91.670 96.189 101.621 113.577
72 87.743 92.808 97.353 102.816 114.835
73 88.850 93.945 98.516 104.010 116.092
74 89.956 95.081 99.678 105.202 117.346 1. .016 .004 .001 .000 .000
75 91.061 96.217 100.839 106.393 118.599 2. .211 .103 .051 .020 .002
76 92.166 97.351 101.999 107.583 119.850 3. .584 .352 .216 .115 .024
77 93.270 98.484 103.158 108.771 121.100 4. 1.064 .711 .484 .297 .091
78 94.374 99.617 104.316 109.958 122.348 5. 1.610 1.145 .831 .554 .210
79 95.476 100.749 105.473 111.144 123.594 6. 2.204 1.635 1.237 .872 .381
80 96.578 101.879 106.629 112.329 124.839 7. 2.833 2.167 1.690 1.239 .598
81 97.680 103.010 107.783 113.512 126.083 8. 3.490 2.733 2.180 1.646 .857
82 98.780 104.139 108.937 114.695 127.324 9. 4.168 3.325 2.700 2.088 1.152
83 99.880 105.267 110.090 115.876 128.565 10. 4.865 3.940 3.247 2.558 1.479
84 100.980 106.395 111.242 117.057 129.804 11. 5.578 4.575 3.816 3.053 1.834
85 102.079 107.522 112.393 118.236 131.041 12. 6.304 5.226 4.404 3.571 2.214
86 103.177 108.648 113.544 119.414 132.277 13. 7.042 5.892 5.009 4.107 2.617
87 104.275 109.773 114.693 120.591 133.512 14. 7.790 6.571 5.629 4.660 3.041
88 105.372 110.898 115.841 121.767 134.746 15. 8.547 7.261 6.262 5.229 3.483
89 106.469 112.022 116.989 122.942 135.978 16. 9.312 7.962 6.908 5.812 3.942
90 107.565 113.145 118.136 124.116 137.208 17. 10.085 8.672 7.564 6.408 4.416
91 108.661 114.268 119.282 125.289 138.438 18. 10.865 9.390 8.231 7.015 4.905
92 109.756 115.390 120.427 126.462 139.666 19. 11.651 10.117 8.907 7.633 5.407
93 110.850 116.511 121.571 127.633 140.893 20. 12.443 10.851 9.591 8.260 5.921
94 111.944 117.632 122.715 128.803 142.119 21. 13.240 11.591 10.283 8.897 6.447
95 113.038 118.752 123.858 129.973 143.344 22. 14.041 12.338 10.982 9.542 6.983
96 114.131 119.871 125.000 131.141 144.567 23. 14.848 13.091 11.689 10.196 7.529
97 115.223 120.990 126.141 132.309 145.789 24. 15.659 13.848 12.401 10.856 8.085
98 116.315 122.108 127.282 133.476 147.010 25. 16.473 14.611 13.120 11.524 8.649
99 117.407 123.225 128.422 134.642 148.230 26. 17.292 15.379 13.844 12.198 9.222
100 118.498 124.342 129.561 135.807 149.449 27. 18.114 16.151 14.573 12.879 9.803
28. 18.939 16.928 15.308 13.565 10.391
29. 19.768 17.708 16.047 14.256 10.986
30. 20.599 18.493 16.791 14.953 11.588


31. 21.434 19.281 17.539 15.655 12.196 72. 57.113 53.462 50.428 47.051 40.519
32. 22.271 20.072 18.291 16.362 12.811 73. 58.006 54.325 51.265 47.858 41.264
33. 23.110 20.867 19.047 17.074 13.431 74. 58.900 55.189 52.103 48.666 42.010
34. 23.952 21.664 19.806 17.789 14.057 75. 59.795 56.054 52.942 49.475 42.757
35. 24.797 22.465 20.569 18.509 14.688 76. 60.690 56.920 53.782 50.286 43.507
36. 25.643 23.269 21.336 19.233 15.324 77. 61.586 57.786 54.623 51.097 44.258
37. 26.492 24.075 22.106 19.960 15.965 78. 62.483 58.654 55.466 51.910 45.010
38. 27.343 24.884 22.878 20.691 16.611 79. 63.380 59.522 56.309 52.725 45.764
39. 28.196 25.695 23.654 21.426 17.262 80. 64.278 60.391 57.153 53.540 46.520
40. 29.051 26.509 24.433 22.164 17.916 81. 65.176 61.261 57.998 54.357 47.277
41. 29.907 27.326 25.215 22.906 18.575 82. 66.076 62.132 58.845 55.174 48.036
42. 30.765 28.144 25.999 23.650 19.239 83. 66.976 63.004 59.692 55.993 48.796
43. 31.625 28.965 26.785 24.398 19.906 84. 67.876 63.876 60.540 56.813 49.557
44. 32.487 29.787 27.575 25.148 20.576 85. 68.777 64.749 61.389 57.634 50.320
45. 33.350 30.612 28.366 25.901 21.251 86. 69.679 65.623 62.239 58.456 51.085
46. 34.215 31.439 29.160 26.657 21.929 87. 70.581 66.498 63.089 59.279 51.850
47. 35.081 32.268 29.956 27.416 22.610 88. 71.484 67.373 63.941 60.103 52.617
48. 35.949 33.098 30.755 28.177 23.295 89. 72.387 68.249 64.793 60.928 53.386
49. 36.818 33.930 31.555 28.941 23.983 90. 73.291 69.126 65.647 61.754 54.155
50. 37.689 34.764 32.357 29.707 24.674 91. 74.196 70.003 66.501 62.581 54.926
51. 38.560 35.600 33.162 30.475 25.368 92. 75.100 70.882 67.356 63.409 55.698
52. 39.433 36.437 33.968 31.246 26.065 93. 76.006 71.760 68.211 64.238 56.472
53. 40.308 37.276 34.776 32.018 26.765 94. 76.912 72.640 69.068 65.068 57.246
54. 41.183 38.116 35.586 32.793 27.468 95. 77.818 73.520 69.925 65.898 58.022
55. 42.060 38.958 36.398 33.570 28.173 96. 78.725 74.401 70.783 66.730 58.799
56. 42.937 39.801 37.212 34.350 28.881 97. 79.633 75.282 71.642 67.562 59.577
57. 43.816 40.646 38.027 35.131 29.592 98. 80.541 76.164 72.501 68.396 60.356
58. 44.696 41.492 38.844 35.913 30.305 99. 81.449 77.046 73.361 69.230 61.137
59. 45.577 42.339 39.662 36.698 31.020 100. 82.358 77.929 74.222 70.065 61.918
60. 46.459 43.188 40.482 37.485 31.738
61. 47.342 44.038 41.303 38.273 32.459
62. 48.226 44.889 42.126 39.063 33.181
63. 49.111 45.741 42.950 39.855 33.906
64. 49.996 46.595 43.776 40.649 34.633
65. 50.883 47.450 44.603 41.444 35.362
66. 51.770 48.305 45.431 42.240 36.093
67. 52.659 49.162 46.261 43.038 36.826
68. 53.548 50.020 47.092 43.838 37.561
69. 54.438 50.879 47.924 44.639 38.298
70. 55.329 51.739 48.758 45.442 39.036
71. 56.221 52.600 49.592 46.246 39.777
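The tabled chi-square values above (and the F distribution critical values in the preceding section) can be checked with any statistical library that provides distribution quantile functions. The following is a minimal Python sketch using scipy.stats; scipy is an assumption of this note (the handbook itself uses Dataplot), and the degrees of freedom and significance level are just example inputs.

    from scipy import stats

    nu = 3          # degrees of freedom
    alpha = 0.05    # significance level for a two-sided test

    # Upper and lower critical values for a two-sided chi-square test.
    upper = stats.chi2.ppf(1 - alpha / 2, nu)   # column alpha/2 of the upper-tail table
    lower = stats.chi2.ppf(alpha / 2, nu)       # column 1 - alpha/2 of the lower-tail table
    print(upper, lower)                         # approximately 9.348 and 0.216, matching the tables above

    # Upper critical value of the F distribution at the 10% significance level,
    # with 5 numerator and 10 denominator degrees of freedom.
    print(stats.f.ppf(0.90, dfn=5, dfd=10))

A test statistic larger than the upper value or smaller than the lower value would lead to rejection of the null hypothesis, exactly as described in the instructions above.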


1.3.6.7.5. Critical Values of the t* Distribution

How to Use This Table: This table contains upper critical values of the t* distribution that are
appropriate for determining whether or not a calibration line is in a state
of statistical control from measurements on a check standard at three
points in the calibration interval. A test statistic with ν degrees of
freedom is compared with the critical value. If the absolute value of the
test statistic exceeds the tabled value, the calibration of the instrument is
judged to be out of control.

Upper critical values of t* distribution at significance level 0.05
for testing the output of a linear calibration line at 3 points

  ν     t*       ν     t*       ν     t*       ν     t*
  1   37.544    31   2.524     61   2.455     91   2.432
  2    7.582    32   2.519     62   2.454     92   2.432
  3    4.826    33   2.515     63   2.453     93   2.431
  4    3.941    34   2.511     64   2.452     94   2.431
  5    3.518    35   2.507     65   2.451     95   2.431
  6    3.274    36   2.504     66   2.450     96   2.430
  7    3.115    37   2.501     67   2.449     97   2.430
  8    3.004    38   2.498     68   2.448     98   2.429
  9    2.923    39   2.495     69   2.447     99   2.429
 10    2.860    40   2.492     70   2.446    100   2.428
 11    2.811    41   2.489     71   2.445    101   2.428
 12    2.770    42   2.487     72   2.445    102   2.428
 13    2.737    43   2.484     73   2.444    103   2.427
 14    2.709    44   2.482     74   2.443    104   2.427
 15    2.685    45   2.480     75   2.442    105   2.426
 16    2.665    46   2.478     76   2.441    106   2.426
 17    2.647    47   2.476     77   2.441    107   2.426
 18    2.631    48   2.474     78   2.440    108   2.425
 19    2.617    49   2.472     79   2.439    109   2.425
 20    2.605    50   2.470     80   2.439    110   2.425
 21    2.594    51   2.469     81   2.438    111   2.424
 22    2.584    52   2.467     82   2.437    112   2.424
 23    2.574    53   2.466     83   2.437    113   2.424
 24    2.566    54   2.464     84   2.436    114   2.423
 25    2.558    55   2.463     85   2.436    115   2.423
 26    2.551    56   2.461     86   2.435    116   2.423
 27    2.545    57   2.460     87   2.435    117   2.422
 28    2.539    58   2.459     88   2.434    118   2.422
 29    2.534    59   2.457     89   2.434    119   2.422
 30    2.528    60   2.456     90   2.433    120   2.422


1.3.6.7.6. Critical Values of the Normal PPCC Distribution

How to Use This Table: This table contains the critical values of the normal probability plot
correlation coefficient (PPCC) distribution that are appropriate for
determining whether or not a data set came from a population with
approximately a normal distribution. It is used in conjunction with a
normal probability plot. The test statistic is the correlation coefficient of
the points that make up a normal probability plot. This test statistic is
compared with the critical value below. If the test statistic is less than
the tabulated value, the null hypothesis that the data came from a
population with a normal distribution is rejected.

For example, suppose a set of 50 data points had a correlation
coefficient of 0.985 from the normal probability plot. At the 5%
significance level, the critical value is 0.9761. Since 0.985 is greater than
0.9761, we cannot reject the null hypothesis that the data came from a
population with a normal distribution.

Since perfect normality implies perfect correlation (i.e., a correlation
value of 1), we are only interested in rejecting normality for correlation
values that are too low. That is, this is a lower one-tailed test.

The values in this table were determined from simulation studies by
Filliben and Devaney.
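As a hedged illustration, the PPCC statistic itself is straightforward to compute with scipy (an assumption of this note; the handbook uses Dataplot). scipy.stats.probplot returns the correlation of the ordered data against the normal order-statistic medians, which is the normal probability plot correlation coefficient described above; the sample below and its size are arbitrary, and the critical value is read from the table that follows.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    x = rng.normal(size=50)                      # sample to be tested for normality

    # Correlation coefficient of the normal probability plot (the PPCC statistic).
    (osm, osr), (slope, intercept, r) = stats.probplot(x, dist="norm")

    critical_value = 0.9761                      # tabled value for N = 50 at the 5% level (table below)
    print(f"normal PPCC = {r:.4f}")
    if r < critical_value:
        print("reject normality (lower one-tailed test)")
    else:
        print("cannot reject the hypothesis of normality")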


Critical values of the normal PPCC for testing if data come from
a normal distribution

N 0.01 0.05 N 0.01 0.05

40 0.9576 0.9712
41 0.9589 0.9719
42 0.9593 0.9723
43 0.9609 0.9730
44 0.9611 0.9734
45 0.9620 0.9739
46 0.9629 0.9744
47 0.9637 0.9748
48 0.9640 0.9753
3 0.8687 0.8790 49 0.9643 0.9758
4 0.8234 0.8666 50 0.9654 0.9761
5 0.8240 0.8786 55 0.9683 0.9781
6 0.8351 0.8880 60 0.9706 0.9797
7 0.8474 0.8970 65 0.9723 0.9809
8 0.8590 0.9043 70 0.9742 0.9822
9 0.8689 0.9115 75 0.9758 0.9831
10 0.8765 0.9173 80 0.9771 0.9841
11 0.8838 0.9223 85 0.9784 0.9850
12 0.8918 0.9267 90 0.9797 0.9857
13 0.8974 0.9310 95 0.9804 0.9864
14 0.9029 0.9343 100 0.9814 0.9869
15 0.9080 0.9376 110 0.9830 0.9881
16 0.9121 0.9405 120 0.9841 0.9889
17 0.9160 0.9433 130 0.9854 0.9897
18 0.9196 0.9452 140 0.9865 0.9904
19 0.9230 0.9479 150 0.9871 0.9909
20 0.9256 0.9498 160 0.9879 0.9915
21 0.9285 0.9515 170 0.9887 0.9919
22 0.9308 0.9535 180 0.9891 0.9923
23 0.9334 0.9548 190 0.9897 0.9927
24 0.9356 0.9564 200 0.9903 0.9930
25 0.9370 0.9575 210 0.9907 0.9933
26 0.9393 0.9590 220 0.9910 0.9936
27 0.9413 0.9600 230 0.9914 0.9939
28 0.9428 0.9615 240 0.9917 0.9941
29 0.9441 0.9622 250 0.9921 0.9943
30 0.9462 0.9634 260 0.9924 0.9945
31 0.9476 0.9644 270 0.9926 0.9947
32 0.9490 0.9652 280 0.9929 0.9949
33 0.9505 0.9661 290 0.9931 0.9951
34 0.9521 0.9671 300 0.9933 0.9952
35 0.9530 0.9678 310 0.9936 0.9954
36 0.9540 0.9686 320 0.9937 0.9955
37 0.9551 0.9693 330 0.9939 0.9956
38 0.9555 0.9700 340 0.9941 0.9957
39 0.9568 0.9704 350 0.9942 0.9958


360 0.9944 0.9959


370 0.9945 0.9960
380 0.9947 0.9961
390 0.9948 0.9962
400 0.9949 0.9963
410 0.9950 0.9964
420 0.9951 0.9965
430 0.9953 0.9966
440 0.9954 0.9966
450 0.9954 0.9967
460 0.9955 0.9968
470 0.9956 0.9968
480 0.9957 0.9969
490 0.9958 0.9969
500 0.9959 0.9970
525 0.9961 0.9972
550 0.9963 0.9973
575 0.9964 0.9974
600 0.9965 0.9975
625 0.9967 0.9976
650 0.9968 0.9977
675 0.9969 0.9977
700 0.9970 0.9978
725 0.9971 0.9979
750 0.9972 0.9980
775 0.9973 0.9980
800 0.9974 0.9981
825 0.9975 0.9981
850 0.9975 0.9982
875 0.9976 0.9982
900 0.9977 0.9983
925 0.9977 0.9983
950 0.9978 0.9984
975 0.9978 0.9984
1000 0.9979 0.9984


1.4. EDA Case Studies

Summary: This section presents a series of case studies that demonstrate the
application of EDA methods to specific problems. In some cases, we
have focused on just one EDA technique that uncovers virtually all there
is to know about the data. For other case studies, we need several EDA
techniques, the selection of which is dictated by the outcome of the
previous step in the analysis sequence. Note in these case studies how
the flow of the analysis is motivated by the focus on underlying
assumptions and general EDA principles.

Table of Contents for Section 4:
1. Introduction
2. By Problem Category


1.4.1. Case Studies Introduction

Purpose: The purpose of the first eight case studies is to show how EDA
graphics and quantitative measures and tests are applied to data from
scientific processes and to critique those data with regard to the
following assumptions that typically underlie a measurement process;
namely, that the data behave like:
● random drawings
● from a fixed distribution
● with a fixed location
● with a fixed standard deviation
Case studies 9 and 10 show the use of EDA techniques in
distributional modeling and the analysis of a designed experiment,
respectively.

Yi = C + Ei

If the above assumptions are satisfied, the process is said to be
statistically "in control" with the core characteristic of having
"predictability". That is, probability statements can be made about the
process, not only in the past, but also in the future.

An appropriate model for an "in control" process is
Yi = C + Ei
where C is a constant (the "deterministic" or "structural" component),
and where Ei is the error term (or "random" component).

The constant C is the average value of the process--it is the primary
summary number which shows up on any report. Although C is
(assumed) fixed, it is unknown, and so a primary analysis objective of
the engineer is to arrive at an estimate of C.

This goal partitions into 4 sub-goals:
1. Is the most common estimator of C, the sample mean Ȳ, the best estimator for
C? What does "best" mean?
2. If Ȳ is best, what is the uncertainty for Ȳ? In particular, is
the usual formula for the uncertainty of Ȳ, s/√N,
valid? Here, s is the standard deviation of the data and N is the
sample size.
3. If Ȳ is not the best estimator for C, what is a better estimator
for C (for example, median, midrange, midmean)?
4. If there is a better estimator, what is its uncertainty?

EDA and the routine checking of underlying assumptions provides
insight into all of the above.
1. Location and variation checks provide information as to
whether C is really constant.
2. Distributional checks indicate whether Ȳ is the best estimator.
Techniques for distributional checking include histograms,
normal probability plots, and probability plot correlation
coefficient plots.
3. Randomness checks ascertain whether the usual uncertainty
formula, s/√N, is valid.
4. Distributional tests assist in determining a better estimator, if
needed.
5. Simulator tools (namely bootstrapping) provide values for the
uncertainty of alternative estimators.

Assumptions not satisfied: If one or more of the above assumptions is not satisfied, then we use
EDA techniques, or some mix of EDA and classical techniques, to
find a more appropriate model for the data. That is,
Yi = D + Ei
where D is the deterministic part and E is an error component.

If the data are not random, then we may investigate fitting some
simple time series models to the data. If the constant location and
scale assumptions are violated, we may need to investigate the
measurement process to see if there is an explanation.

The assumptions on the error term are still quite relevant in the sense
that for an appropriate model the error component should follow the
assumptions. The criterion for validating the model, or comparing
competing models, is framed in terms of these assumptions.
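As a concrete illustration of sub-goals 1 and 2 above, the following is a minimal Python/numpy sketch (an illustrative assumption of this note; the handbook's own computations use Dataplot, and the file name is a hypothetical placeholder) computing the usual estimate of C and its uncertainty:

    import numpy as np

    y = np.loadtxt("process_data.txt")   # hypothetical file of process measurements

    ybar = y.mean()                      # usual estimator of the constant C
    s = y.std(ddof=1)                    # sample standard deviation
    uncertainty = s / np.sqrt(len(y))    # usual uncertainty (standard error) of ybar

    print(f"estimate of C: {ybar:.4f} +/- {uncertainty:.4f}")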


Multivariable data: Although the case studies in this chapter utilize univariate data, the
assumptions above are relevant for multivariable data as well.

If the data are not univariate, then we are trying to find a model
Yi = F(X1, ..., Xk) + Ei
where F is some function based on one or more variables. The error
component, which is a univariate data set, of a good model should
satisfy the assumptions given above. The criterion for validating and
comparing models is based on how well the error component follows
these assumptions.

The load cell calibration case study in the process modeling chapter
shows an example of this in the regression context.

First three case studies utilize data with known characteristics: The first three case studies
utilize data that are randomly generated from the following distributions:
● normal distribution with mean 0 and standard deviation 1
● uniform distribution over the interval (0,1), with mean 0.5 and
standard deviation √(1/12)
● random walk
The other univariate case studies utilize data from scientific processes.
The goal is to determine if
Yi = C + Ei
is a reasonable model. This is done by testing the underlying
assumptions. If the assumptions are satisfied, then an estimate of C
and an estimate of the uncertainty of C are computed. If the
assumptions are not satisfied, we attempt to find a model where the
error component does satisfy the underlying assumptions.

Graphical methods that are applied to the data: To test the underlying assumptions, each
data set is analyzed using four graphical methods that are particularly suited for this purpose:
1. run sequence plot which is useful for detecting shifts of location
or scale
2. lag plot which is useful for detecting non-randomness in the
data
3. histogram which is useful for trying to determine the underlying
distribution
4. normal probability plot for deciding whether the data follow the
normal distribution
There are a number of other techniques for addressing the underlying
assumptions. However, the four plots listed above provide an
excellent opportunity for addressing all of the assumptions on a single
page of graphics.

Additional graphical techniques are used in certain case studies to
develop models that do have error components that satisfy the
underlying assumptions.

Quantitative methods that are applied to the data: The normal and uniform random number
data sets are also analyzed with the following quantitative techniques, which are explained
in more detail in an earlier section (a few of these checks are sketched in code at the end of
this section):
1. Summary statistics which include:
❍ mean
❍ standard deviation
❍ autocorrelation coefficient to test for randomness
❍ normal and uniform probability plot correlation
coefficients (ppcc) to test for a normal or uniform
distribution, respectively
❍ Wilk-Shapiro test for a normal distribution
2. Linear fit of the data as a function of time to assess drift (test
for fixed location)
3. Bartlett test for fixed variance
4. Autocorrelation plot and coefficient to test for randomness
5. Runs test to test for lack of randomness
6. Anderson-Darling test for a normal distribution
7. Grubbs test for outliers
8. Summary report

Although the graphical methods applied to the normal and uniform
random numbers are sufficient to assess the validity of the underlying
assumptions, the quantitative techniques are used to show the different
flavor of the graphical and quantitative approaches.

The remaining case studies intermix one or more of these quantitative
techniques into the analysis where appropriate.
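As promised above, here is a minimal Python sketch of two of the listed quantitative checks, the lag-1 autocorrelation coefficient and the Anderson-Darling normality test. scipy and numpy are assumptions of this note (the handbook's own output comes from Dataplot), the file name is a hypothetical placeholder, and the autocorrelation shown is one common estimate of the coefficient.

    import numpy as np
    from scipy import stats

    y = np.loadtxt("process_data.txt")       # hypothetical data file

    # Lag-1 autocorrelation coefficient (a randomness check).
    autocorr = np.corrcoef(y[:-1], y[1:])[0, 1]
    print(f"lag-1 autocorrelation: {autocorr:.3f}")

    # Anderson-Darling test for normality.
    ad = stats.anderson(y, dist="norm")
    print(f"A-D statistic: {ad.statistic:.3f}, "
          f"5% critical value: {ad.critical_values[2]:.3f}")

A lag-1 autocorrelation near zero and an Anderson-Darling statistic below its critical value are consistent with the randomness and normality assumptions, respectively.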


1.4.2. Case Studies

Univariate: Yi = C + Ei
● Normal Random Numbers
● Uniform Random Numbers
● Random Walk
● Josephson Junction Cryothermometry
● Beam Deflections
● Filter Transmittance
● Standard Resistor
● Heat Flow Meter 1

Reliability:
● Airplane Glass Failure Time

Multi-Factor:
● Ceramic Strength


1.4.2.1. Normal Random Numbers

Normal Random Numbers: This example illustrates the univariate analysis of a set of normal
random numbers.
1. Background and Data
2. Graphical Output and Interpretation
3. Quantitative Output and Interpretation
4. Work This Example Yourself

1.4.2.1.1. Background and Data

Generation: The normal random numbers used in this case study are from a Rand
Corporation publication.
The motivation for studying a set of normal random numbers is to
illustrate the ideal case where all four underlying assumptions hold.

Software: Most general purpose statistical software programs, including Dataplot,
can generate normal random numbers.
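For instance, a minimal Python/numpy sketch (an assumption of this note; any of the packages mentioned above would serve, and the seed is arbitrary) that generates a comparable set of standard normal random numbers is:

    import numpy as np

    rng = np.random.default_rng(seed=42)   # seed chosen arbitrarily for reproducibility
    y = rng.standard_normal(500)           # 500 draws from N(0, 1), as in this case study
    print(y[:5])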

Resulting Data: The following is the set of normal random numbers used for this case
study.

-1.2760 -1.2180 -0.4530 -0.3500 0.7230


0.6760 -1.0990 -0.3140 -0.3940 -0.6330
-0.3180 -0.7990 -1.6640 1.3910 0.3820
0.7330 0.6530 0.2190 -0.6810 1.1290
-1.3770 -1.2570 0.4950 -0.1390 -0.8540
0.4280 -1.3220 -0.3150 -0.7320 -1.3480
2.3340 -0.3370 -1.9550 -0.6360 -1.3180
-0.4330 0.5450 0.4280 -0.2970 0.2760
-1.1360 0.6420 3.4360 -1.6670 0.8470
-1.1730 -0.3550 0.0350 0.3590 0.9300
0.4140 -0.0110 0.6660 -1.1320 -0.4100
-1.0770 0.7340 1.4840 -0.3400 0.7890
-0.4940 0.3640 -1.2370 -0.0440 -0.1110
-0.2100 0.9310 0.6160 -0.3770 -0.4330
1.0480 0.0370 0.7590 0.6090 -2.0430
-0.2900 0.4040 -0.5430 0.4860 0.8690
0.3470 2.8160 -0.4640 -0.6320 -1.6140
0.3720 -0.0740 -0.9160 1.3140 -0.0380
0.6370 0.5630 -0.1070 0.1310 -1.8080
-1.1260 0.3790 0.6100 -0.3640 -2.6260


2.1760 0.3930 -0.9240 1.9110 -1.0400 0.2480 -0.0880 -1.3790 0.2950 -0.1150
-1.1680 0.4850 0.0760 -0.7690 1.6070 -0.6210 -0.6180 0.2090 0.9790 0.9060
-1.1850 -0.9440 -1.6040 0.1850 -0.2580 -0.0990 -1.3760 1.0470 -0.8720 -2.2000
-0.3000 -0.5910 -0.5450 0.0180 -0.4850 -1.3840 1.4250 -0.8120 0.7480 -1.0930
0.9720 1.7100 2.6820 2.8130 -1.5310 -0.4630 -1.2810 -2.5140 0.6750 1.1450
-0.4900 2.0710 1.4440 -1.0920 0.4780 1.0830 -0.6670 -0.2230 -1.5920 -1.2780
1.2100 0.2940 -0.2480 0.7190 1.1030 0.5030 1.4340 0.2900 0.3970 -0.8370
1.0900 0.2120 -1.1850 -0.3380 -1.1340 -0.9730 -0.1200 -1.5940 -0.9960 -1.2440
2.6470 0.7770 0.4500 2.2470 1.1510 -0.8570 -0.3710 -0.2160 0.1480 -2.1060
-1.6760 0.3840 1.1330 1.3930 0.8140 -1.4530 0.6860 -0.0750 -0.2430 -0.1700
0.3980 0.3180 -0.9280 2.4160 -0.9360 -0.1220 1.1070 -1.0390 -0.6360 -0.8600
1.0360 0.0240 -0.5600 0.2030 -0.8710 -0.8950 -1.4580 -0.5390 -0.1590 -0.4200
0.8460 -0.6990 -0.3680 0.3440 -0.9260 1.6320 0.5860 -0.4680 -0.3860 -0.3540
-0.7970 -1.4040 -1.4720 -0.1180 1.4560 0.2030 -1.2340 2.3810 -0.3880 -0.0630
0.6540 -0.9550 2.9070 1.6880 0.7520 2.0720 -1.4450 -0.6800 0.2240 -0.1200
-0.4340 0.7460 0.1490 -0.1700 -0.4790 1.7530 -0.5710 1.2230 -0.1260 0.0340
0.5220 0.2310 -0.6190 -0.2650 0.4190 -0.4350 -0.3750 -0.9850 -0.5850 -0.2030
0.5580 -0.5490 0.1920 -0.3340 1.3730 -0.5560 0.0240 0.1260 1.2500 -0.6150
-1.2880 -0.5390 -0.8240 0.2440 -1.0700 0.8760 -1.2270 -2.6470 -0.7450 1.7970
0.0100 0.4820 -0.4690 -0.0900 1.1710 -1.2310 0.5470 -0.6340 -0.8360 -0.7190
1.3720 1.7690 -1.0570 1.6460 0.4810 0.8330 1.2890 -0.0220 -0.4310 0.5820
-0.6000 -0.5920 0.6100 -0.0960 -1.3750 0.7660 -0.5740 -1.1530 0.5200 -1.0180
0.8540 -0.5350 1.6070 0.4280 -0.6150 -0.8910 0.3320 -0.4530 -1.1270 2.0850
0.3310 -0.3360 -1.1520 0.5330 -0.8330 -0.7220 -1.5080 0.4890 -0.4960 -0.0250
-0.1480 -1.1440 0.9130 0.6840 1.0430 0.6440 -0.2330 -0.1530 1.0980 0.7570
0.5540 -0.0510 -0.9440 -0.4400 -0.2120 -0.0390 -0.4600 0.3930 2.0120 1.3560
-1.1480 -1.0560 0.6350 -0.3280 -1.2210 0.1050 -0.1710 -0.1100 -1.1450 0.8780
0.1180 -2.0450 -1.9770 -1.1330 0.3380 -0.9090 -0.3280 1.0210 -1.6130 1.5600
0.3480 0.9700 -0.0170 1.2170 -0.9740 -1.1920 1.7700 -0.0030 0.3690 0.0520
-1.2910 -0.3990 -1.2090 -0.2480 0.4800 0.6470 1.0290 1.5260 0.2370 -1.3280
0.2840 0.4580 1.3070 -1.6250 -0.6290 -0.0420 0.5530 0.7700 0.3240 -0.4890
-0.5040 -0.0560 -0.1310 0.0480 1.8790 -0.3670 0.3780 0.6010 -1.9960 -0.7380
-1.0160 0.3600 -0.1190 2.3310 1.6720 0.4980 1.0720 1.5670 0.3020 1.1570
-1.0530 0.8400 -0.2460 0.2370 -1.3120 -0.7200 1.4030 0.6980 -0.3700 -0.5510
1.6030 -0.9520 -0.5660 1.6000 0.4650
1.9510 0.1100 0.2510 0.1160 -0.9570
-0.1900 1.4790 -0.9860 1.2490 1.9340
0.0700 -1.3580 -1.2460 -0.9590 -1.2970
-0.7220 0.9250 0.7830 -0.4020 0.6190
1.8260 1.2720 -0.9450 0.4940 0.0500
-1.6960 1.8790 0.0630 0.1320 0.6820
0.5440 -0.4170 -0.6660 -0.1040 -0.2530
-2.5430 -1.3330 1.9870 0.6680 0.3600
1.9270 1.1830 1.2110 1.7650 0.3500
-0.3590 0.1930 -1.0230 -0.2220 -0.6160
-0.0600 -1.3190 0.7850 -0.4300 -0.2980


1.4.2.1.2. Graphical Output and Interpretation

Goal: The goal of this analysis is threefold:
1. Determine if the univariate model:
Yi = C + Ei
is appropriate and valid.
2. Determine if the typical underlying assumptions for an "in
control" measurement process are valid. These assumptions are:
1. random drawings;
2. from a fixed distribution;
3. with the distribution having a fixed location; and
4. the distribution having a fixed scale.
3. Determine if the confidence interval
Ȳ ± 2s/√N
is appropriate and valid where s is the standard deviation of the
original data.

4-Plot of Data: (The 4-plot of the data, showing the run sequence plot, lag plot,
histogram, and normal probability plot on a single page, is not reproduced in this
text version.)

Interpretation: The assumptions are addressed by the graphics shown above:
1. The run sequence plot (upper left) indicates that the data do not
have any significant shifts in location or scale over time. The run
sequence plot does not show any obvious outliers.
2. The lag plot (upper right) does not indicate any non-random
pattern in the data.
3. The histogram (lower left) shows that the data are reasonably
symmetric, there do not appear to be significant outliers in the
tails, and that it is reasonable to assume that the data are from
approximately a normal distribution.
4. The normal probability plot (lower right) verifies that an
assumption of normality is in fact reasonable.
From the above plots, we conclude that the underlying assumptions are
valid and the data follow approximately a normal distribution.
Therefore, the confidence interval form given previously is appropriate
for quantifying the uncertainty of the population mean. The numerical
values for this model are given in the Quantitative Output and
Interpretation section.

Individual Plots: Although it is usually not necessary, the plots can be generated
individually to give more detail.
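A minimal matplotlib/scipy sketch (an illustrative assumption of this note; the handbook generates these plots with Dataplot, and the file name is a hypothetical placeholder) that draws the same four panels for a data vector y:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    y = np.loadtxt("normal_random_numbers.txt")    # hypothetical file holding the data above

    fig, ax = plt.subplots(2, 2, figsize=(8, 8))

    ax[0, 0].plot(y)                               # run sequence plot (upper left)
    ax[0, 0].set_title("Run Sequence Plot")

    ax[0, 1].scatter(y[:-1], y[1:], s=5)           # lag plot (upper right)
    ax[0, 1].set_title("Lag Plot")

    ax[1, 0].hist(y, bins=25)                      # histogram (lower left)
    ax[1, 0].set_title("Histogram")

    stats.probplot(y, dist="norm", plot=ax[1, 1])  # normal probability plot (lower right)
    ax[1, 1].set_title("Normal Probability Plot")

    plt.tight_layout()
    plt.show()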


Run Sequence Plot; Histogram (with overlaid Normal PDF); Lag Plot; Normal Probability Plot
(The four individual plots are not reproduced in this text version.)

http://www.itl.nist.gov/div898/handbook/eda/section4/eda4212.htm (3 of 4) [11/13/2003 5:33:12 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4212.htm (4 of 4) [11/13/2003 5:33:12 PM]


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.1. Normal Random Numbers

1.4.2.1.3. Quantitative Output and Interpretation

Summary         As a first step in the analysis, a table of summary statistics is
Statistics      computed from the data. The following table, generated by Dataplot,
                shows a typical set of statistics.

        SUMMARY

        NUMBER OF OBSERVATIONS = 500

        ***********************************************************************
        * LOCATION MEASURES               * DISPERSION MEASURES               *
        ***********************************************************************
        * MIDRANGE    =   0.3945000E+00   * RANGE        =   0.6083000E+01    *
        * MEAN        =  -0.2935997E-02   * STAND. DEV.  =   0.1021041E+01    *
        * MIDMEAN     =   0.1623600E-01   * AV. AB. DEV. =   0.8174360E+00    *
        * MEDIAN      =  -0.9300000E-01   * MINIMUM      =  -0.2647000E+01    *
        *             =                   * LOWER QUART. =  -0.7204999E+00    *
        *             =                   * LOWER HINGE  =  -0.7210000E+00    *
        *             =                   * UPPER HINGE  =   0.6455001E+00    *
        *             =                   * UPPER QUART. =   0.6447501E+00    *
        *             =                   * MAXIMUM      =   0.3436000E+01    *
        ***********************************************************************
        * RANDOMNESS MEASURES             * DISTRIBUTIONAL MEASURES           *
        ***********************************************************************
        * AUTOCO COEF =   0.4505888E-01   * ST. 3RD MOM. =   0.3072273E+00    *
        *             =   0.0000000E+00   * ST. 4TH MOM. =   0.2990314E+01    *
        *             =   0.0000000E+00   * ST. WILK-SHA =   0.7515639E+01    *
        *             =                   * UNIFORM PPCC =   0.9756625E+00    *
        *             =                   * NORMAL  PPCC =   0.9961721E+00    *
        *             =                   * TUK -.5 PPCC =   0.8366451E+00    *
        *             =                   * CAUCHY  PPCC =   0.4922674E+00    *
        ***********************************************************************

Location        One way to quantify a change in location over time is to fit a
                straight line to the data set, using the index variable X = 1, 2,
                ..., N, with N denoting the number of observations. If there is no
                significant drift in the location, the slope parameter should be
                zero. For this data set, Dataplot generated the following output:

        LEAST SQUARES MULTILINEAR FIT
        SAMPLE SIZE N       =   500
        NUMBER OF VARIABLES =     1
        NO REPLICATION CASE

               PARAMETER ESTIMATES     (APPROX. ST. DEV.)    T VALUE
        1  A0          0.699127E-02       (0.9155E-01)       0.7636E-01
        2  A1  X      -0.396298E-04       (0.3167E-03)      -0.1251

        RESIDUAL STANDARD DEVIATION = 1.02205
        RESIDUAL DEGREES OF FREEDOM = 498

                The slope parameter, A1, has a t value of -0.13 which is
                statistically not significant. This indicates that the slope can in
                fact be considered zero.

Variation       One simple way to detect a change in variation is with a Bartlett
                test, after dividing the data set into several equal-sized
                intervals. The choice of the number of intervals is somewhat
                arbitrary, although values of 4 or 8 are reasonable. Dataplot
                generated the following output for the Bartlett test.

        BARTLETT TEST
        (STANDARD DEFINITION)
        NULL HYPOTHESIS UNDER TEST--ALL SIGMA(I) ARE EQUAL

        TEST:
           DEGREES OF FREEDOM        =    3.000000
           TEST STATISTIC VALUE      =    2.373660
           CUTOFF: 95% PERCENT POINT =    7.814727
           CUTOFF: 99% PERCENT POINT =    11.34487
           CHI-SQUARE CDF VALUE      =    0.501443

        NULL             NULL HYPOTHESIS       NULL HYPOTHESIS
        HYPOTHESIS       ACCEPTANCE INTERVAL   CONCLUSION
        ALL SIGMA EQUAL     (0.000,0.950)      ACCEPT

                In this case, the Bartlett test indicates that the standard
                deviations are not significantly different in the 4 intervals.
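Python Sketch   As a rough, non-authoritative illustration of the two checks above
                (a straight-line fit against the index for drift in location, and a
                Bartlett test on four quarters of the data for drift in variation),
                the following Python sketch uses numpy and scipy; the file name is a
                hypothetical placeholder and the numerical results may differ from
                the Dataplot output in the last digits.

    import numpy as np
    from scipy import stats

    y = np.loadtxt("normal_random.dat")      # hypothetical data file
    x = np.arange(1, len(y) + 1)             # index variable X = 1, 2, ..., N

    fit = stats.linregress(x, y)             # slope should be near zero if no drift
    print(f"slope = {fit.slope:.6g}, t value = {fit.slope / fit.stderr:.2f}")

    quarters = np.array_split(y, 4)          # four equal-sized intervals
    stat, p = stats.bartlett(*quarters)
    print(f"Bartlett statistic = {stat:.3f}, p-value = {p:.3f}")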


1 161.0 166.5000 6.6546 -0.83
2 63.0 62.2917 4.4454 0.16
Randomness 3 20.0 16.5750 3.4338 1.00
There are many ways in which data can be non-random. However, most common forms of
4 7.0 3.4458 1.7786 2.00
non-randomness can be detected with a few simple tests. The lag plot in the 4-plot above is 5 1.0 0.5895 0.7609 0.54
a simple graphical technique. 6 0.0 0.0858 0.2924 -0.29
7 0.0 0.0109 0.1042 -0.10
Another check is an autocorrelation plot that shows the autocorrelations for various lags. 8 0.0 0.0012 0.0349 -0.03
Confidence bands can be plotted at the 95% and 99% confidence levels. Points outside this 9 0.0 0.0001 0.0111 -0.01
10 0.0 0.0000 0.0034 0.00
band indicate statistically significant values (lag 0 is always 1). Dataplot generated the RUNS DOWN
following autocorrelation plot. STATISTIC = NUMBER OF RUNS DOWN
OF LENGTH EXACTLY I
I STAT EXP(STAT) SD(STAT) Z

1 91.0 104.2083 10.2792 -1.28


2 55.0 45.7167 5.2996 1.75
3 14.0 13.1292 3.2297 0.27
4 1.0 2.8563 1.6351 -1.14
5 0.0 0.5037 0.7045 -0.71
6 0.0 0.0749 0.2733 -0.27
7 0.0 0.0097 0.0982 -0.10
8 0.0 0.0011 0.0331 -0.03
9 0.0 0.0001 0.0106 -0.01
10 0.0 0.0000 0.0032 0.00
STATISTIC = NUMBER OF RUNS DOWN
OF LENGTH I OR MORE
I STAT EXP(STAT) SD(STAT) Z

1 161.0 166.5000 6.6546 -0.83


2 70.0 62.2917 4.4454 1.73
3 15.0 16.5750 3.4338 -0.46
4 1.0 3.4458 1.7786 -1.38
5 0.0 0.5895 0.7609 -0.77
6 0.0 0.0858 0.2924 -0.29
The lag 1 autocorrelation, which is generally the one of most interest, is 0.045. The critical 7 0.0 0.0109 0.1042 -0.10
8 0.0 0.0012 0.0349 -0.03
values at the 5% significance level are -0.087 and 0.087. Thus, since 0.045 is in the interval, 9 0.0 0.0001 0.0111 -0.01
the lag 1 autocorrelation is not statistically significant, so there is no evidence of 10 0.0 0.0000 0.0034 0.00
non-randomness. RUNS TOTAL = RUNS UP + RUNS DOWN
STATISTIC = NUMBER OF RUNS TOTAL
A common test for randomness is the runs test. OF LENGTH EXACTLY I
I STAT EXP(STAT) SD(STAT) Z
RUNS UP 1 189.0 208.4167 14.5370 -1.34
STATISTIC = NUMBER OF RUNS UP 2 98.0 91.4333 7.4947 0.88
OF LENGTH EXACTLY I 3 27.0 26.2583 4.5674 0.16
I STAT EXP(STAT) SD(STAT) Z 4 7.0 5.7127 2.3123 0.56
5 1.0 1.0074 0.9963 -0.01
1 98.0 104.2083 10.2792 -0.60 6 0.0 0.1498 0.3866 -0.39
2 43.0 45.7167 5.2996 -0.51 7 0.0 0.0193 0.1389 -0.14
3 13.0 13.1292 3.2297 -0.04 8 0.0 0.0022 0.0468 -0.05
4 6.0 2.8563 1.6351 1.92 9 0.0 0.0002 0.0150 -0.01
5 1.0 0.5037 0.7045 0.70 10 0.0 0.0000 0.0045 0.00
6 0.0 0.0749 0.2733 -0.27 STATISTIC = NUMBER OF RUNS TOTAL
7 0.0 0.0097 0.0982 -0.10 OF LENGTH I OR MORE
8 0.0 0.0011 0.0331 -0.03 I STAT EXP(STAT) SD(STAT) Z
9 0.0 0.0001 0.0106 -0.01
10 0.0 0.0000 0.0032 0.00 1 322.0 333.0000 9.4110 -1.17
STATISTIC = NUMBER OF RUNS UP 2 133.0 124.5833 6.2868 1.34
OF LENGTH I OR MORE 3 35.0 33.1500 4.8561 0.38
I STAT EXP(STAT) SD(STAT) Z 4 8.0 6.8917 2.5154 0.44

5 1.0 1.1790 1.0761 -0.17
6 0.0 0.1716 0.4136 -0.41
7 0.0 0.0217 0.1474 -0.15 Outlier A test for outliers is the Grubbs test. Dataplot generated the following output for Grubbs'
8 0.0 0.0024 0.0494 -0.05 Analysis test.
9 0.0 0.0002 0.0157 -0.02
10 0.0 0.0000 0.0047 0.00
GRUBBS TEST FOR OUTLIERS
LENGTH OF THE LONGEST RUN UP = 5
(ASSUMPTION: NORMALITY)
LENGTH OF THE LONGEST RUN DOWN = 4
LENGTH OF THE LONGEST RUN UP OR DOWN = 5
1. STATISTICS:
NUMBER OF OBSERVATIONS = 500
NUMBER OF POSITIVE DIFFERENCES = 252
MINIMUM = -2.647000
NUMBER OF NEGATIVE DIFFERENCES = 247
MEAN = -0.2935997E-02
NUMBER OF ZERO DIFFERENCES = 0
MAXIMUM = 3.436000
STANDARD DEVIATION = 1.021041
Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically
significant at the 5% level. The runs test does not indicate any significant non-randomness. GRUBBS TEST STATISTIC = 3.368068
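Python Sketch   As a simplified, non-authoritative counterpart to the randomness
                checks above, the following sketch computes the lag 1
                autocorrelation with its approximate 95% band and a bare count of
                runs up and down (the full Dataplot runs table is not reproduced);
                the file name is a hypothetical placeholder.

    import numpy as np

    y = np.loadtxt("normal_random.dat")      # hypothetical data file
    n = len(y)

    yc = y - y.mean()
    r1 = np.sum(yc[:-1] * yc[1:]) / np.sum(yc ** 2)   # lag 1 autocorrelation
    band = 1.96 / np.sqrt(n)                          # approximate 95% limits
    print(f"lag 1 autocorrelation = {r1:.3f}, 95% band = +/- {band:.3f}")

    signs = np.sign(np.diff(y))                       # +1 upward step, -1 downward
    runs = 1 + np.count_nonzero(signs[1:] != signs[:-1])
    print(f"total number of runs up and down = {runs}")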

2. PERCENT POINTS OF THE REFERENCE DISTRIBUTION


Distributional Probability plots are a graphical test for assessing if a particular distribution provides an FOR GRUBBS TEST STATISTIC
Analysis adequate fit to a data set. 0 % POINT = 0.0000000E+00
50 % POINT = 3.274338
A quantitative enhancement to the probability plot is the correlation coefficient of the points 75 % POINT = 3.461431
90 % POINT = 3.695134
on the probability plot. For this data set the correlation coefficient is 0.996. Since this is 95 % POINT = 3.863087
greater than the critical value of 0.987 (this is a tabulated value), the normality assumption 99 % POINT = 4.228033
is not rejected.
3. CONCLUSION (AT THE 5% LEVEL):
Chi-square and Kolmogorov-Smirnov goodness-of-fit tests are alternative methods for THERE ARE NO OUTLIERS.
assessing distributional adequacy. The Wilk-Shapiro and Anderson-Darling tests can be For this data set, Grubbs' test does not detect any outliers at the 25%, 10%, 5%, and 1%
used to test for normality. Dataplot generates the following output for the Anderson-Darling significance levels.
normality test.
Model Since the underlying assumptions were validated both graphically and analytically, we
ANDERSON-DARLING 1-SAMPLE TEST conclude that a reasonable model for the data is:
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION
Yi = -0.00294 + Ei
1. STATISTICS: We can express the uncertainty for C as the 95% confidence interval (-0.09266,0.086779).
NUMBER OF OBSERVATIONS = 500
MEAN = -0.2935997E-02
STANDARD DEVIATION = 1.021041 Univariate It is sometimes useful and convenient to summarize the above results in a report. The report
Report for the 500 normal random numbers follows.
ANDERSON-DARLING TEST STATISTIC VALUE = 1.061249
ADJUSTED TEST STATISTIC VALUE = 1.069633
Analysis for 500 normal random numbers
2. CRITICAL VALUES:
90 % POINT = 0.6560000 1: Sample Size = 500
95 % POINT = 0.7870000
97.5 % POINT = 0.9180000 2: Location
99 % POINT = 1.092000 Mean = -0.00294
Standard Deviation of Mean = 0.045663
3. CONCLUSION (AT THE 5% LEVEL): 95% Confidence Interval for Mean = (-0.09266,0.086779)
THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION. Drift with respect to location? = NO
The Anderson-Darling test rejects the normality assumption at the 5% level but accepts it at 3: Variation
the 1% level. Standard Deviation = 1.021042
95% Confidence Interval for SD = (0.961437,1.088585)
Drift with respect to variation?
(based on Bartlett's test on quarters
of the data) = NO

4: Distribution
Normal PPCC = 0.996173


Data are Normal?


(as measured by Normal PPCC) = YES

5: Randomness
Autocorrelation = 0.045059 1. Exploratory Data Analysis
Data are Random? 1.4. EDA Case Studies
(as measured by autocorrelation) = YES 1.4.2. Case Studies
1.4.2.1. Normal Random Numbers
6: Statistical Control
(i.e., no drift in location or scale,
data are random, distribution is 1.4.2.1.4. Work This Example Yourself
fixed, here we are testing only for
fixed normal) View This page allows you to repeat the analysis outlined in the case study
Data Set is in Statistical Control? = YES Dataplot description on the previous page using Dataplot . It is required that you
Macro for have already downloaded and installed Dataplot and configured your
7: Outliers?
this Case browser. to run Dataplot. Output from each analysis step below will be
(as determined by Grubbs' test) = NO
Study displayed in one or more of the Dataplot windows. The four main
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps Results and Conclusions

Click on the links below to start Dataplot and run this case study
The links in this column will connect you with more detailed
yourself. Each step may use results from previous steps, so please be
information about each analysis step from the case study
patient. Wait until the software verifies that the current step is
description.
complete before clicking on the next step.

1. Invoke Dataplot and read data.

1. Read in the data. 1. You have read 1 column of numbers


into Dataplot, variable Y.

2. 4-plot of the data.

1. 4-plot of Y. 1. Based on the 4-plot, there are no shifts


in location or scale, and the data seem to
follow a normal distribution.

3. Generate the individual plots.

1. Generate a run sequence plot. 1. The run sequence plot indicates that
there are no shifts of location or
scale.

2. Generate a lag plot. 2. The lag plot does not indicate any
significant patterns (which would
show the data were not random).

3. Generate a histogram with an


3. The histogram indicates that a
overlaid normal pdf.

normal distribution is a good
distribution for these data.
4. Generate a normal probability
plot. 4. The normal probability plot verifies
that the normal distribution is a
reasonable distribution for these data.
1. Exploratory Data Analysis
1.4. EDA Case Studies
4. Generate summary statistics, quantitative 1.4.2. Case Studies
analysis, and print a univariate report.

1. Generate a table of summary


statistics.
1. The summary statistics table displays
25+ statistics. 1.4.2.2. Uniform Random Numbers
2. Generate the mean, a confidence 2. The mean is -0.00294 and a 95%
Uniform This example illustrates the univariate analysis of a set of uniform
interval for the mean, and compute confidence interval is (-0.093,0.087). Random random numbers.
a linear fit to detect drift in The linear fit indicates no drift in Numbers
location. location since the slope parameter is
statistically not significant.
1. Background and Data
2. Graphical Output and Interpretation
3. Generate the standard deviation, a 3. The standard deviation is 1.02 with
confidence interval for the standard a 95% confidence interval of (0.96,1.09). 3. Quantitative Output and Interpretation
deviation, and detect drift in variation Bartlett's test indicates no significant
by dividing the data into quarters and change in variation. 4. Work This Example Yourself
computing Bartlett's test for equal
standard deviations.

4. Check for randomness by generating an 4. The lag 1 autocorrelation is 0.04.


autocorrelation plot and a runs test. From the autocorrelation plot, this is
within the 95% confidence interval
bands.

5. Check for normality by computing the 5. The normal probability plot correlation
normal probability plot correlation coefficient is 0.996. At the 5% level,
coefficient. we cannot reject the normality assumption.

6. Check for outliers using Grubbs' test. 6. Grubbs' test detects no outliers at the
5% level.

7. Print a univariate report (this assumes 7. The results are summarized in a


steps 2 thru 6 have already been run). convenient report.


.670078 .184754 .061068 .711778 .886854


.020086 .507584 .013676 .667951 .903647
.649329 .609110 .995946 .734887 .517649
.699182 .608928 .937856 .136823 .478341
.654811 .767417 .468509 .505804 .776974
1. Exploratory Data Analysis
.730395 .718640 .218165 .801243 .563517
1.4. EDA Case Studies
1.4.2. Case Studies
.727080 .154531 .822374 .211157 .825314
1.4.2.2. Uniform Random Numbers .385537 .743509 .981777 .402772 .144323
.600210 .455216 .423796 .286026 .699162
.680366 .252291 .483693 .687203 .766211
1.4.2.2.1. Background and Data .399094 .400564 .098932 .050514 .225685
.144642 .756788 .962977 .882254 .382145
.914991 .452368 .479276 .864616 .283554
Generation The uniform random numbers used in this case study are from a Rand .947508 .992337 .089200 .803369 .459826
Corporation publication. .940368 .587029 .734135 .531403 .334042
.050823 .441048 .194985 .157479 .543297
The motivation for studying a set of uniform random numbers is to
.926575 .576004 .088122 .222064 .125507
illustrate the effects of a known underlying non-normal distribution.
.374211 .100020 .401286 .074697 .966448
.943928 .707258 .636064 .932916 .505344
Software Most general purpose statistical software programs, including Dataplot, .844021 .952563 .436517 .708207 .207317
can generate uniform random numbers. .611969 .044626 .457477 .745192 .433729
.653945 .959342 .582605 .154744 .526695
Resulting The following is the set of uniform random numbers used for this case .270799 .535936 .783848 .823961 .011833
Data study. .211594 .945572 .857367 .897543 .875462
.244431 .911904 .259292 .927459 .424811
.100973 .253376 .520135 .863467 .354876 .621397 .344087 .211686 .848767 .030711
.809590 .911739 .292749 .375420 .480564 .205925 .701466 .235237 .831773 .208898
.894742 .962480 .524037 .206361 .040200 .376893 .591416 .262522 .966305 .522825
.822916 .084226 .895319 .645093 .032320 .044935 .249475 .246338 .244586 .251025
.902560 .159533 .476435 .080336 .990190 .619627 .933565 .337124 .005499 .765464
.252909 .376707 .153831 .131165 .886767 .051881 .599611 .963896 .546928 .239123
.439704 .436276 .128079 .997080 .157361 .287295 .359631 .530726 .898093 .543335
.476403 .236653 .989511 .687712 .171768 .135462 .779745 .002490 .103393 .598080
.660657 .471734 .072768 .503669 .736170 .839145 .427268 .428360 .949700 .130212
.658133 .988511 .199291 .310601 .080545 .489278 .565201 .460588 .523601 .390922
.571824 .063530 .342614 .867990 .743923 .867728 .144077 .939108 .364770 .617429
.403097 .852697 .760202 .051656 .926866 .321790 .059787 .379252 .410556 .707007
.574818 .730538 .524718 .623885 .635733 .867431 .715785 .394118 .692346 .140620
.213505 .325470 .489055 .357548 .284682 .117452 .041595 .660000 .187439 .242397
.870983 .491256 .737964 .575303 .529647 .118963 .195654 .143001 .758753 .794041
.783580 .834282 .609352 .034435 .273884 .921585 .666743 .680684 .962852 .451551
.985201 .776714 .905686 .072210 .940558 .493819 .476072 .464366 .794543 .590479
.609709 .343350 .500739 .118050 .543139 .003320 .826695 .948643 .199436 .168108
.808277 .325072 .568248 .294052 .420152 .513488 .881553 .015403 .545605 .014511
.775678 .834529 .963406 .288980 .831374 .980862 .482645 .240284 .044499 .908896
.390947 .340735 .441318 .331851 .623241


.941509 .498943 .548581 .886954 .199437


.548730 .809510 .040696 .382707 .742015
.123387 .250162 .529894 .624611 .797524
.914071 .961282 .966986 .102591 .748522
.053900 .387595 .186333 .253798 .145065
1. Exploratory Data Analysis
.713101 .024674 .054556 .142777 .938919
1.4. EDA Case Studies
.740294 .390277 .557322 .709779 .017119 1.4.2. Case Studies
.525275 .802180 .814517 .541784 .561180 1.4.2.2. Uniform Random Numbers
.993371 .430533 .512969 .561271 .925536
.040903 .116644 .988352 .079848 .275938
.171539 .099733 .344088 .461233 .483247 1.4.2.2.2. Graphical Output and
.792831 .249647 .100229 .536870 .323075
.754615 .020099 .690749 .413887 .637919 Interpretation
.763558 .404401 .105182 .161501 .848769
.091882 .009732 .825395 .270422 .086304 Goal The goal of this analysis is threefold:
.833898 .737464 .278580 .900458 .549751
1. Determine if the univariate model:
.981506 .549493 .881997 .918707 .615068
.476646 .731895 .020747 .677262 .696229
.064464 .271246 .701841 .361827 .757687
is appropriate and valid.
.649020 .971877 .499042 .912272 .953750
.587193 .823431 .540164 .405666 .281310 2. Determine if the typical underlying assumptions for an "in
.030068 .227398 .207145 .329507 .706178 control" measurement process are valid. These assumptions are:
.083586 .991078 .542427 .851366 .158873 1. random drawings;
.046189 .755331 .223084 .283060 .326481 2. from a fixed distribution;
.333105 .914051 .007893 .326046 .047594
3. with the distribution having a fixed location; and
.119018 .538408 .623381 .594136 .285121
.590290 .284666 .879577 .762207 .917575 4. the distribution having a fixed scale.
.374161 .613622 .695026 .390212 .557817 3. Determine if the confidence interval
.651483 .483470 .894159 .269400 .397583
.911260 .717646 .489497 .230694 .541374
.775130 .382086 .864299 .016841 .482774 is appropriate and valid where s is the standard deviation of the
.519081 .398072 .893555 .195023 .717469 original data.
.979202 .885521 .029773 .742877 .525165
.344674 .218185 .931393 .278817 .570568



4-Plot of       [4-plot of the 500 uniform random numbers: run sequence plot, lag
Data            plot, histogram, and normal probability plot]

Interpretation  The assumptions are addressed by the graphics shown above:
                1. The run sequence plot (upper left) indicates that the data do not
                   have any significant shifts in location or scale over time.
                2. The lag plot (upper right) does not indicate any non-random
                   pattern in the data.
                3. The histogram shows that the frequencies are relatively flat
                   across the range of the data. This suggests that the uniform
                   distribution might provide a better distributional fit than the
                   normal distribution.
                4. The normal probability plot verifies that an assumption of
                   normality is not reasonable. In this case, the 4-plot should be
                   followed up by a uniform probability plot to determine if it
                   provides a better fit to the data. This is shown below.
                From the above plots, we conclude that the underlying assumptions
                are valid. Therefore, the model Yi = C + Ei is valid. However, since
                the data are not normally distributed, using the mean as an estimate
                of C and the confidence interval cited above for quantifying its
                uncertainty are not valid or appropriate.

Individual      Although it is usually not necessary, the plots can be generated
Plots           individually to give more detail.

Run Sequence    [run sequence plot of the data]
Plot

Lag Plot        [lag plot of the data]


Histogram       [histogram of the data with an overlaid normal PDF]
(with overlaid
Normal PDF)     This plot shows that a normal distribution is a poor fit. The
                flatness of the histogram suggests that a uniform distribution might
                be a better fit.

Histogram       [histogram of the data with an overlaid uniform PDF]
(with overlaid
Uniform PDF)    Since the histogram from the 4-plot suggested that the uniform
                distribution might be a good fit, we overlay a uniform distribution
                on top of the histogram. This indicates a much better fit than a
                normal distribution.

Normal          [normal probability plot of the data]
Probability
Plot            As with the histogram, the normal probability plot shows that the
                normal distribution does not fit these data well.

Uniform         [uniform probability plot of the data]
Probability
Plot            Since the above plots suggested that a uniform distribution might be
                appropriate, we generate a uniform probability plot. This plot shows
                that the uniform distribution provides an excellent fit to the data.


Better Model    Since the data follow the underlying assumptions, but with a uniform
                distribution rather than a normal distribution, we would still like
                to characterize C by a typical value plus or minus a confidence
                interval. In this case, we would like to find a location estimator
                with the smallest variability.
                The bootstrap plot is an ideal tool for this purpose. The following
                plots show the bootstrap plot, with the corresponding histogram, for
                the mean, median, mid-range, and median absolute deviation.

Bootstrap       [bootstrap plots, with corresponding histograms, for the mean,
Plots           median, mid-range, and median absolute deviation]

Mid-Range is    From the above histograms, it is obvious that for these data, the
Best            mid-range is far superior to the mean or median as an estimate for
                location.
                Using the mean, the location estimate is 0.507 and a 95% confidence
                interval for the mean is (0.482,0.534). Using the mid-range, the
                location estimate is 0.499 and the 95% confidence interval for the
                mid-range is (0.497,0.503).
                Although the values for the location are similar, the difference in
                the uncertainty intervals is quite large.
                Note that in the case of a uniform distribution it is known
                theoretically that the mid-range is the best linear unbiased
                estimator for location. However, in many applications, the most
                appropriate estimator will not be known or it will be mathematically
                intractable to determine a valid confidence interval. The bootstrap
                provides a method for determining (and comparing) confidence
                intervals in these cases.
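Python Sketch   As an illustrative, non-authoritative counterpart to the bootstrap
                plots above, the following sketch computes percentile-bootstrap 95%
                intervals for the mean, median, and mid-range; the file name and the
                seed are hypothetical placeholders, so the intervals will not match
                the handbook's values exactly.

    import numpy as np

    y = np.loadtxt("uniform_random.dat")     # hypothetical data file
    rng = np.random.default_rng(12345)       # arbitrary seed for reproducibility

    def midrange(a):
        return 0.5 * (a.min() + a.max())

    estimators = {"mean": np.mean, "median": np.median, "mid-range": midrange}
    n_boot = 2000                            # number of bootstrap resamples

    for name, est in estimators.items():
        boot = np.array([est(rng.choice(y, size=len(y), replace=True))
                         for _ in range(n_boot)])
        lo, hi = np.percentile(boot, [2.5, 97.5])
        print(f"{name:10s} estimate = {est(y):.3f}  95% interval = ({lo:.3f}, {hi:.3f})")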


1.4.2.2.3. Quantitative Output and Interpretation 1.4.2.2.3. Quantitative Output and Interpretation

Location One way to quantify a change in location over time is to fit a straight line to the data set
using the index variable X = 1, 2, ..., N, with N denoting the number of observations. If
there is no significant drift in the location, the slope parameter should be zero. For this data
1. Exploratory Data Analysis set, Dataplot generated the following output:
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.2. Uniform Random Numbers LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N = 500
NUMBER OF VARIABLES = 1
NO REPLICATION CASE
1.4.2.2.3. Quantitative Output and Interpretation
PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
Summary As a first step in the analysis, a table of summary statistics is computed from the data. The 1 A0 0.522923 (0.2638E-01) 19.82
Statistics following table, generated by Dataplot, shows a typical set of statistics. 2 A1 X -0.602478E-04 (0.9125E-04) -0.6603

RESIDUAL STANDARD DEVIATION = 0.2944917


SUMMARY RESIDUAL DEGREES OF FREEDOM = 498
NUMBER OF OBSERVATIONS = 500
The slope parameter, A1, has a t value of -0.66 which is statistically not significant. This
indicates that the slope can in fact be considered zero.
***********************************************************************
* LOCATION MEASURES * DISPERSION MEASURES *
*********************************************************************** Variation One simple way to detect a change in variation is with a Bartlett test after dividing the data
* MIDRANGE = 0.4997850E+00 * RANGE = 0.9945900E+00 *
* MEAN = 0.5078304E+00 * STAND. DEV. = 0.2943252E+00 * set into several equal-sized intervals. However, the Bartlett test is not robust for
* MIDMEAN = 0.5045621E+00 * AV. AB. DEV. = 0.2526468E+00 * non-normality. Since we know this data set is not approximated well by the normal
* MEDIAN = 0.5183650E+00 * MINIMUM = 0.2490000E-02 * distribution, we use the alternative Levene test. In particular, we use the Levene test based
* = * LOWER QUART. = 0.2508093E+00 *
* = * LOWER HINGE = 0.2505935E+00 * on the median rather than the mean. The choice of the number of intervals is somewhat arbitrary,
* = * UPPER HINGE = 0.7594775E+00 * although values of 4 or 8 are reasonable. Dataplot generated the following output for the
* = * UPPER QUART. = 0.7591152E+00 * Levene test.
* = * MAXIMUM = 0.9970800E+00 *
***********************************************************************
LEVENE F-TEST FOR SHIFT IN VARIATION
* RANDOMNESS MEASURES * DISTRIBUTIONAL MEASURES *
(ASSUMPTION: NORMALITY)
***********************************************************************
* AUTOCO COEF = -0.3098569E-01 * ST. 3RD MOM. = -0.3443941E-01 *
1. STATISTICS
* = 0.0000000E+00 * ST. 4TH MOM. = 0.1796969E+01 *
NUMBER OF OBSERVATIONS = 500
* = 0.0000000E+00 * ST. WILK-SHA = -0.2004886E+02 *
NUMBER OF GROUPS = 4
* = * UNIFORM PPCC = 0.9995682E+00 *
LEVENE F TEST STATISTIC = 0.7983007E-01
* = * NORMAL PPCC = 0.9771602E+00 *
* = * TUK -.5 PPCC = 0.7229201E+00 *
* = * CAUCHY PPCC = 0.3591767E+00 *
FOR LEVENE TEST STATISTIC
***********************************************************************
0 % POINT = 0.0000000E+00
50 % POINT = 0.7897459
Note that under the distributional measures the uniform probability plot correlation 75 % POINT = 1.373753
coefficient (PPCC) value is significantly larger than the normal PPCC value. This is 90 % POINT = 2.094885
evidence that the uniform distribution fits these data better than does a normal distribution. 95 % POINT = 2.622929
99 % POINT = 3.821479
99.9 % POINT = 5.506884

2.905608 % Point: 0.7983007E-01

3. CONCLUSION (AT THE 5% LEVEL):


THERE IS NO SHIFT IN VARIATION.
THUS: HOMOGENEOUS WITH RESPECT TO VARIATION.
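Python Sketch   As a rough, non-authoritative illustration, the median-based Levene
                test on four quarters of the data can also be run in Python with
                scipy (center="median" gives the Brown-Forsythe variant used here);
                the file name is a hypothetical placeholder.

    import numpy as np
    from scipy import stats

    y = np.loadtxt("uniform_random.dat")     # hypothetical data file
    quarters = np.array_split(y, 4)          # four equal-sized intervals

    stat, p = stats.levene(*quarters, center="median")
    print(f"Levene statistic = {stat:.4f}, p-value = {p:.3f}")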


In this case, the Levene test indicates that the standard deviations are not significantly
different in the 4 intervals. 1 169.0 166.5000 6.6546 0.38
2 66.0 62.2917 4.4454 0.83
3 18.0 16.5750 3.4338 0.41
Randomness 4 7.0 3.4458 1.7786 2.00
There are many ways in which data can be non-random. However, most common forms of 5 1.0 0.5895 0.7609 0.54
non-randomness can be detected with a few simple tests. The lag plot in the 4-plot in the 6 1.0 0.0858 0.2924 3.13
previous section is a simple graphical technique. 7 1.0 0.0109 0.1042 9.49
8 0.0 0.0012 0.0349 -0.03
Another check is an autocorrelation plot that shows the autocorrelations for various lags. 9 0.0 0.0001 0.0111 -0.01
10 0.0 0.0000 0.0034 0.00
Confidence bands can be plotted using 95% and 99% confidence levels. Points outside this RUNS DOWN
band indicate statistically significant values (lag 0 is always 1). Dataplot generated the STATISTIC = NUMBER OF RUNS DOWN
following autocorrelation plot. OF LENGTH EXACTLY I
I STAT EXP(STAT) SD(STAT) Z

1 113.0 104.2083 10.2792 0.86


2 43.0 45.7167 5.2996 -0.51
3 11.0 13.1292 3.2297 -0.66
4 1.0 2.8563 1.6351 -1.14
5 0.0 0.5037 0.7045 -0.71
6 0.0 0.0749 0.2733 -0.27
7 0.0 0.0097 0.0982 -0.10
8 0.0 0.0011 0.0331 -0.03
9 0.0 0.0001 0.0106 -0.01
10 0.0 0.0000 0.0032 0.00
STATISTIC = NUMBER OF RUNS DOWN
OF LENGTH I OR MORE
I STAT EXP(STAT) SD(STAT) Z

1 168.0 166.5000 6.6546 0.23


2 55.0 62.2917 4.4454 -1.64
3 12.0 16.5750 3.4338 -1.33
4 1.0 3.4458 1.7786 -1.38
5 0.0 0.5895 0.7609 -0.77
6 0.0 0.0858 0.2924 -0.29
7 0.0 0.0109 0.1042 -0.10
8 0.0 0.0012 0.0349 -0.03
The lag 1 autocorrelation, which is generally the one of most interest, is 0.03. The critical 9 0.0 0.0001 0.0111 -0.01
values at the 5% significance level are -0.087 and 0.087. This indicates that the lag 1 10 0.0 0.0000 0.0034 0.00
autocorrelation is not statistically significant, so there is no evidence of non-randomness. RUNS TOTAL = RUNS UP + RUNS DOWN
STATISTIC = NUMBER OF RUNS TOTAL
A common test for randomness is the runs test. OF LENGTH EXACTLY I
I STAT EXP(STAT) SD(STAT) Z
RUNS UP 1 216.0 208.4167 14.5370 0.52
STATISTIC = NUMBER OF RUNS UP 2 91.0 91.4333 7.4947 -0.06
OF LENGTH EXACTLY I 3 22.0 26.2583 4.5674 -0.93
I STAT EXP(STAT) SD(STAT) Z 4 7.0 5.7127 2.3123 0.56
5 0.0 1.0074 0.9963 -1.01
1 103.0 104.2083 10.2792 -0.12 6 0.0 0.1498 0.3866 -0.39
2 48.0 45.7167 5.2996 0.43 7 1.0 0.0193 0.1389 7.06
3 11.0 13.1292 3.2297 -0.66 8 0.0 0.0022 0.0468 -0.05
4 6.0 2.8563 1.6351 1.92 9 0.0 0.0002 0.0150 -0.01
5 0.0 0.5037 0.7045 -0.71 10 0.0 0.0000 0.0045 0.00
6 0.0 0.0749 0.2733 -0.27 STATISTIC = NUMBER OF RUNS TOTAL
7 1.0 0.0097 0.0982 10.08 OF LENGTH I OR MORE
8 0.0 0.0011 0.0331 -0.03 I STAT EXP(STAT) SD(STAT) Z
9 0.0 0.0001 0.0106 -0.01
10 0.0 0.0000 0.0032 0.00 1 337.0 333.0000 9.4110 0.43
STATISTIC = NUMBER OF RUNS UP 2 121.0 124.5833 6.2868 -0.57
OF LENGTH I OR MORE 3 30.0 33.1500 4.8561 -0.65
I STAT EXP(STAT) SD(STAT) Z

4 8.0 6.8917 2.5154 0.44
5 1.0 1.1790 1.0761 -0.17
6 1.0 0.1716 0.4136 2.00 Model Based on the graphical and quantitative analysis, we use the model
7 1.0 0.0217 0.1474 6.64 Yi = C + Ei
8 0.0 0.0024 0.0494 -0.05
9 0.0 0.0002 0.0157 -0.02 where C is estimated by the mid-range and the uncertainty interval for C is based on a
10 0.0 0.0000 0.0047 0.00 bootstrap analysis. Specifically,
LENGTH OF THE LONGEST RUN UP = 7
LENGTH OF THE LONGEST RUN DOWN = 4 C = 0.499
LENGTH OF THE LONGEST RUN UP OR DOWN = 7 95% confidence limit for C = (0.497,0.503)
NUMBER OF POSITIVE DIFFERENCES = 263
NUMBER OF NEGATIVE DIFFERENCES = 236 Univariate It is sometimes useful and convenient to summarize the above results in a report. The report
NUMBER OF ZERO DIFFERENCES = 0 Report for the 500 uniform random numbers follows.

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically
significant at the 5% level. This runs test does not indicate any significant non-randomness. Analysis for 500 uniform random numbers
There is a statistically significant value for runs of length 7. However, further examination 1: Sample Size = 500
of the table shows that there is in fact a single run of length 7, when approximately zero runs of that length are expected.
This is not sufficient evidence to conclude that the data are non-random. 2: Location
Mean = 0.50783
Standard Deviation of Mean = 0.013163
Distributional Probability plots are a graphical test of assessing whether a particular distribution provides 95% Confidence Interval for Mean = (0.48197,0.533692)
Analysis an adequate fit to a data set. Drift with respect to location? = NO

A quantitative enhancement to the probability plot is the correlation coefficient of the points 3: Variation
on the probability plot. For this data set the correlation coefficient, from the summary table Standard Deviation = 0.294326
95% Confidence Interval for SD = (0.277144,0.313796)
above, is 0.977. Since this is less than the critical value of 0.987 (this is a tabulated value), Drift with respect to variation?
the normality assumption is rejected. (based on Levene's test on quarters
of the data) = NO
Chi-square and Kolmogorov-Smirnov goodness-of-fit tests are alternative methods for
assessing distributional adequacy. The Wilk-Shapiro and Anderson-Darling tests can be 4: Distribution
Normal PPCC = 0.999569
used to test for normality. Dataplot generates the following output for the Anderson-Darling Data are Normal?
normality test. (as measured by Normal PPCC) = NO

ANDERSON-DARLING 1-SAMPLE TEST Uniform PPCC = 0.9995


THAT THE DATA CAME FROM A NORMAL DISTRIBUTION Data are Uniform?
(as measured by Uniform PPCC) = YES
1. STATISTICS:
NUMBER OF OBSERVATIONS = 500 5: Randomness
MEAN = 0.5078304 Autocorrelation = -0.03099
STANDARD DEVIATION = 0.2943252 Data are Random?
(as measured by autocorrelation) = YES
ANDERSON-DARLING TEST STATISTIC VALUE = 5.719849
ADJUSTED TEST STATISTIC VALUE = 5.765036 6: Statistical Control
(i.e., no drift in location or scale,
2. CRITICAL VALUES: data is random, distribution is
90 % POINT = 0.6560000 fixed, here we are testing only for
95 % POINT = 0.7870000 fixed uniform)
97.5 % POINT = 0.9180000 Data Set is in Statistical Control? = YES
99 % POINT = 1.092000

3. CONCLUSION (AT THE 5% LEVEL):


THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION.
The Anderson-Darling test rejects the normality assumption because the value of the test
statistic, 5.72, is larger than the critical value of 1.092 at the 1% significance level.
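Python Sketch   As an illustrative, non-authoritative check of the distributional
                conclusions above, the following sketch compares the normal and
                uniform probability plot correlation coefficients (PPCC); the file
                name is a hypothetical placeholder.

    import numpy as np
    from scipy import stats

    y = np.loadtxt("uniform_random.dat")     # hypothetical data file

    _, (_, _, r_norm) = stats.probplot(y, dist="norm")     # normal PPCC
    _, (_, _, r_unif) = stats.probplot(y, dist="uniform")  # uniform PPCC

    print(f"normal  PPCC = {r_norm:.4f}")
    print(f"uniform PPCC = {r_unif:.4f}  (the larger value indicates the better fit)")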

distribution for these data.

4. Generate a histogram with an 4. The histogram indicates that a


overlaid uniform pdf. uniform distribution is a good
1. Exploratory Data Analysis distribution for these data.
1.4. EDA Case Studies
1.4.2. Case Studies 5. Generate a normal probability
1.4.2.2. Uniform Random Numbers 5. The normal probability plot verifies
plot. that the normal distribution is not a
reasonable distribution for these data.
1.4.2.2.4. Work This Example Yourself
6. Generate a uniform probability 6. The uniform probability plot verifies
View This page allows you to repeat the analysis outlined in the case study plot. that the uniform distribution is a
Dataplot description on the previous page using Dataplot . It is required that you reasonable distribution for these data.
Macro for have already downloaded and installed Dataplot and configured your
this Case browser to run Dataplot. Output from each analysis step below will be
Study displayed in one or more of the Dataplot windows. The four main 4. Generate the bootstrap plot.
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the 1. Generate a bootstrap plot. 1. The bootstrap plot clearly shows
bottom is a command entry window where commands can be typed in. the superiority of the mid-range
over the mean and median as the
location estimator of choice for
Data Analysis Steps Results and Conclusions this problem.

Click on the links below to start Dataplot and run this case study
5. Generate summary statistics, quantitative
yourself. Each step may use results from previous steps, so please be The links in this column will connect you with more detailed
analysis, and print a univariate report.
patient. Wait until the software verifies that the current step is information about each analysis step from the case study description.
complete before clicking on the next step.
1. Generate a table of summary 1. The summary statistics table displays
statistics. 25+ statistics.

1. Invoke Dataplot and read data.


2. Generate the mean, a confidence 2. The mean is 0.5078 and a 95%
1. Read in the data. 1. You have read 1 column of numbers interval for the mean, and compute confidence interval is (0.482,0.534).
into Dataplot, variable Y. a linear fit to detect drift in The linear fit indicates no drift in
location. location since the slope parameter is
statistically not significant.

2. 4-plot of the data.


3. Generate the standard deviation, a 3. The standard deviation is 0.29 with
1. 4-plot of Y. 1. Based on the 4-plot, there are no shifts confidence interval for the standard a 95% confidence interval of (0.277,0.314).
in location or scale, and the data do not deviation, and detect drift in variation Levene's test indicates no significant
seem to follow a normal distribution. by dividing the data into quarters and drift in variation.
computing Bartlett's test for equal
standard deviations.
3. Generate the individual plots.

1. Generate a run sequence plot. 1. The run sequence plot indicates that 4. Check for randomness by generating an 4. The lag 1 autocorrelation is -0.03.
there are no shifts of location or autocorrelation plot and a runs test. From the autocorrelation plot, this is
scale. within the 95% confidence interval
bands.
2. Generate a lag plot. 2. The lag plot does not indicate any
significant patterns (which would
5. Check for normality by computing the 5. The uniform probability plot correlation
show the data were not random).
normal probability plot correlation coefficient is 0.9995. This indicates that
3. Generate a histogram with an coefficient. the uniform distribution is a good fit.
3. The histogram indicates that a
overlaid normal pdf.
normal distribution is not a good


6. Print a univariate report (this assumes 6. The results are summarized in a


steps 2 thru 6 have already been run). convenient report.

1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies

1.4.2.3. Random Walk


Random This example illustrates the univariate analysis of a set of numbers
Walk derived from a random walk.

1. Background and Data


2. Test Underlying Assumptions
3. Develop Better Model
4. Validate New Model
5. Work This Example Yourself


0.413625
-0.002149
0.393170
0.538263
0.070583
1. Exploratory Data Analysis
0.473143
1.4. EDA Case Studies
1.4.2. Case Studies
0.132676
1.4.2.3. Random Walk 0.109111
-0.310553
0.179637
1.4.2.3.1. Background and Data -0.067454
-0.190747
-0.536916
Generation A random walk can be generated from a set of uniform random numbers -0.905751
by the formula: -0.518984
-0.579280
-0.643004
-1.014925
where U is a set of uniform random numbers. -0.517845
-0.860484
The motivation for studying a set of random walk data is to illustrate the -0.884081
effects of a known underlying autocorrelation structure (i.e., -1.147428
non-randomness) in the data. -0.657917
-0.470205
Software Most general purpose statistical software programs, including Dataplot, -0.798437
can generate data for a random walk. -0.637780
-0.666046
Resulting The following is the set of random walk numbers used for this case -1.093278
Data study. -1.089609
-0.853439
-0.695306
-0.399027 -0.206795
-0.645651 -0.507504
-0.625516 -0.696903
-0.262049 -1.116358
-0.407173 -1.044534
-0.097583 -1.481004
0.314156 -1.638390
0.106905 -1.270400
-0.017675 -1.026477
-0.037111 -1.123380
0.357631 -0.770683
0.820111 -0.510481
0.844148 -0.958825
0.550509 -0.531959
0.090709 -0.457141


-0.226603 0.929681
-0.201885 1.097632
-0.078000 1.501279
0.057733 1.650608
-0.228762 1.759718
-0.403292 2.255664
-0.414237 2.490551
-0.556689 2.508200
-0.772007 2.707382
-0.401024 2.816310
-0.409768 3.254166
-0.171804 2.890989
-0.096501 2.869330
-0.066854 3.024141
0.216726 3.291558
0.551008 3.260067
0.660360 3.265871
0.194795 3.542845
-0.031321 3.773240
0.453880 3.991880
0.730594 3.710045
1.136280 4.011288
0.708490 4.074805
1.149048 4.301885
1.258757 3.956416
1.102107 4.278790
1.102846 3.989947
0.720896 4.315261
0.764035 4.200798
1.072312 4.444307
0.897384 4.926084
0.965632 4.828856
0.759684 4.473179
0.679836 4.573389
0.955514 4.528605
1.290043 4.452401
1.753449 4.238427
1.542429 4.437589
1.873803 4.617955
2.043881 4.370246
1.728635 4.353939
1.289703 4.541142
1.501481 4.807353
1.888335 4.706447
1.408421 4.607011
1.416005 4.205943


3.756457 5.641670
3.482142 5.753639
3.126784 5.298265
3.383572 5.255743
3.846550 5.500935
4.228803 5.434664
4.110948 5.588610
4.525939 6.047952
4.478307 6.130557
4.457582 5.785299
4.822199 5.811995
4.605752 5.582793
5.053262 5.618730
5.545598 5.902576
5.134798 6.226537
5.438168 5.738371
5.397993 5.449965
5.838361 5.895537
5.925389 6.252904
6.159525 6.650447
6.190928 7.025909
6.024970 6.770340
5.575793 7.182244
5.516840 6.941536
5.211826 7.368996
4.869306 7.293807
4.912601 7.415205
5.339177 7.259291
5.415182 6.970976
5.003303 7.319743
4.725367 6.850454
4.350873 6.556378
4.225085 6.757845
3.825104 6.493083
3.726391 6.824855
3.301088 6.533753
3.767535 6.410646
4.211463 6.502063
4.418722 6.264585
4.554786 6.730889
4.987701 6.753715
4.993045 6.298649
5.337067 6.048126
5.789629 5.794463
5.726147 5.539049
5.934353 5.290072


5.409699 4.076168
5.843266 4.236168
5.680389 3.923607
5.185889 3.666004
5.451353 3.284967
5.003233 2.980621
5.102844 2.623622
5.566741 2.882375
5.613668 3.176416
5.352791 3.598001
5.140087 3.764744
4.999718 3.945428
5.030444 4.408280
5.428537 4.359831
5.471872 4.353650
5.107334 4.329722
5.387078 4.294088
4.889569 4.588631
4.492962 4.679111
4.591042 4.182430
4.930187 4.509125
4.857455 4.957768
4.785815 4.657204
5.235515 4.325313
4.865727 4.338800
4.855005 4.720353
4.920206 4.235756
4.880794 4.281361
4.904395 3.795872
4.795317 4.276734
5.163044 4.259379
4.807122 3.999663
5.246230 3.544163
5.111000 3.953058
5.228429 3.844006
5.050220 3.684740
4.610006 3.626058
4.489258 3.457909
4.399814 3.581150
4.606821 4.022659
4.974252 4.021602
5.190037 4.070183
5.084155 4.457137
5.276501 4.156574
4.917121 4.205304
4.534573 4.514814


4.055510 2.761665
3.938217 2.744913
4.180232 3.037743
3.803619 2.787390
3.553781 2.387619
3.583675 2.424489
3.708286 2.247564
4.005810 2.502179
4.419880 2.022278
4.881163 2.213027
5.348149 2.126914
4.950740 2.264833
5.199262 2.528391
4.753162 2.432792
4.640757 2.037974
4.327090 1.699475
4.080888 2.048244
3.725953 1.640126
3.939054 1.149858
3.463728 1.475253
3.018284 1.245675
2.661061 0.831979
3.099980 1.165877
3.340274 1.403341
3.230551 1.181921
3.287873 1.582379
3.497652 1.632130
3.014771 2.113636
3.040046 2.163129
3.342226 2.545126
3.656743 2.963833
3.698527 3.078901
3.759707 3.055547
4.253078 3.287442
4.183611 2.808189
4.196580 2.985451
4.257851 3.181679
4.683387 2.746144
4.224290 2.517390
3.840934 2.719231
4.329286 2.581058
3.909134 2.838745
3.685072 2.987765
3.356611 3.459642
2.956344 3.458684
2.800432 3.870956


4.324706 3.185503
4.411899 3.403148
4.735330 3.392646
4.775494 3.123339
4.681160 3.164713
4.462470 3.439843
3.992538 3.321929
3.719936 3.686229
3.427081 3.203069
3.256588 3.185843
3.462766 3.204924
3.046353 3.102996
3.537430 3.496552
3.579857 3.191575
3.931223 3.409044
3.590096 3.888246
3.136285 4.273767
3.391616 3.803540
3.114700 4.046417
2.897760 4.071581
2.724241 3.916256
2.557346 3.634441
2.971397 4.065834
2.479290 3.844651
2.305336 3.915219
1.852930
1.471948
1.510356
1.633737
1.727873
1.512994
1.603284
1.387950
1.767527
2.029734
2.447309
2.321470
2.435092
2.630118
2.520330
2.578147
2.729630
2.713100
3.107260
2.876659
2.774242
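Python Sketch   The formula referred to in the Generation paragraph above did not
                survive conversion of this page. As a hedged illustration only, a
                walk of this general kind can be built as the cumulative sum of
                (U - 0.5), where U is a set of uniform random numbers; the seed
                below is arbitrary and the exact construction used for the
                handbook's data is not reproduced here.

    import numpy as np

    rng = np.random.default_rng(0)       # arbitrary seed
    u = rng.uniform(size=500)            # uniform random numbers
    walk = np.cumsum(u - 0.5)            # random-walk values
    print(walk[:5])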

1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.3. Random Walk

1.4.2.3.2. Test Underlying Assumptions

Goal            The goal of this analysis is threefold:
                1. Determine if the univariate model:
                      Yi = C + Ei
                   is appropriate and valid.
                2. Determine if the typical underlying assumptions for an "in control"
                   measurement process are valid. These assumptions are:
                   1. random drawings;
                   2. from a fixed distribution;
                   3. with the distribution having a fixed location; and
                   4. the distribution having a fixed scale.
                3. Determine if the confidence interval
                      Ybar +/- 2*s/sqrt(N)
                   is appropriate and valid, with s denoting the standard deviation
                   of the original data.

4-Plot of Data  [4-plot of the 500 random walk values: run sequence plot, lag plot,
                histogram, and normal probability plot]

Interpretation  The assumptions are addressed by the graphics shown above:
                1. The run sequence plot (upper left) indicates significant shifts
                   in location over time.
                2. The lag plot (upper right) indicates significant non-randomness
                   in the data.
                3. When the assumptions of randomness and constant location and
                   scale are not satisfied, the distributional assumptions are not
                   meaningful. Therefore we do not attempt to make any
                   interpretation of the histogram (lower left) or the normal
                   probability plot (lower right).
                From the above plots, we conclude that the underlying assumptions
                are seriously violated. Therefore the Yi = C + Ei model is not
                valid.
                When the randomness assumption is seriously violated, a time series
                model may be appropriate. The lag plot often suggests a reasonable
                model. For example, in this case the strongly linear appearance of
                the lag plot suggests a model fitting Yi versus Yi-1 might be
                appropriate. When the data are non-random, it is helpful to
                supplement the lag plot with an autocorrelation plot and a spectral
                plot. Although in this case the lag plot is enough to suggest an
                appropriate model, we provide the autocorrelation and spectral plots
                for comparison.

Autocorrelation When the lag plot indicates significant non-randomness, it can be
Plot            helpful to follow up with an autocorrelation plot.

                [autocorrelation plot of the random walk data]

                This autocorrelation plot shows significant autocorrelation at lags
                1 through 100 in a linearly decreasing fashion.
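Python Sketch   As a rough, non-authoritative counterpart to the autocorrelation
                plot above and the spectral plot discussed next, the following
                sketch plots the autocorrelation function with approximate 95% bands
                and a simple periodogram; the file name is a hypothetical
                placeholder.

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import signal

    y = np.loadtxt("random_walk.dat")    # hypothetical data file
    n = len(y)
    yc = y - y.mean()

    max_lag = 100
    acf = np.array([np.sum(yc[:n - k] * yc[k:]) for k in range(max_lag + 1)])
    acf = acf / np.sum(yc ** 2)          # autocorrelations for lags 0..100

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.vlines(np.arange(max_lag + 1), 0, acf)
    ax1.axhline(1.96 / np.sqrt(n), linestyle="--")   # approximate 95% band
    ax1.axhline(-1.96 / np.sqrt(n), linestyle="--")
    ax1.set_title("Autocorrelation Plot")

    freqs, power = signal.periodogram(y)             # simple spectral estimate
    ax2.plot(freqs, power)
    ax2.set_title("Spectral (Periodogram) Plot")

    plt.tight_layout()
    plt.show()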



Spectral Plot Another useful plot for non-random data is the spectral plot. The value of the autocorrelation statistic, 0.987, is evidence of a very strong autocorrelation.

Location One way to quantify a change in location over time is to fit a straight line to the data set using
the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no
significant drift in the location, the slope parameter should be zero. For this data set, Dataplot
generates the following output:

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 500
NUMBER OF VARIABLES = 1
NO REPLICATION CASE

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 1.83351 (0.1721 ) 10.65
2 A1 X 0.552164E-02 (0.5953E-03) 9.275

RESIDUAL STANDARD DEVIATION = 1.921416


RESIDUAL DEGREES OF FREEDOM = 498

COEF AND SD(COEF) WRITTEN OUT TO FILE DPST1F.DAT


SD(PRED),95LOWER,95UPPER,99LOWER,99UPPER
WRITTEN OUT TO FILE DPST2F.DAT
This spectral plot shows a single dominant low frequency peak. REGRESSION DIAGNOSTICS WRITTEN OUT TO FILE DPST3F.DAT
PARAMETER VARIANCE-COVARIANCE MATRIX AND
Quantitative Although the 4-plot above clearly shows the violation of the assumptions, we supplement the INVERSE OF X-TRANSPOSE X MATRIX
WRITTEN OUT TO FILE DPST4F.DAT
Output graphical output with some quantitative measures.
The slope parameter, A1, has a t value of 9.3 which is statistically significant. This indicates
Summary that the slope cannot in fact be considered zero and so the conclusion is that we do not have
As a first step in the analysis, a table of summary statistics is computed from the data. The
Statistics constant location.
following table, generated by Dataplot, shows a typical set of statistics.

SUMMARY
Variation One simple way to detect a change in variation is with a Bartlett test after dividing the data set
into several equal-sized intervals. However, the Bartlett test is not robust for non-normality.
NUMBER OF OBSERVATIONS = 500
Since we know this data set is not approximated well by the normal distribution, we use the
alternative Levene test. In particular, we use the Levene test based on the median rather than the
*********************************************************************** mean. The choice of the number of intervals is somewhat arbitrary, although values of 4 or 8
* LOCATION MEASURES * DISPERSION MEASURES * are reasonable. Dataplot generated the following output for the Levene test.
***********************************************************************
* MIDRANGE = 0.2888407E+01 * RANGE = 0.9053595E+01 *
* MEAN = 0.3216681E+01 * STAND. DEV. = 0.2078675E+01 * LEVENE F-TEST FOR SHIFT IN VARIATION
* MIDMEAN = 0.4791331E+01 * AV. AB. DEV. = 0.1660585E+01 * (ASSUMPTION: NORMALITY)
* MEDIAN = 0.3612030E+01 * MINIMUM = -0.1638390E+01 *
* = * LOWER QUART. = 0.1747245E+01 * 1. STATISTICS
* = * LOWER HINGE = 0.1741042E+01 * NUMBER OF OBSERVATIONS = 500
* = * UPPER HINGE = 0.4682273E+01 * NUMBER OF GROUPS = 4
* = * UPPER QUART. = 0.4681717E+01 * LEVENE F TEST STATISTIC = 10.45940
* = * MAXIMUM = 0.7415205E+01 *
***********************************************************************
* RANDOMNESS MEASURES * DISTRIBUTIONAL MEASURES * FOR LEVENE TEST STATISTIC
*********************************************************************** 0 % POINT = 0.0000000E+00
* AUTOCO COEF = 0.9868608E+00 * ST. 3RD MOM. = -0.4448926E+00 * 50 % POINT = 0.7897459
* = 0.0000000E+00 * ST. 4TH MOM. = 0.2397789E+01 * 75 % POINT = 1.373753
* = 0.0000000E+00 * ST. WILK-SHA = -0.1279870E+02 * 90 % POINT = 2.094885
* = * UNIFORM PPCC = 0.9765666E+00 * 95 % POINT = 2.622929
* = * NORMAL PPCC = 0.9811183E+00 * 99 % POINT = 3.821479
* = * TUK -.5 PPCC = 0.7754489E+00 * 99.9 % POINT = 5.506884
* = * CAUCHY PPCC = 0.4165502E+00 *
***********************************************************************

99.99989 % Point: 10.45940 7 2.0 0.0097 0.0982 20.26
8 0.0 0.0011 0.0331 -0.03
3. CONCLUSION (AT THE 5% LEVEL): 9 0.0 0.0001 0.0106 -0.01
THERE IS A SHIFT IN VARIATION. 10 0.0 0.0000 0.0032 0.00
THUS: NOT HOMOGENEOUS WITH RESPECT TO VARIATION.

STATISTIC = NUMBER OF RUNS DOWN


In this case, the Levene test indicates that the standard deviations are significantly different in OF LENGTH I OR MORE
the 4 intervals since the test statistic of 10.46 is greater than the 95% critical value of 2.62.
Therefore we conclude that the scale is not constant.
I STAT EXP(STAT) SD(STAT) Z
Randomness Although the lag 1 autocorrelation coefficient above clearly shows the non-randomness, we 1 127.0 166.5000 6.6546 -5.94
show the output from a runs test as well. 2 58.0 62.2917 4.4454 -0.97
3 26.0 16.5750 3.4338 2.74
4 15.0 3.4458 1.7786 6.50
RUNS UP 5 9.0 0.5895 0.7609 11.05
6 4.0 0.0858 0.2924 13.38
STATISTIC = NUMBER OF RUNS UP 7 2.0 0.0109 0.1042 19.08
OF LENGTH EXACTLY I 8 0.0 0.0012 0.0349 -0.03
9 0.0 0.0001 0.0111 -0.01
I STAT EXP(STAT) SD(STAT) Z 10 0.0 0.0000 0.0034 0.00
1 63.0 104.2083 10.2792 -4.01
2 34.0 45.7167 5.2996 -2.21 RUNS TOTAL = RUNS UP + RUNS DOWN
3 17.0 13.1292 3.2297 1.20
4 4.0 2.8563 1.6351 0.70 STATISTIC = NUMBER OF RUNS TOTAL
5 1.0 0.5037 0.7045 0.70 OF LENGTH EXACTLY I
6 5.0 0.0749 0.2733 18.02
7 1.0 0.0097 0.0982 10.08 I STAT EXP(STAT) SD(STAT) Z
8 1.0 0.0011 0.0331 30.15
9 0.0 0.0001 0.0106 -0.01 1 132.0 208.4167 14.5370 -5.26
10 1.0 0.0000 0.0032 311.40 2 66.0 91.4333 7.4947 -3.39
3 28.0 26.2583 4.5674 0.38
4 10.0 5.7127 2.3123 1.85
STATISTIC = NUMBER OF RUNS UP 5 6.0 1.0074 0.9963 5.01
OF LENGTH I OR MORE 6 7.0 0.1498 0.3866 17.72
7 3.0 0.0193 0.1389 21.46
I STAT EXP(STAT) SD(STAT) Z 8 1.0 0.0022 0.0468 21.30
9 0.0 0.0002 0.0150 -0.01
1 127.0 166.5000 6.6546 -5.94 10 1.0 0.0000 0.0045 220.19
2 64.0 62.2917 4.4454 0.38
3 30.0 16.5750 3.4338 3.91
4 13.0 3.4458 1.7786 5.37 STATISTIC = NUMBER OF RUNS TOTAL
5 9.0 0.5895 0.7609 11.05 OF LENGTH I OR MORE
6 8.0 0.0858 0.2924 27.06
7 3.0 0.0109 0.1042 28.67 I STAT EXP(STAT) SD(STAT) Z
8 2.0 0.0012 0.0349 57.21
9 1.0 0.0001 0.0111 90.14 1 254.0 333.0000 9.4110 -8.39
10 1.0 0.0000 0.0034 298.08 2 122.0 124.5833 6.2868 -0.41
3 56.0 33.1500 4.8561 4.71
4 28.0 6.8917 2.5154 8.39
RUNS DOWN 5 18.0 1.1790 1.0761 15.63
6 12.0 0.1716 0.4136 28.60
STATISTIC = NUMBER OF RUNS DOWN 7 5.0 0.0217 0.1474 33.77
OF LENGTH EXACTLY I 8 2.0 0.0024 0.0494 40.43
9 1.0 0.0002 0.0157 63.73
I STAT EXP(STAT) SD(STAT) Z 10 1.0 0.0000 0.0047 210.77
1 69.0 104.2083 10.2792 -3.43
2 32.0 45.7167 5.2996 -2.59 LENGTH OF THE LONGEST RUN UP = 10
3 11.0 13.1292 3.2297 -0.66 LENGTH OF THE LONGEST RUN DOWN = 7
4 6.0 2.8563 1.6351 1.92 LENGTH OF THE LONGEST RUN UP OR DOWN = 10
5 5.0 0.5037 0.7045 6.38
6 2.0 0.0749 0.2733 7.04 NUMBER OF POSITIVE DIFFERENCES = 258



NUMBER OF NEGATIVE DIFFERENCES = 241
NUMBER OF ZERO DIFFERENCES = 0

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically
significant at the 5% level. Numerous values in this column are much larger than +/-1.96, so
we conclude that the data are not random. 1. Exploratory Data Analysis
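A rough illustration of the idea behind the runs test, written in Python rather than Dataplot, is
shown below. It is a simplified sketch that only counts the total number of runs up and down
(not runs of each length, as in the output above), and the file name "random_walk.dat" is a
hypothetical placeholder for the series analyzed here.

    import numpy as np

    y = np.loadtxt("random_walk.dat")       # hypothetical file holding the series
    signs = np.sign(np.diff(y))             # +1 for an increase, -1 for a decrease
    signs = signs[signs != 0]               # drop ties in this simplified version

    # a "run" is a maximal stretch of consecutive increases or decreases
    runs = 1 + np.count_nonzero(signs[1:] != signs[:-1])

    n = len(y)
    expected = (2 * n - 1) / 3.0            # classical mean of the runs count under randomness
    variance = (16 * n - 29) / 90.0         # and its variance
    z = (runs - expected) / np.sqrt(variance)
    print(f"runs = {runs}, expected = {expected:.1f}, z = {z:.2f}")

A |z| value far beyond 1.96 points to non-randomness, which is what the more detailed Dataplot
output above shows.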
1.4. EDA Case Studies
Distributional Since the quantitative tests show that the assumptions of randomness and constant location and 1.4.2. Case Studies
Assumptions scale are not met, the distributional measures will not be meaningful. Therefore these 1.4.2.3. Random Walk
quantitative tests are omitted.

1.4.2.3.3. Develop A Better Model


Lag Plot Since the underlying assumptions did not hold, we need to develop a better model.
Suggests
Better       The lag plot showed a distinct linear pattern. Given the definition of the lag plot, Yi versus
Model        Yi-1, a good candidate model is a model of the form

                 Y[i] = A0 + A1*Y[i-1] + E[i]

Fit A linear fit of this model generated the following output.


Output
LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N = 499
NUMBER OF VARIABLES = 1
NO REPLICATION CASE

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 0.501650E-01 (0.2417E-01) 2.075
2 A1 YIM1 0.987087 (0.6313E-02) 156.4

RESIDUAL STANDARD DEVIATION = 0.2931194


RESIDUAL DEGREES OF FREEDOM = 497

The slope parameter, A1, has a t value of 156.4 which is statistically significant. Also, the
residual standard deviation is 0.29. This can be compared to the standard deviation shown in
the summary table, which is 2.08. That is, the fit to the autoregressive model has reduced the
variability by a factor of 7.
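The same lag-one fit can be reproduced outside of Dataplot. The sketch below is an ordinary
least squares fit of Y[i] on Y[i-1] in Python; it is an illustration only, and "random_walk.dat" is
again a hypothetical file name for the 500-point series.

    import numpy as np

    y = np.loadtxt("random_walk.dat")            # hypothetical data file
    y_lag, y_cur = y[:-1], y[1:]                 # Y[i-1] and Y[i]

    X = np.column_stack([np.ones_like(y_lag), y_lag])
    (a0, a1), *_ = np.linalg.lstsq(X, y_cur, rcond=None)

    resid = y_cur - (a0 + a1 * y_lag)
    dof = len(y_cur) - 2                         # residual degrees of freedom
    resid_sd = np.sqrt(np.sum(resid**2) / dof)
    print(f"A0 = {a0:.4f}, A1 = {a1:.4f}, residual SD = {resid_sd:.4f}")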

Time This model is an example of a time series model. More extensive discussion of time series is
Series given in the Process Monitoring chapter.
Model



1.4.2.3.4. Validate New Model

Plot          The first step in verifying the model is to plot the predicted values from
Predicted     the fit with the original data.
with Original
Data
              This plot indicates a reasonably good fit.

Test          In addition to the plot of the predicted values, the residual standard
Underlying    deviation from the fit also indicates a significant improvement for the
Assumptions   model. The next step is to validate the underlying assumptions for the
on the        error component, or residuals, from this model.
Residuals

4-Plot of
Residuals

Interpretation   The assumptions are addressed by the graphics shown above:
                 1. The run sequence plot (upper left) indicates no significant shifts
                    in location or scale over time.
                 2. The lag plot (upper right) exhibits a random appearance.
                 3. The histogram shows a relatively flat appearance. This indicates
                    that a uniform probability distribution may be an appropriate
                    model for the error component (or residuals).
                 4. The normal probability plot clearly shows that the normal
                    distribution is not an appropriate model for the error component.
                 A uniform probability plot can be used to further test the suggestion
                 that a uniform distribution might be a good model for the error
                 component.




Uniform
Probability
Plot of
Residuals
              Since the uniform probability plot is nearly linear, this verifies that a
              uniform distribution is a good model for the error component.

Conclusions   Since the residuals from our model satisfy the underlying assumptions,
              we conclude that

                  Y[i] = 0.0502 + 0.987*Y[i-1] + E[i]

              where the Ei follow a uniform distribution is a good model for this data
              set. We could simplify this model to

                  Y[i] = Y[i-1] + E[i]

              This has the advantage of simplicity (the current point is simply the
              previous point plus a uniformly distributed error term).

Time Series   This model is an example of a time series model. More extensive
Model         discussion of time series is given in the Process Monitoring chapter.

Using         In this case, the above model makes sense based on our definition of
Scientific and the random walk. That is, a random walk is the cumulative sum of
Engineering   uniformly distributed data points. It makes sense that modeling the
Knowledge     current point as the previous point plus a uniformly distributed error
              term is about as good as we can do. Although this case is a bit artificial
              in that we knew how the data were constructed, it is common and
              desirable to use scientific and engineering knowledge of the process
              that generated the data in formulating and testing models for the data.
              Quite often, several competing models will produce nearly equivalent
              mathematical results. In this case, selecting the model that best
              approximates the scientific understanding of the process is a reasonable
              choice.
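The mechanism just described is easy to reproduce. A minimal sketch (in Python rather than
Dataplot, with a Uniform(-0.5, 0.5) step distribution chosen here purely for illustration) of
generating such a random walk is:

    import numpy as np

    rng = np.random.default_rng(12345)          # arbitrary seed, for reproducibility
    steps = rng.uniform(-0.5, 0.5, size=500)    # uniformly distributed increments E[i]
    walk = np.cumsum(steps)                     # Y[i] = Y[i-1] + E[i], the random walk

Each point of the walk is the previous point plus a uniformly distributed error term, which is
exactly the simplified model given above.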



standard deviations.

5. The runs test indicates significant


5. Check for randomness by generating non-randomness.
1. Exploratory Data Analysis a runs test.
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.3. Random Walk

1.4.2.3.5. Work This Example Yourself 3. Generate the randomness plots.

View This page allows you to repeat the analysis outlined in the case study 1. Generate an autocorrelation plot. 1. The autocorrelation plot shows
Dataplot description on the previous page using Dataplot . It is required that you significant autocorrelation at lag 1.
Macro for have already downloaded and installed Dataplot and configured your
this Case browser to run Dataplot. Output from each analysis step below will be
Study displayed in one or more of the Dataplot windows. The four main low frequency peak.
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in. 4. Fit Yi = A0 + A1*Yi-1 + Ei
and validate.

Data Analysis Steps Results and Conclusions 1. Generate the fit. 1. The residual standard deviation from the
fit is 0.29 (compared to the standard
deviation of 2.08 from the original
Click on the links below to start Dataplot and run this case data).
The links in this column will connect you with more detailed
study yourself. Each step may use results from previous steps,
information about each analysis step from the case study
so please be patient. Wait until the software verifies that the
description.
current step is complete before clicking on the next step. 2. Plot fitted line with original data. 2. The plot of the predicted values with
the original data indicates a good fit.

1. Invoke Dataplot and read data. 3. Generate a 4-plot of the residuals 3. The 4-plot indicates that the assumptions
from the fit. of constant location and scale are valid.
1. Read in the data. 1. You have read 1 column of numbers
The lag plot indicates that the data are
into Dataplot, variable Y.
random. However, the histogram and normal
probability plot indicate that the uniform
distribution might be a better model for
2. Validate assumptions. the residuals than the normal
distribution.
1. 4-plot of Y. 1. Based on the 4-plot, there are shifts
in location and scale and the data are not 4. Generate a uniform probability plot
random. of the residuals. 4. The uniform probability plot verifies
that the residuals can be fit by a
2. Generate a table of summary 2. The summary statistics table displays uniform distribution.
statistics. 25+ statistics.

3. Generate a linear fit to detect 3. The linear fit indicates drift in


drift in location. location since the slope parameter
is statistically significant.

4. Detect drift in variation by 4. Levene's test indicates significant


dividing the data into quarters and drift in variation.
computing Levene's test for equal



1.4.2.4. Josephson Junction Cryothermometry

Josephson Junction   This example illustrates the univariate analysis of Josephson
Cryothermometry      junction cryothermometry.

   1. Background and Data
   2. Graphical Output and Interpretation
   3. Quantitative Output and Interpretation
   4. Work This Example Yourself


1.4.2.4.1. Background and Data

Generation   This data set was collected by Bob Soulen of NIST in October, 1971 as
             a sequence of observations collected equi-spaced in time from a volt
             meter to ascertain the process temperature in a Josephson junction
             cryothermometry (low temperature) experiment. The response variable
             is voltage counts.

Motivation   The motivation for studying this data set is to illustrate the case where
             there is discreteness in the measurements, but the underlying
             assumptions hold. In this case, the discreteness is due to the data being
             integers.

             This file can be read by Dataplot with the following commands:

                 SKIP 25
                 SET READ FORMAT 5F5.0
                 READ SOULEN.DAT Y
                 SET READ FORMAT

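For readers not using Dataplot, an equivalent read can be sketched in Python. This is an
illustration only, assuming the same 25-line header and five-values-per-row layout of
SOULEN.DAT described above:

    import numpy as np

    # skip the 25 header lines, then flatten the 5-column layout into one series
    y = np.loadtxt("SOULEN.DAT", skiprows=25).ravel()
    print(len(y), y[:5])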
Resulting The following are the data used for this case study.
Data
2899 2898 2898 2900 2898
2901 2899 2901 2900 2898
2898 2898 2898 2900 2898
2897 2899 2897 2899 2899
2900 2897 2900 2900 2899
2898 2898 2899 2899 2899
2899 2899 2898 2899 2899
2899 2902 2899 2900 2898
2899 2899 2899 2899 2899
2899 2900 2899 2900 2898
2901 2900 2899 2899 2899
2899 2899 2900 2899 2898
2898 2898 2900 2896 2897




2899 2899 2900 2898 2900 2897 2899 2900 2899 2897
2901 2898 2899 2901 2900 2898 2900 2900 2898 2898
2898 2900 2899 2899 2897 2899 2900 2898 2900 2900
2899 2898 2899 2899 2898 2898 2900 2898 2898 2898
2899 2897 2899 2899 2897 2898 2898 2899 2898 2900
2899 2897 2899 2897 2897 2897 2899 2898 2899 2898
2899 2897 2898 2898 2899 2897 2900 2901 2899 2898
2897 2898 2897 2899 2899 2898 2901 2898 2899 2897
2898 2898 2897 2898 2895 2899 2897 2896 2898 2898
2897 2898 2898 2896 2898 2899 2900 2896 2897 2897
2898 2897 2896 2898 2898 2898 2899 2899 2898 2898
2897 2897 2898 2898 2896 2897 2897 2898 2897 2897
2898 2898 2896 2899 2898 2898 2898 2898 2896 2895
2898 2898 2899 2899 2898 2898 2898 2898 2896 2898
2898 2899 2899 2899 2900 2898 2898 2897 2897 2899
2900 2901 2899 2898 2898 2896 2900 2897 2897 2898
2900 2899 2898 2901 2897 2896 2897 2898 2898 2898
2898 2898 2900 2899 2899 2897 2897 2898 2899 2897
2898 2898 2899 2898 2901 2898 2899 2897 2900 2896
2900 2897 2897 2898 2898 2899 2897 2898 2897 2900
2900 2898 2899 2898 2898 2899 2900 2897 2897 2898
2898 2896 2895 2898 2898 2897 2899 2899 2898 2897
2898 2898 2897 2897 2895 2901 2900 2898 2901 2899
2897 2897 2900 2898 2896 2900 2899 2898 2900 2900
2897 2898 2898 2899 2898 2899 2898 2897 2900 2898
2897 2898 2898 2896 2900 2898 2897 2899 2898 2900
2899 2898 2896 2898 2896 2899 2898 2899 2897 2900
2896 2896 2897 2897 2896 2898 2902 2897 2898 2899
2897 2897 2896 2898 2896 2899 2899 2898 2897 2898
2898 2896 2897 2896 2897 2897 2898 2899 2900 2900
2897 2898 2897 2896 2895 2899 2898 2899 2900 2899
2898 2896 2896 2898 2896 2900 2899 2899 2899 2899
2898 2898 2897 2897 2898 2899 2898 2899 2899 2900
2897 2899 2896 2897 2899 2902 2899 2900 2900 2901
2900 2898 2898 2897 2898 2899 2901 2899 2899 2902
2899 2899 2900 2900 2900 2898 2898 2898 2898 2899
2900 2899 2899 2899 2898 2899 2900 2900 2900 2898
2900 2901 2899 2898 2900 2899 2899 2900 2899 2900
2901 2901 2900 2899 2898 2899 2900 2898 2898 2898
2901 2899 2901 2900 2901 2900 2898 2899 2900 2899
2898 2900 2900 2898 2900 2899 2900 2898 2898 2899
2900 2898 2899 2901 2900 2899 2899 2899 2898 2898
2899 2899 2900 2900 2899 2897 2898 2899 2897 2897
2900 2901 2899 2898 2898 2901 2898 2897 2898 2899
2899 2896 2898 2897 2898 2898 2897 2899 2898 2897
2898 2897 2897 2897 2898 2898 2898 2897 2898 2899




2899 2899 2899 2900 2899


2899 2897 2898 2899 2900
2898 2897 2901 2899 2901
2898 2899 2901 2900 2900
2899 2900 2900 2900 2900
1. Exploratory Data Analysis
2901 2900 2901 2899 2897
1.4. EDA Case Studies
2900 2900 2901 2899 2898 1.4.2. Case Studies
2900 2899 2899 2900 2899 1.4.2.4. Josephson Junction Cryothermometry
2900 2899 2900 2899 2901
2900 2900 2899 2899 2898
2899 2900 2898 2899 2899 1.4.2.4.2. Graphical Output and
2901 2898 2898 2900 2899
2899 2898 2897 2898 2897 Interpretation
2899 2899 2899 2898 2898
2897 2898 2899 2897 2897 Goal The goal of this analysis is threefold:
2899 2898 2898 2899 2899
1. Determine if the univariate model:
2901 2899 2899 2899 2897
2900 2896 2898 2898 2900
2897 2899 2897 2896 2898
is appropriate and valid.
2897 2898 2899 2896 2899
2901 2898 2898 2896 2897 2. Determine if the typical underlying assumptions for an "in
2899 2897 2898 2899 2898 control" measurement process are valid. These assumptions are:
2898 2898 2898 2898 2898 1. random drawings;
2899 2900 2899 2901 2898 2. from a fixed distribution;
2899 2899 2898 2900 2898
3. with the distribution having a fixed location; and
2899 2899 2901 2900 2901
2899 2901 2899 2901 2899 4. the distribution having a fixed scale.
2900 2902 2899 2898 2899 3. Determine if the confidence interval
2900 2899 2900 2900 2901
2900 2899 2901 2901 2899
2898 2901 2897 2898 2901 is appropriate and valid where s is the standard deviation of the
2900 2902 2899 2900 2898 original data.
2900 2899 2900 2899 2899
2899 2898 2900 2898 2899
2899 2899 2899 2898 2900



4-Plot of
Data

Interpretation   The assumptions are addressed by the graphics shown above:
                 1. The run sequence plot (upper left) indicates that the data do not
                    have any significant shifts in location or scale over time.
                 2. The lag plot (upper right) does not indicate any non-random
                    pattern in the data.
                 3. The histogram (lower left) shows that the data are reasonably
                    symmetric, there do not appear to be significant outliers in the
                    tails, and it is reasonable to assume that the data can be fit
                    with a normal distribution.
                 4. The normal probability plot (lower right) is difficult to interpret
                    due to the fact that there are only a few distinct values with
                    many repeats.
                 The integer data with only a few distinct values and many repeats
                 account for the discrete appearance of several of the plots (e.g., the lag
                 plot and the normal probability plot). In this case, the nature of the data
                 makes the normal probability plot difficult to interpret, especially since
                 each number is repeated many times. However, the histogram indicates
                 that a normal distribution should provide an adequate model for the
                 data.

                 From the above plots, we conclude that the underlying assumptions are
                 valid and the data can be reasonably approximated with a normal
                 distribution. Therefore, the commonly used uncertainty standard is
                 valid and appropriate. The numerical values for this model are given in
                 the Quantitative Output and Interpretation section.

Individual       Although it is normally not necessary, the plots can be generated
Plots            individually to give more detail.

Run
Sequence
Plot

Lag Plot




Histogram
(with
overlaid
Normal PDF)

Normal
Probability
Plot

1.4.2.4.3. Quantitative Output and Interpretation


Summary As a first step in the analysis, a table of summary statistics is computed from the data. The
Statistics following table, generated by Dataplot, shows a typical set of statistics.

SUMMARY

NUMBER OF OBSERVATIONS = 140

***********************************************************************
* LOCATION MEASURES * DISPERSION MEASURES *
***********************************************************************
* MIDRANGE = 0.2899000E+04 * RANGE = 0.6000000E+01 *
*  MEAN      =  0.2898721E+04  *  STAND. DEV.   =  0.1235377E+01  *
*  MIDMEAN   =  0.2898457E+04  *  AV. AB. DEV.  =  0.9642857E+00  *
*  MEDIAN    =  0.2899000E+04  *  MINIMUM       =  0.2896000E+04  *
*            =                 *  LOWER QUART.  =  0.2898000E+04  *
* = * LOWER HINGE = 0.2898000E+04 *
* = * UPPER HINGE = 0.2899500E+04 *
* = * UPPER QUART. = 0.2899250E+04 *
* = * MAXIMUM = 0.2902000E+04 *
***********************************************************************
* RANDOMNESS MEASURES * DISTRIBUTIONAL MEASURES *
***********************************************************************
* AUTOCO COEF = 0.2925397E+00 * ST. 3RD MOM. = 0.1271097E+00 *
* = 0.0000000E+00 * ST. 4TH MOM. = 0.2571418E+01 *
* = 0.0000000E+00 * ST. WILK-SHA = -0.3911592E+01 *
* = * UNIFORM PPCC = 0.9580541E+00 *
* = * NORMAL PPCC = 0.9701443E+00 *
* = * TUK -.5 PPCC = 0.8550686E+00 *
* = * CAUCHY PPCC = 0.6239791E+00 *
***********************************************************************

Location One way to quantify a change in location over time is to fit a straight line to the data set using
the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no
significant drift in the location, the slope parameter should be zero. For this data set, Dataplot
generates the following output:

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 140
NUMBER OF VARIABLES = 1
NO REPLICATION CASE

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 2898.34 (0.2074 ) 0.1398E+05
2 A1 X 0.539896E-02 (0.2552E-02) 2.116




RESIDUAL STANDARD DEVIATION = 1.220212


          RESIDUAL DEGREES OF FREEDOM = 138

          COEF AND SD(COEF) WRITTEN OUT TO FILE DPST1F.DAT
          SD(PRED),95LOWER,95UPPER,99LOWER,99UPPER
                    WRITTEN OUT TO FILE DPST2F.DAT
          REGRESSION DIAGNOSTICS WRITTEN OUT TO FILE DPST3F.DAT
          PARAMETER VARIANCE-COVARIANCE MATRIX AND
          INVERSE OF X-TRANSPOSE X MATRIX
                    WRITTEN OUT TO FILE DPST4F.DAT

Randomness   There are many ways in which data can be non-random. However, most common forms of
             non-randomness can be detected with a few simple tests. The lag plot in the previous section is
             a simple graphical technique.

             Another check is an autocorrelation plot that shows the autocorrelations for various lags.
             Confidence bands can be plotted at the 95% and 99% confidence levels. Points outside this
             band indicate statistically significant values (lag 0 is always 1). Dataplot generated the
             following autocorrelation plot.
The slope parameter, A1, has a t value of 2.1 which is statistically significant (the critical value
is 1.98). However, the value of the slope is 0.0054. Given that the slope is nearly zero, the
assumption of constant location is not seriously violated even though it is (just barely)
statistically significant.
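The same drift check can be sketched outside of Dataplot. The snippet below (Python, with y
assumed to already hold the 140 voltage counts) fits the straight line against the index variable
and reports the slope and its significance:

    import numpy as np
    from scipy import stats

    x = np.arange(1, len(y) + 1)             # index variable X = 1, 2, ..., N
    fit = stats.linregress(x, y)             # least squares straight-line fit
    t_value = fit.slope / fit.stderr         # t statistic for the slope
    print(f"slope = {fit.slope:.5f}, t = {t_value:.2f}, p = {fit.pvalue:.3f}")

A slope that is statistically significant but numerically tiny, as here, is usually judged to be
practically unimportant.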

Variation One simple way to detect a change in variation is with a Bartlett test after dividing the data set
into several equal-sized intervals. However, the Bartlett test is not robust for non-normality.
Since the nature of the data (a few distinct points repeated many times) makes the normality
assumption questionable, we use the alternative Levene test. In particular, we use the Levene
test based on the median rather than the mean. The choice of the number of intervals is somewhat
arbitrary, although values of 4 or 8 are reasonable. Dataplot generated the following output for
the Levene test.
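The same test can be sketched in Python; scipy's Levene test with center="median" matches the
median-based version described above, with y assumed to hold the data. The Dataplot output
follows the sketch.

    import numpy as np
    from scipy import stats

    quarters = np.array_split(y, 4)          # divide the series into 4 equal-sized intervals
    stat, p = stats.levene(*quarters, center="median")
    print(f"Levene statistic = {stat:.4f}, p-value = {p:.4f}")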

LEVENE F-TEST FOR SHIFT IN VARIATION


(ASSUMPTION: NORMALITY)

1. STATISTICS
NUMBER OF OBSERVATIONS = 140
NUMBER OF GROUPS = 4
LEVENE F TEST STATISTIC = 0.4128718
           FOR LEVENE TEST STATISTIC
              0         % POINT  =  0.0000000E+00
              50        % POINT  =  0.7926317
              75        % POINT  =  1.385201
              90        % POINT  =  2.124494
              95        % POINT  =  2.671178
              99        % POINT  =  3.928924
              99.9      % POINT  =  5.737571

              25.59809  % Point:    0.4128718

  3. CONCLUSION (AT THE 5% LEVEL):
        THERE IS NO SHIFT IN VARIATION.
        THUS: HOMOGENEOUS WITH RESPECT TO VARIATION.

Since the Levene test statistic value of 0.41 is less than the 95% critical value of 2.67, we
conclude that the standard deviations are not significantly different in the 4 intervals.

The lag 1 autocorrelation, which is generally the one of most interest, is 0.29. The critical
values at the 5% level of significance are -0.087 and 0.087. This indicates that the lag 1
autocorrelation is statistically significant, so there is some evidence for non-randomness.

A common test for randomness is the runs test.

                RUNS UP

        STATISTIC = NUMBER OF RUNS UP
            OF LENGTH EXACTLY I

  I     STAT    EXP(STAT)   SD(STAT)       Z

  1     15.0     29.2083     5.4233      -2.62
  2     10.0     12.7167     2.7938      -0.97
  3      2.0      3.6292     1.6987      -0.96
  4      4.0      0.7849     0.8573       3.75
  5      2.0      0.1376     0.3683       5.06
  6      0.0      0.0204     0.1425      -0.14
7 1.0 0.0026 0.0511 19.54
8 0.0 0.0003 0.0172 -0.02
9 0.0 0.0000 0.0055 -0.01
10 0.0 0.0000 0.0017 0.00

STATISTIC = NUMBER OF RUNS UP


OF LENGTH I OR MORE



I STAT EXP(STAT) SD(STAT) Z 7 1.0 0.0052 0.0722 13.78
8 0.0 0.0006 0.0243 -0.02
1 34.0 46.5000 3.5048 -3.57 9 0.0 0.0001 0.0077 -0.01
2 19.0 17.2917 2.3477 0.73 10 0.0 0.0000 0.0023 0.00
3 9.0 4.5750 1.8058 2.45
4 7.0 0.9458 0.9321 6.49
5 3.0 0.1609 0.3976 7.14 STATISTIC = NUMBER OF RUNS TOTAL
6 1.0 0.0233 0.1524 6.41 OF LENGTH I OR MORE
7 1.0 0.0029 0.0542 18.41
8 0.0 0.0003 0.0181 -0.02 I STAT EXP(STAT) SD(STAT) Z
9 0.0 0.0000 0.0057 -0.01
10 0.0 0.0000 0.0017 0.00 1 68.0 93.0000 4.9565 -5.04
2 37.0 34.5833 3.3202 0.73
3 17.0 9.1500 2.5537 3.07
RUNS DOWN 4 10.0 1.8917 1.3182 6.15
5 5.0 0.3218 0.5623 8.32
STATISTIC = NUMBER OF RUNS DOWN 6 1.0 0.0466 0.2155 4.42
OF LENGTH EXACTLY I 7 1.0 0.0059 0.0766 12.98
8 0.0 0.0007 0.0256 -0.03
I STAT EXP(STAT) SD(STAT) Z 9 0.0 0.0001 0.0081 -0.01
10 0.0 0.0000 0.0024 0.00
1 16.0 29.2083 5.4233 -2.44
2 10.0 12.7167 2.7938 -0.97
3 5.0 3.6292 1.6987 0.81 LENGTH OF THE LONGEST RUN UP = 7
4 1.0 0.7849 0.8573 0.25 LENGTH OF THE LONGEST RUN DOWN = 5
5 2.0 0.1376 0.3683 5.06 LENGTH OF THE LONGEST RUN UP OR DOWN = 7
6 0.0 0.0204 0.1425 -0.14
7 0.0 0.0026 0.0511 -0.05 NUMBER OF POSITIVE DIFFERENCES = 48
8 0.0 0.0003 0.0172 -0.02 NUMBER OF NEGATIVE DIFFERENCES = 49
9 0.0 0.0000 0.0055 -0.01 NUMBER OF ZERO DIFFERENCES = 42
10 0.0 0.0000 0.0017 0.00
Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically significant
at the 5% level. The runs test indicates some mild non-randomness.
STATISTIC = NUMBER OF RUNS DOWN
OF LENGTH I OR MORE Although the runs test and lag 1 autocorrelation indicate some mild non-randomness, it is not
sufficient to reject the Yi = C + Ei model. At least part of the non-randomness can be explained
by the discrete nature of the data.
I STAT EXP(STAT) SD(STAT) Z

1 34.0 46.5000 3.5048 -3.57 Distributional Probability plots are a graphical test for assessing if a particular distribution provides an
2 18.0 17.2917 2.3477 0.30 Analysis
3 8.0 4.5750 1.8058 1.90
adequate fit to a data set.
4 3.0 0.9458 0.9321 2.20
5 2.0 0.1609 0.3976 4.63
A quantitative enhancement to the probability plot is the correlation coefficient of the points on
6 0.0 0.0233 0.1524 -0.15 the probability plot. For this data set the correlation coefficient is 0.970. Since this is less than
7 0.0 0.0029 0.0542 -0.05 the critical value of 0.987 (this is a tabulated value), the normality assumption is rejected.
8 0.0 0.0003 0.0181 -0.02
9 0.0 0.0000 0.0057 -0.01 Chi-square and Kolmogorov-Smirnov goodness-of-fit tests are alternative methods for
10 0.0 0.0000 0.0017 0.00
assessing distributional adequacy. The Wilk-Shapiro and Anderson-Darling tests can be used to
test for normality. Dataplot generates the following output for the Anderson-Darling normality
RUNS TOTAL = RUNS UP + RUNS DOWN test.
STATISTIC = NUMBER OF RUNS TOTAL
ANDERSON-DARLING 1-SAMPLE TEST
OF LENGTH EXACTLY I
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION
I STAT EXP(STAT) SD(STAT) Z
1. STATISTICS:
NUMBER OF OBSERVATIONS = 140
1 31.0 58.4167 7.6697 -3.57
MEAN = 2898.721
2 20.0 25.4333 3.9510 -1.38
STANDARD DEVIATION = 1.235377
3 7.0 7.2583 2.4024 -0.11
4 5.0 1.5698 1.2124 2.83
ANDERSON-DARLING TEST STATISTIC VALUE = 3.839233
5 4.0 0.2752 0.5208 7.15
ADJUSTED TEST STATISTIC VALUE = 3.944029
6 0.0 0.0407 0.2015 -0.20



2. CRITICAL VALUES:
90 % POINT = 0.6560000
95 % POINT = 0.7870000 Univariate It is sometimes useful and convenient to summarize the above results in a report.
97.5 % POINT = 0.9180000 Report
99 % POINT = 1.092000 Analysis for Josephson Junction Cryothermometry Data

3. CONCLUSION (AT THE 5% LEVEL): 1: Sample Size = 140


THE DATA DO NOT COME FROM A NORMAL DISTRIBUTION.
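A similar check can be run in Python. The sketch below uses scipy's Anderson-Darling test for
normality (y is again assumed to hold the 140 voltage counts; scipy reports its own table of
critical values, so the numbers will not match the Dataplot output exactly):

    from scipy import stats

    result = stats.anderson(y, dist="norm")
    print("A-D statistic:", result.statistic)
    for cv, sl in zip(result.critical_values, result.significance_level):
        print(f"  reject normality at the {sl}% level if the statistic exceeds {cv}")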
The Anderson-Darling test rejects the normality assumption because the test statistic, 3.84, is 2: Location
Mean = 2898.722
greater than the 99% critical value 1.092. Standard Deviation of Mean = 0.104409
95% Confidence Interval for Mean = (2898.515,2898.928)
Although the data are not strictly normal, the violation of the normality assumption is not severe Drift with respect to location? = YES
enough to conclude that the Yi = C + Ei model is unreasonable. At least part of the (Further analysis indicates that
non-normality can be explained by the discrete nature of the data. the drift, while statistically
significant, is not practically
significant)
Outlier A test for outliers is the Grubbs test. Dataplot generated the following output for Grubbs' test.
Analysis 3: Variation
Standard Deviation = 1.235377
GRUBBS TEST FOR OUTLIERS
95% Confidence Interval for SD = (1.105655,1.399859)
(ASSUMPTION: NORMALITY)
Drift with respect to variation?
(based on Levene's test on quarters
1. STATISTICS:
of the data) = NO
NUMBER OF OBSERVATIONS = 140
MINIMUM = 2896.000
4: Distribution
MEAN = 2898.721
Normal PPCC = 0.970145
MAXIMUM = 2902.000
Data are Normal?
STANDARD DEVIATION = 1.235377
(as measured by Normal PPCC) = NO
GRUBBS TEST STATISTIC = 2.653898
5: Randomness
Autocorrelation = 0.29254
2. PERCENT POINTS OF THE REFERENCE DISTRIBUTION
Data are Random?
FOR GRUBBS TEST STATISTIC
(as measured by autocorrelation) = NO
0 % POINT = 0.0000000E+00
50 % POINT = 2.874578
6: Statistical Control
75 % POINT = 3.074733
(i.e., no drift in location or scale,
90 % POINT = 3.320834
data are random, distribution is
95 % POINT = 3.495103
fixed, here we are testing only for
99 % POINT = 3.867364
fixed normal)
Data Set is in Statistical Control? = NO
3. CONCLUSION (AT THE 5% LEVEL):
THERE ARE NO OUTLIERS.
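Grubbs' test is not in every statistics library, but the statistic and its critical value are easy to
compute directly. The following Python sketch (y assumed to hold the data; the critical-value
formula is the standard two-sided Grubbs approximation based on the t distribution) illustrates
the calculation:

    import numpy as np
    from scipy import stats

    n = len(y)
    g = np.max(np.abs(y - y.mean())) / y.std(ddof=1)    # Grubbs test statistic

    alpha = 0.05
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))
    print(f"G = {g:.3f}, critical value = {g_crit:.3f}, outlier detected = {g > g_crit}")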
For this data set, Grubbs' test does not detect any outliers at the 10%, 5%, and 1% significance
levels.

       Note: Although we have violations of
       the assumptions, they are mild enough,
       and at least partially explained by the
       discrete nature of the data, so we may model
       the data as if it were in statistical
       control

Model   Although the randomness and normality assumptions were mildly violated, we conclude that a
        reasonable model for the data is:

            Y[i] = 2898.7 + E[i]
reasonable model for the data is:
7: Outliers?
(as determined by Grubbs test) = NO
In addition, a 95% confidence interval for the mean value is (2898.515,2898.928).
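The confidence interval itself follows from the usual t-based formula. A sketch in Python, with
y assumed to hold the data:

    import numpy as np
    from scipy import stats

    n = len(y)
    mean = y.mean()
    half_width = stats.t.ppf(0.975, n - 1) * y.std(ddof=1) / np.sqrt(n)
    print(f"95% CI for the mean: ({mean - half_width:.3f}, {mean + half_width:.3f})")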



overlaid normal pdf. 3. The histogram indicates that a
normal distribution is a good
distribution for these data.
4. Generate a normal probability
1. Exploratory Data Analysis plot. 4. The discrete nature of the data masks
1.4. EDA Case Studies the normality or non-normality of the
1.4.2. Case Studies
data somewhat. The plot indicates that
1.4.2.4. Josephson Junction Cryothermometry
a normal distribution provides a rough
approximation for the data.
1.4.2.4.4. Work This Example Yourself
View This page allows you to repeat the analysis outlined in the case study 4. Generate summary statistics, quantitative
Dataplot description on the previous page using Dataplot . It is required that you analysis, and print a univariate report.
Macro for have already downloaded and installed Dataplot and configured your
this Case browser to run Dataplot. Output from each analysis step below will be
Study displayed in one or more of the Dataplot windows. The four main statistics. 25+ statistics.
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the 2. Generate the mean, a confidence 2. The mean is 2898.72 and a 95%
bottom is a command entry window where commands can be typed in.
interval for the mean, and compute confidence interval is (2898.52,2898.93).
a linear fit to detect drift in The linear fit indicates no meaningful drift
location. in location since the value of the slope
Data Analysis Steps Results and Conclusions
parameter is near zero.

Click on the links below to start Dataplot and run this case study
3. Generate the standard deviation, a 3. The standard deviation is 1.24 with
yourself. Each step may use results from previous steps, so please be The links in this column will connect you with more detailed
patient. Wait until the software verifies that the current step is information about each analysis step from the case study description. confidence interval for the standard a 95% confidence interval of (1.11,1.40).
complete before clicking on the next step. deviation, and detect drift in variation Levene's test indicates no significant
by dividing the data into quarters and drift in variation.
computing Levene's test for equal
standard deviations.
1. Invoke Dataplot and read data.

1. Read in the data. 1. You have read 1 column of numbers 4. Check for randomness by generating an 4. The lag 1 autocorrelation is 0.29.
into Dataplot, variable Y. autocorrelation plot and a runs test. This indicates some mild non-randomness.

5. Check for normality by computing the 5. The normal probability plot correlation
2. 4-plot of the data. normal probability plot correlation coefficient is 0.970. At the 5% level,
coefficient. we reject the normality assumption.
1. 4-plot of Y. 1. Based on the 4-plot, there are no shifts
in location or scale. Due to the nature
of the data (a few distinct points with 6. Check for outliers using Grubbs' test. 6. Grubbs' test detects no outliers at the
many repeats), the normality assumption is 5% level.
questionable.

7. Print a univariate report (this assumes 7. The results are summarized in a


3. Generate the individual plots. steps 2 thru 6 have already been run). convenient report.

1. Generate a run sequence plot. 1. The run sequence plot indicates that
there are no shifts of location or
scale.

2. Generate a lag plot. 2. The lag plot does not indicate any
significant patterns (which would
show the data were not random).

3. Generate a histogram with an



1.4.2.5. Beam Deflections

Beam          This example illustrates the univariate analysis of beam deflection data.
Deflection
   1. Background and Data
   2. Test Underlying Assumptions
   3. Develop a Better Model
   4. Validate New Model
   5. Work This Example Yourself


1.4.2.5.1. Background and Data

Generation    This data set was collected by H. S. Lew of NIST in 1969 to measure
              steel-concrete beam deflections. The response variable is the deflection
              of a beam from the center point.

              The motivation for studying this data set is to show how the underlying
              assumptions are affected by periodic data.

              This file can be read by Dataplot with the following commands:

                  SKIP 25
                  READ LEW.DAT Y

Resulting     The following are the data used for this case study.
Data
-213
-564
-35
-15
141
115
-420
-360
203
-338
-431
194
-220
-513
154
-125
-559
92
-21
-579

http://www.itl.nist.gov/div898/handbook/eda/section4/eda425.htm [11/13/2003 5:33:24 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4251.htm (1 of 6) [11/13/2003 5:33:24 PM]



-52 17
99 48
-543 -568
-175 -135
162 162
-457 -430
-346 -422
204 172
-300 -74
-474 -577
164 -13
-107 92
-572 -534
-8 -243
83 194
-541 -355
-224 -465
180 156
-420 -81
-374 -578
201 -64
-236 139
-531 -449
83 -384
27 193
-564 -198
-112 -538
131 110
-507 -44
-254 -577
199 -6
-311 66
-495 -552
143 -164
-46 161
-579 -460
-90 -344
136 205
-472 -281
-338 -504
202 134
-287 -28
-477 -576
169 -118
-124 156
-568 -437




-381 -506
200 131
-220 -45
-540 -578
83 -80
11 138
-568 -462
-160 -361
172 201
-414 -211
-408 -554
188 32
-125 74
-572 -533
-32 -235
139 187
-492 -372
-321 -442
205 182
-262 -147
-504 -566
142 25
-83 68
-574 -535
0 -244
48 194
-571 -351
-106 -463
137 174
-501 -125
-266 -570
190 15
-391 72
-406 -550
194 -190
-186 172
-553 -424
83 -385
-13 198
-577 -218
-49 -536
103 96
-515
-280
201
300





1.4.2.5.2. Test Underlying Assumptions


Goal The goal of this analysis is threefold:
           1. Determine if the univariate model:

                  Y[i] = C + E[i]

              is appropriate and valid.


2. Determine if the typical underlying assumptions for an "in control" measurement process
are valid. These assumptions are:
1. random drawings;
2. from a fixed distribution;
3. with the distribution having a fixed location; and
4. the distribution having a fixed scale.
           3. Determine if the confidence interval

                  Ybar +/- 2*s/sqrt(N)

              is appropriate and valid where s is the standard deviation of the original data.

4-Plot of Data




Lag Plot

Interpretation   The assumptions are addressed by the graphics shown above:
1. The run sequence plot (upper left) indicates that the data do not have any significant
shifts in location or scale over time.
2. The lag plot (upper right) shows that the data are not random. The lag plot further
indicates the presence of a few outliers.
3. When the randomness assumption is thus seriously violated, the histogram (lower left)
and normal probability plot (lower right) are ignored since determining the distribution of
the data is only meaningful when the data are random.
From the above plots we conclude that the underlying randomness assumption is not valid.
Therefore, the model

is not appropriate.
We need to develop a better model. Non-random data can frequently be modeled using time
series methodology. Specifically, the circular pattern in the lag plot indicates that a sinusoidal
model might be appropriate. The sinusoidal model will be developed in the next section.
Individual    The plots can be generated individually for more detail. In this case, only the run sequence plot
Plots         and the lag plot are drawn since the distributional plots are not meaningful.

              We have drawn some lines and boxes on the plot to better isolate the outliers. The following
              output helps identify the points that are generating the outliers on the lag plot.

Run Sequence ****************************************************


** print y index xplot yplot subset yplot > 250 **
Plot ****************************************************

VARIABLES--Y INDEX XPLOT YPLOT

300.00 158.00 -506.00 300.00

****************************************************
** print y index xplot yplot subset xplot > 250 **
****************************************************

VARIABLES--Y INDEX XPLOT YPLOT

201.00 157.00 300.00 201.00

********************************************************
** print y index xplot yplot subset yplot -100 to 0
subset xplot -100 to 0 **
********************************************************

VARIABLES--Y INDEX XPLOT YPLOT

-35.00 3.00 -15.00 -35.00

*********************************************************
** print y index xplot yplot subset yplot 100 to 200
subset xplot 100 to 200 **
*********************************************************

VARIABLES--Y INDEX XPLOT YPLOT

141.00 5.00 115.00 141.00




That is, the third, fifth, and 158th points appear to be outliers.

Summary           As a first step in the analysis, a table of summary statistics is computed from the data. The
Statistics        following table, generated by Dataplot, shows a typical set of statistics.

Autocorrelation   When the lag plot indicates significant non-randomness, it can be helpful to follow up with an
Plot              autocorrelation plot.
SUMMARY

NUMBER OF OBSERVATIONS = 200

***********************************************************************
* LOCATION MEASURES * DISPERSION MEASURES *
***********************************************************************
* MIDRANGE = -0.1395000E+03 * RANGE = 0.8790000E+03 *
* MEAN = -0.1774350E+03 * STAND. DEV. = 0.2773322E+03 *
* MIDMEAN = -0.1797600E+03 * AV. AB. DEV. = 0.2492250E+03 *
* MEDIAN = -0.1620000E+03 * MINIMUM = -0.5790000E+03 *
* = * LOWER QUART. = -0.4510000E+03 *
* = * LOWER HINGE = -0.4530000E+03 *
* = * UPPER HINGE = 0.9400000E+02 *
* = * UPPER QUART. = 0.9300000E+02 *
* = * MAXIMUM = 0.3000000E+03 *
***********************************************************************
* RANDOMNESS MEASURES * DISTRIBUTIONAL MEASURES *
***********************************************************************
* AUTOCO COEF = -0.3073048E+00 * ST. 3RD MOM. = -0.5010057E-01 *
* = 0.0000000E+00 * ST. 4TH MOM. = 0.1503684E+01 *
* = 0.0000000E+00 * ST. WILK-SHA = -0.1883372E+02 *
* = * UNIFORM PPCC = 0.9925535E+00 *
*            =                 *  NORMAL  PPCC  =  0.9540811E+00  *
*            =                 *  TUK -.5 PPCC  =  0.7313794E+00  *
*            =                 *  CAUCHY  PPCC  =  0.4408355E+00  *
***********************************************************************

This autocorrelation plot shows a distinct cyclic pattern. As with the lag plot, this suggests a
sinusoidal model.

Spectral Plot   Another useful plot for non-random data is the spectral plot.

Location One way to quantify a change in location over time is to fit a straight line to the data set using
the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no
significant drift in the location, the slope parameter should be zero. For this data set, Dataplot
generates the following output:

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 200
NUMBER OF VARIABLES = 1
NO REPLICATION CASE

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 -178.175 ( 39.47 ) -4.514
2 A1 X 0.736593E-02 (0.3405 ) 0.2163E-01

RESIDUAL STANDARD DEVIATION = 278.0313


RESIDUAL DEGREES OF FREEDOM = 198
The slope parameter, A1, has a t value of 0.022 which is statistically not significant. This
indicates that the slope can in fact be considered zero.
This spectral plot shows a single dominant peak at a frequency of 0.3. This frequency of 0.3
will be used in fitting the sinusoidal model in the next section.
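The dominant frequency can also be located numerically. Here is a sketch in Python using
scipy's periodogram (y assumed to hold the 200 deflection values; the peak should fall near the
0.3 cycles-per-observation seen in the spectral plot):

    import numpy as np
    from scipy.signal import periodogram

    freqs, power = periodogram(y, detrend="constant")
    dominant = freqs[1:][np.argmax(power[1:])]      # skip the zero-frequency term
    print(f"dominant frequency is about {dominant:.4f} cycles per observation")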

Quantitative Although the lag plot, autocorrelation plot, and spectral plot clearly show the violation of the
Output randomness assumption, we supplement the graphical output with some quantitative measures.



I STAT EXP(STAT) SD(STAT) Z
Variation   One simple way to detect a change in variation is with a Bartlett test after dividing the data set
            into several equal-sized intervals. However, the Bartlett test is not robust for non-normality.
            Since the non-randomness of this data does not allow us to assume normality, we use the
            alternative Levene test. In particular, we use the Levene test based on the median rather than
            the mean. The choice of the number of intervals is somewhat arbitrary, although values of 4 or
            8 are reasonable. Dataplot generated the following output for the Levene test.

  1    127.0    166.5000     6.6546      -5.94
  2     64.0     62.2917     4.4454       0.38
  3     30.0     16.5750     3.4338       3.91
  4     13.0      3.4458     1.7786       5.37
  5      9.0      0.5895     0.7609      11.05
  6      8.0      0.0858     0.2924      27.06
  7      3.0      0.0109     0.1042      28.67
  8      2.0      0.0012     0.0349      57.21
  9      1.0      0.0001     0.0111      90.14
 10      1.0      0.0000     0.0034     298.08

          LEVENE F-TEST FOR SHIFT IN VARIATION
(ASSUMPTION: NORMALITY)

1. STATISTICS RUNS DOWN


NUMBER OF OBSERVATIONS = 200
NUMBER OF GROUPS = 4 STATISTIC = NUMBER OF RUNS DOWN
LEVENE F TEST STATISTIC = 0.9378599E-01 OF LENGTH EXACTLY I

I STAT EXP(STAT) SD(STAT) Z


FOR LEVENE TEST STATISTIC
0 % POINT = 0.0000000E+00 1 69.0 104.2083 10.2792 -3.43
50 % POINT = 0.7914120 2 32.0 45.7167 5.2996 -2.59
75 % POINT = 1.380357 3 11.0 13.1292 3.2297 -0.66
90 % POINT = 2.111936 4 6.0 2.8563 1.6351 1.92
95 % POINT = 2.650676 5 5.0 0.5037 0.7045 6.38
99 % POINT = 3.883083 6 2.0 0.0749 0.2733 7.04
99.9 % POINT = 5.638597 7 2.0 0.0097 0.0982 20.26
8 0.0 0.0011 0.0331 -0.03
9 0.0 0.0001 0.0106 -0.01
3.659895 % Point: 0.9378599E-01 10 0.0 0.0000 0.0032 0.00
3. CONCLUSION (AT THE 5% LEVEL):
THERE IS NO SHIFT IN VARIATION. STATISTIC = NUMBER OF RUNS DOWN
THUS: HOMOGENEOUS WITH RESPECT TO VARIATION. OF LENGTH I OR MORE
In this case, the Levene test indicates that the standard deviations are not significantly different
in the 4 intervals since the test statistic of 0.094 is less than the 95% critical value of 2.65.
Therefore we conclude that there is no significant change in scale.

  I     STAT    EXP(STAT)   SD(STAT)       Z
1 127.0 166.5000 6.6546 -5.94
Randomness A runs test is used to check for randomness 2 58.0 62.2917 4.4454 -0.97
3 26.0 16.5750 3.4338 2.74
4 15.0 3.4458 1.7786 6.50
5 9.0 0.5895 0.7609 11.05
RUNS UP 6 4.0 0.0858 0.2924 13.38
7 2.0 0.0109 0.1042 19.08
STATISTIC = NUMBER OF RUNS UP 8 0.0 0.0012 0.0349 -0.03
OF LENGTH EXACTLY I 9 0.0 0.0001 0.0111 -0.01
10 0.0 0.0000 0.0034 0.00
I STAT EXP(STAT) SD(STAT) Z

1 63.0 104.2083 10.2792 -4.01 RUNS TOTAL = RUNS UP + RUNS DOWN


2 34.0 45.7167 5.2996 -2.21
3 17.0 13.1292 3.2297 1.20 STATISTIC = NUMBER OF RUNS TOTAL
4 4.0 2.8563 1.6351 0.70 OF LENGTH EXACTLY I
5 1.0 0.5037 0.7045 0.70
6 5.0 0.0749 0.2733 18.02 I STAT EXP(STAT) SD(STAT) Z
7 1.0 0.0097 0.0982 10.08
8 1.0 0.0011 0.0331 30.15 1 132.0 208.4167 14.5370 -5.26
9 0.0 0.0001 0.0106 -0.01 2 66.0 91.4333 7.4947 -3.39
10 1.0 0.0000 0.0032 311.40 3 28.0 26.2583 4.5674 0.38
4 10.0 5.7127 2.3123 1.85
5 6.0 1.0074 0.9963 5.01
STATISTIC = NUMBER OF RUNS UP 6 7.0 0.1498 0.3866 17.72
OF LENGTH I OR MORE 7 3.0 0.0193 0.1389 21.46



8 1.0 0.0022 0.0468 21.30
9 0.0 0.0002 0.0150 -0.01
10 1.0 0.0000 0.0045 220.19

STATISTIC = NUMBER OF RUNS TOTAL


OF LENGTH I OR MORE
1. Exploratory Data Analysis
I STAT EXP(STAT) SD(STAT) Z 1.4. EDA Case Studies
1.4.2. Case Studies
1 254.0 333.0000 9.4110 -8.39
2 122.0 124.5833 6.2868 -0.41
1.4.2.5. Beam Deflections
3 56.0 33.1500 4.8561 4.71
4 28.0 6.8917 2.5154 8.39
5
6
18.0
12.0
1.1790
0.1716
1.0761
0.4136
15.63
                                             28.60
  7      5.0      0.0217     0.1474      33.77
  8      2.0      0.0024     0.0494      40.43
  9      1.0      0.0002     0.0157      63.73
 10      1.0      0.0000     0.0047     210.77

        LENGTH OF THE LONGEST RUN UP         =  10
        LENGTH OF THE LONGEST RUN DOWN       =   7
        LENGTH OF THE LONGEST RUN UP OR DOWN =  10

        NUMBER OF POSITIVE DIFFERENCES =  258
        NUMBER OF NEGATIVE DIFFERENCES =  241
        NUMBER OF ZERO DIFFERENCES     =    0

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically significant
at the 5% level. Numerous values in this column are much larger than +/-1.96, so we conclude
that the data are not random.

Distributional   Since the quantitative tests show that the assumptions of constant scale and non-randomness are
Assumptions      not met, the distributional measures will not be meaningful. Therefore these quantitative tests
                 are omitted.


1.4.2.5.3. Develop a Better Model

Sinusoidal    The lag plot and autocorrelation plot in the previous section strongly suggested a
Model         sinusoidal model might be appropriate. The basic sinusoidal model is:

                  Y[i] = C + alpha*SIN(2*PI*omega*T[i] + phi) + E[i]

              where C is a constant defining a mean level, alpha is an amplitude for the sine
              function, omega is the frequency, T[i] is a time variable, and phi is the phase. This
              sinusoidal model can be fit using non-linear least squares.

              To obtain a good fit, sinusoidal models require good starting values for C, the
              amplitude, and the frequency.

Good Starting   A good starting value for C can be obtained by calculating the mean of the data.
Value for C     If the data show a trend, i.e., the assumption of constant location is violated, we
                can replace C with a linear or quadratic least squares fit. That is, the model
                becomes

                    Y[i] = (B0 + B1*T[i]) + alpha*SIN(2*PI*omega*T[i] + phi) + E[i]

                or

                    Y[i] = (B0 + B1*T[i] + B2*T[i]**2) + alpha*SIN(2*PI*omega*T[i] + phi) + E[i]

                Since our data did not have any meaningful change of location, we can fit the
                simpler model with C equal to the mean. From the summary output in the
                previous page, the mean is -177.44.

Good Starting The starting value for the frequency can be obtained from the spectral plot,
Value for which shows the dominant frequency is about 0.3.
Frequency




Complex The complex demodulation phase plot can be used to refine this initial estimate Complex
Demodulation for the frequency. Demodulation
Phase Plot Amplitude
For the complex demodulation plot, if the lines slope from left to right, the Plot
frequency should be increased. If the lines slope from right to left, it should be
decreased. A relatively flat (i.e., horizontal) slope indicates a good frequency.
We could generate the demodulation phase plot for 0.3 and then use trial and
error to obtain a better estimate for the frequency. To simplify this, we generate
16 of these plots on a single page starting with a frequency of 0.28, increasing in
increments of 0.0025, and stopping at 0.3175.

The complex demodulation amplitude plot for this data shows that:
1. The amplitude is fixed at approximately 390.
2. There is a short start-up effect.
3. There is a change in amplitude at around x=160 that should be
investigated for an outlier.
In terms of a non-linear model, the plot indicates that fitting a single constant for the
amplitude alpha should be adequate for this data set.

Fit Output Using starting estimates of 0.3025 for the frequency, 390 for the amplitude, and
-177.44 for C, Dataplot generated the following output for the fit.
Interpretation The plots start with lines sloping from left to right but gradually change to a right
to left slope. The relatively flat slope occurs for frequency 0.3025 (third row,
second column). The complex demodulation phase plot restricts the range from
LEAST SQUARES NON-LINEAR FIT
to . This is why the plot appears to show some breaks. SAMPLE SIZE N = 200
MODEL--Y =C + AMP*SIN(2*3.14159*FREQ*T + PHASE)
NO REPLICATION CASE
Good Starting The complex demodulation amplitude plot is used to find a good starting value
Values for for the amplitude. In addition, this plot indicates whether or not the amplitude is ITERATION CONVERGENCE
RESIDUAL * PARAMETER
Amplitude constant over the entire range of the data or if it varies. If the plot is essentially NUMBER MEASURE
STANDARD * ESTIMATES
DEVIATION *
flat, i.e., zero slope, then it is reasonable to assume a constant amplitude in the ----------------------------------*-----------
non-linear model. However, if the slope varies over the range of the plot, we 1-- 0.10000E-01 0.52903E+03 *-0.17743E+03 0.39000E+03 0.30250E+00 0.10000E+01
may need to adjust the model to be: 2-- 0.50000E-02 0.22218E+03 *-0.17876E+03-0.33137E+03 0.30238E+00 0.71471E+00
3-- 0.25000E-02 0.15634E+03 *-0.17886E+03-0.24523E+03 0.30233E+00 0.14022E+01
4-- 0.96108E-01 0.15585E+03 *-0.17879E+03-0.36177E+03 0.30260E+00 0.14654E+01
That is, we replace with a function of time. A linear fit is specified in the FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
model above, but this can be replaced with a more elaborate function if needed. 1 C -178.786 ( 11.02 ) -16.22
2 AMP -361.766 ( 26.19 ) -13.81




3 FREQ 0.302596 (0.1510E-03) 2005.


4 PHASE 1.46536 (0.4909E-01) 29.85

RESIDUAL STANDARD DEVIATION = 155.8484


RESIDUAL DEGREES OF FREEDOM = 196
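For readers working outside Dataplot, the same non-linear fit can be sketched with scipy
(y assumed to hold the 200 deflections; the starting values are the ones discussed above, with
the phase start of 1.0 taken from the first iteration of the output):

    import numpy as np
    from scipy.optimize import curve_fit

    def model(t, c, amp, freq, phase):
        return c + amp * np.sin(2 * np.pi * freq * t + phase)

    t = np.arange(1, len(y) + 1)
    p0 = [-177.44, 390.0, 0.3025, 1.0]       # starting values: C, amplitude, frequency, phase
    params, _ = curve_fit(model, t, y, p0=p0)

    resid = y - model(t, *params)
    resid_sd = np.sqrt(np.sum(resid**2) / (len(y) - 4))
    print("C, AMP, FREQ, PHASE =", params, " residual SD =", resid_sd)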
Model         From the fit output, our proposed model is:

                  Yhat[i] = -178.786 - 361.766*SIN(2*PI*0.302596*T[i] + 1.46536)

              We will evaluate the adequacy of this model in the next section.


1.4.2.5.4. Validate New Model

4-Plot of     The first step in evaluating the fit is to generate a 4-plot of the
Residuals     residuals.

Interpretation The assumptions are addressed by the graphics shown above:


1. The run sequence plot (upper left) indicates that the data do not
have any significant shifts in location. There does seem to be
some shifts in scale. A start-up effect was detected previously by
the complex demodulation amplitude plot. There does appear to
be a few outliers.
2. The lag plot (upper right) shows that the data are random. The
outliers also appear in the lag plot.
3. The histogram (lower left) and the normal probability plot
(lower right) do not show any serious non-normality in the
residuals. However, the bend in the left portion of the normal
probability plot shows some cause for concern.
The 4-plot indicates that this fit is reasonably good. However, we will
attempt to improve the fit by removing the outliers.




Fit Output Dataplot generated the following fit output after removing 3 outliers.
with Outliers
Removed

LEAST SQUARES NON-LINEAR FIT


SAMPLE SIZE N = 197
MODEL--Y =C + AMP*SIN(2*3.14159*FREQ*T + PHASE)
NO REPLICATION CASE

ITERATION CONVERGENCE
RESIDUAL * PARAMETER
NUMBER MEASURE
STANDARD * ESTIMATES
DEVIATION *
----------------------------------*-----------
1-- 0.10000E-01 0.14834E+03 *-0.17879E+03-0.36177E+03 0.30260E+00 0.14654E+01

2-- 0.37409E+02 0.14834E+03 *-0.17879E+03-0.36176E+03 0.30260E+00 0.14653E+01

FINAL PARAMETER ESTIMATES           (APPROX. ST. DEV.)    T VALUE
  1  C         -178.788               ( 10.57    )        -16.91
  2  AMP       -361.759               ( 25.45    )        -14.22
  3  FREQ        0.302597             (0.1457E-03)         2077.
  4  PHASE       1.46533              (0.4715E-01)         31.08

RESIDUAL  STANDARD DEVIATION  =  148.3398
RESIDUAL  DEGREES OF FREEDOM  =  193

New           The original fit, with a residual standard deviation of 155.84, was:
Fit to
Edited            Yhat[i] = -178.786 - 361.766*SIN(2*PI*0.302596*T[i] + 1.46536)
Data
              The new fit, with a residual standard deviation of 148.34, is:

                  Yhat[i] = -178.788 - 361.759*SIN(2*PI*0.302597*T[i] + 1.46533)

              There is minimal change in the parameter estimates and about a 5% reduction in
              the residual standard deviation. In this case, removing the outliers has a modest
              benefit in terms of reducing the variability of the model.

4-Plot
for
New
Fit

              This plot shows that the underlying assumptions are satisfied and therefore the
              new fit is a good descriptor of the data.

              In this case, it is a judgment call whether to use the fit with or without the
              outliers removed.



low frequency peak.

5. Generate a spectral plot. 6. The summary statistics table displays


25+ statistics.
1. Exploratory Data Analysis
1.4. EDA Case Studies
7. The linear fit indicates no drift in
1.4.2. Case Studies
6. Generate a table of summary location since the slope parameter
1.4.2.5. Beam Deflections
statistics. is not statistically significant.

1.4.2.5.5. Work This Example Yourself 8. Levene's test indicates no


7. Generate a linear fit to detect significant drift in variation.
View This page allows you to repeat the analysis outlined in the case study drift in location.
Dataplot description on the previous page using Dataplot . It is required that you
Macro for have already downloaded and installed Dataplot and configured your
this Case browser to run Dataplot. Output from each analysis step below will be
Study displayed in one or more of the Dataplot windows. The four main 8. Detect drift in variation by 9. The runs test indicates significant
windows are the Output window, the Graphics window, the Command non-randomness.
dividing the data into quarters and
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the computing Levene's test statistic for
bottom is a command entry window where commands can be typed in. equal standard deviations.

Data Analysis Steps Results and Conclusions


9. Check for randomness by generating
a runs test.
Click on the links below to start Dataplot and run this case
study yourself. Each step may use results from previous steps, The links in this column will connect you with more detailed
so please be patient. Wait until the software verifies that the information about each analysis step from the case study description.
current step is complete before clicking on the next step.

3. Fit
Yi = C + A*SIN(2*PI*omega*ti+phi).
1. Invoke Dataplot and read data.
1. Complex demodulation phase plot
1. Generate a complex demodulation indicates a starting frequency
1. Read in the data. 1. You have read 1 column of numbers
phase plot. of 0.3025.
into Dataplot, variable Y.

2. Generate a complex demodulation 2. Complex demodulation amplitude


2. Validate assumptions. amplitude plot. plot indicates an amplitude of
390 (but there is a short start-up
1. 4-plot of Y. 1. Based on the 4-plot, there are no effect).
obvious shifts in location and scale,
but the data are not random.
3. Fit the non-linear model. 3. Non-linear fit generates final
parameter estimates. The
2. Based on the run sequence plot, there
2. Generate a run sequence plot. residual standard deviation from
are no obvious shifts in location and
the fit is 155.85 (compared to the
scale.
standard deviation of 277.73 from
the original data).
3. Based on the lag plot, the data
are not random.
3. Generate a lag plot.
4. The autocorrelation plot shows
significant autocorrelation at lag 1.
4. Generate an autocorrelation plot.
5. The spectral plot shows a single dominant

http://www.itl.nist.gov/div898/handbook/eda/section4/eda4255.htm (1 of 3) [11/13/2003 5:33:26 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4255.htm (2 of 3) [11/13/2003 5:33:26 PM]


4. Validate fit.

   1. Generate a 4-plot of the residuals from the fit.
      Result: The 4-plot indicates that the assumptions of constant
      location and scale are valid. The lag plot indicates that the data
      are random. The histogram and normal probability plot indicate that
      the normality assumption for the residuals is not seriously
      violated, although there is a bend on the probability plot that
      warrants attention.

   2. Generate a nonlinear fit with outliers removed.
      Result: The fit after removing 3 outliers shows some marginal
      improvement in the model (a 5% reduction in the residual standard
      deviation).

   3. Generate a 4-plot of the residuals from the fit with the outliers
      removed.
      Result: The 4-plot of the model fit after 3 outliers are removed
      shows marginal improvement in satisfying the model assumptions.


1.4.2.6. Filter Transmittance

Filter Transmittance

This example illustrates the univariate analysis of filter transmittance data.

   1. Background and Data
   2. Graphical Output and Interpretation
   3. Quantitative Output and Interpretation
   4. Work This Example Yourself


1.4.2.6.1. Background and Data

Generation

This data set was collected by NIST chemist Radu Mavrodineaunu in
the 1970's from an automatic data acquisition system for a filter
transmittance experiment. The response variable is transmittance.

The motivation for studying this data set is to show how the underlying
autocorrelation structure in a relatively small data set helped the
scientist detect problems with his automatic data acquisition system.

This file can be read by Dataplot with the following commands:

      SKIP 25
      READ MAVRO.DAT Y

Resulting Data

The following are the data used for this case study.

2.00180
2.00170
2.00180
2.00190
2.00180
2.00170
2.00150
2.00140
2.00150
2.00150
2.00170
2.00180
2.00180
2.00190
2.00190
2.00210
2.00200
2.00160
2.00140
2.00130
2.00130
2.00150
2.00150
2.00160
2.00150
2.00140
2.00130
2.00140
2.00150
2.00140
2.00150
2.00160
2.00150
2.00160
2.00190
2.00200
2.00200
2.00210
2.00220
2.00230
2.00240
2.00250
2.00270
2.00260
2.00260
2.00260
2.00270
2.00260
2.00250
2.00240
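The Dataplot commands above skip the 25 header lines of MAVRO.DAT and read the
single column of transmittance values. A minimal Python equivalent, not part of
the original handbook and assuming the file is in the working directory, is:

      import numpy as np

      y = np.loadtxt("MAVRO.DAT", skiprows=25)   # same effect as SKIP 25 / READ MAVRO.DAT Y
      print(len(y), "observations; first value:", y[0])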


1.4.2.6.2. Graphical Output and Interpretation

Goal

The goal of this analysis is threefold:

   1. Determine if the univariate model
            Yi = C + Ei
      is appropriate and valid.
   2. Determine if the typical underlying assumptions for an "in control"
      measurement process are valid. These assumptions are:
         1. random drawings;
         2. from a fixed distribution;
         3. with the distribution having a fixed location; and
         4. the distribution having a fixed scale.
   3. Determine if the confidence interval
            Ybar +/- 2*s/sqrt(N)
      is appropriate and valid where s is the standard deviation of the
      original data.

4-Plot of Data

[4-plot of the transmittance data: run sequence plot, lag plot, histogram,
and normal probability plot]

Interpretation

The assumptions are addressed by the graphics shown above:

   1. The run sequence plot (upper left) indicates a significant shift in
      location around x=35.
   2. The linear appearance in the lag plot (upper right) indicates a
      non-random pattern in the data.
   3. Since the lag plot indicates significant non-randomness, we do not
      make any interpretation of either the histogram (lower left) or the
      normal probability plot (lower right).

The serious violation of the non-randomness assumption means that the
univariate model Yi = C + Ei is not valid. Given the linear appearance of
the lag plot, the first step might be to consider a model of the type

      Yi = A0 + A1*Y(i-1) + Ei

However, in this case discussions with the scientist revealed that
non-randomness was entirely unexpected. An examination of the
experimental process revealed that the sampling rate for the automatic
data acquisition system was too fast. That is, the equipment did not
have sufficient time to reset before the next sample started, resulting in
the current measurement being contaminated by the previous
measurement. The solution was to rerun the experiment allowing more
time between samples.
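The lag-one model suggested above can be examined quickly outside of Dataplot.
The following Python sketch, an illustration only and not part of the original
case study, simply regresses each observation on the previous one; it assumes
the data are read from MAVRO.DAT as in the Background and Data section.

      import numpy as np

      y = np.loadtxt("MAVRO.DAT", skiprows=25)
      y_prev, y_curr = y[:-1], y[1:]

      a1, a0 = np.polyfit(y_prev, y_curr, 1)    # least squares line Y(i) = A0 + A1*Y(i-1)
      r = np.corrcoef(y_prev, y_curr)[0, 1]
      print("A0 =", round(a0, 5), " A1 =", round(a1, 3), " lag-1 correlation =", round(r, 3))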


Simple graphical techniques can be quite effective in revealing
unexpected results in the data. When this occurs, it is important to
investigate whether the unexpected result is due to problems in the
experiment and data collection or is indicative of unexpected
underlying structure in the data. This determination cannot be made on
the basis of statistics alone. The role of the graphical and statistical
analysis is to detect problems or unexpected results in the data.
Resolving the issues requires the knowledge of the scientist or
engineer.

Individual Plots

Although it is generally unnecessary, the plots can be generated
individually to give more detail. Since the lag plot indicates significant
non-randomness, we omit the distributional plots.

Run Sequence Plot

[run sequence plot of the transmittance data]

Lag Plot

[lag plot of the transmittance data]


1.4.2.6.3. Quantitative Output and Interpretation

Summary Statistics

As a first step in the analysis, a table of summary statistics is computed from the data. The
following table, generated by Dataplot, shows a typical set of statistics.

                               SUMMARY

                    NUMBER OF OBSERVATIONS =       50

***********************************************************************
*        LOCATION MEASURES        *        DISPERSION MEASURES        *
***********************************************************************
*  MIDRANGE     =  0.2002000E+01  *  RANGE        =  0.1399994E-02    *
*  MEAN         =  0.2001856E+01  *  STAND. DEV.  =  0.4291329E-03    *
*  MIDMEAN      =  0.2001638E+01  *  AV. AB. DEV. =  0.3480196E-03    *
*  MEDIAN       =  0.2001800E+01  *  MINIMUM      =  0.2001300E+01    *
*               =                 *  LOWER QUART. =  0.2001500E+01    *
*               =                 *  LOWER HINGE  =  0.2001500E+01    *
*               =                 *  UPPER HINGE  =  0.2002100E+01    *
*               =                 *  UPPER QUART. =  0.2002175E+01    *
*               =                 *  MAXIMUM      =  0.2002700E+01    *
***********************************************************************
*       RANDOMNESS MEASURES       *      DISTRIBUTIONAL MEASURES      *
***********************************************************************
*  AUTOCO COEF  =  0.9379919E+00  *  ST. 3RD MOM. =  0.6191616E+00    *
*               =  0.0000000E+00  *  ST. 4TH MOM. =  0.2098746E+01    *
*               =  0.0000000E+00  *  ST. WILK-SHA = -0.4995516E+01    *
*               =                 *  UNIFORM PPCC =  0.9666610E+00    *
*               =                 *  NORMAL  PPCC =  0.9558001E+00    *
*               =                 *  TUK -.5 PPCC =  0.8462552E+00    *
*               =                 *  CAUCHY  PPCC =  0.6822084E+00    *
***********************************************************************

Location

One way to quantify a change in location over time is to fit a straight line to the data set using
the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no
significant drift in the location, the slope parameter should be zero. For this data set, Dataplot
generates the following output:

      LEAST SQUARES MULTILINEAR FIT
      SAMPLE SIZE N       =       50
      NUMBER OF VARIABLES =        1
      NO REPLICATION CASE

             PARAMETER ESTIMATES           (APPROX. ST. DEV.)   T VALUE
      1  A0                  2.00138       (0.9695E-04)        0.2064E+05
      2  A1       X          0.184685E-04  (0.3309E-05)         5.582

      RESIDUAL    STANDARD DEVIATION =  0.3376404E-03
      RESIDUAL    DEGREES OF FREEDOM =  48

The slope parameter, A1, has a t value of 5.6, which is statistically significant. The value of the
slope parameter is 0.0000185. Although this number is nearly zero, we need to take into
account that the original scale of the data is from about 2.0012 to 2.0028. In this case, we
conclude that there is a drift in location, although by a relatively minor amount.

Variation

One simple way to detect a change in variation is with a Bartlett test after dividing the data set
into several equal-sized intervals. However, the Bartlett test is not robust for non-normality.
Since the normality assumption is questionable for these data, we use the alternative Levene
test. In particular, we use the Levene test based on the median rather than the mean. The choice
of the number of intervals is somewhat arbitrary, although values of 4 or 8 are reasonable.
Dataplot generated the following output for the Levene test.

      LEVENE F-TEST FOR SHIFT IN VARIATION
      (ASSUMPTION: NORMALITY)

      1. STATISTICS
            NUMBER OF OBSERVATIONS    =       50
            NUMBER OF GROUPS          =        4
            LEVENE F TEST STATISTIC   =   0.9714893

         FOR LEVENE TEST STATISTIC
            0          % POINT    =   0.0000000E+00
            50         % POINT    =   0.8004835
            75         % POINT    =    1.416631
            90         % POINT    =    2.206890
            95         % POINT    =    2.806845
            99         % POINT    =    4.238307
            99.9       % POINT    =    6.424733

            58.56597   % Point:       0.9714893

      3. CONCLUSION (AT THE 5% LEVEL):
            THERE IS NO SHIFT IN VARIATION.
            THUS: HOMOGENEOUS WITH RESPECT TO VARIATION.

In this case, since the Levene test statistic value of 0.971 is less than the critical value of 2.806
at the 5% level, we conclude that there is no evidence of a change in variation.
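Both checks above, the straight-line fit for drift in location and the
median-based Levene test on quarters of the data, can be reproduced outside of
Dataplot. The following Python sketch is an illustration only (not part of the
handbook) and uses scipy; it should closely reproduce the results shown above.

      import numpy as np
      from scipy import stats

      y = np.loadtxt("MAVRO.DAT", skiprows=25)
      n = len(y)
      x = np.arange(1, n + 1)

      # Location: fit Y = A0 + A1*X and test whether the slope differs from zero.
      fit = stats.linregress(x, y)
      print("slope =", fit.slope, " t value =", fit.slope / fit.stderr)

      # Variation: median-based Levene test on four equal-sized quarters.
      quarters = np.array_split(y, 4)
      w, p = stats.levene(*quarters, center="median")
      print("Levene statistic =", round(w, 3), " p-value =", round(p, 3))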


Randomness

There are many ways in which data can be non-random. However, most common forms of
non-randomness can be detected with a few simple tests. The lag plot in the 4-plot in the
previous section is a simple graphical technique.

One check is an autocorrelation plot that shows the autocorrelations for various lags.
Confidence bands can be plotted at the 95% and 99% confidence levels. Points outside this
band indicate statistically significant values (lag 0 is always 1). Dataplot generated the
following autocorrelation plot.

[autocorrelation plot of the transmittance data]

The lag 1 autocorrelation, which is generally the one of most interest, is 0.93. The critical
values at the 5% level are -0.277 and 0.277. This indicates that the lag 1 autocorrelation is
statistically significant, so there is strong evidence of non-randomness.

A common test for randomness is the runs test.

      RUNS UP

      STATISTIC = NUMBER OF RUNS UP
          OF LENGTH EXACTLY I

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1          1.0       10.4583      3.2170      -2.94
      2          3.0        4.4667      1.6539      -0.89
      3          1.0        1.2542      0.9997      -0.25
      4          0.0        0.2671      0.5003      -0.53
      5          0.0        0.0461      0.2132      -0.22
      6          0.0        0.0067      0.0818      -0.08
      7          0.0        0.0008      0.0291      -0.03
      8          1.0        0.0001      0.0097      103.06
      9          0.0        0.0000      0.0031       0.00
     10          1.0        0.0000      0.0009     1087.63

      STATISTIC = NUMBER OF RUNS UP
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1          7.0       16.5000      2.0696      -4.59
      2          6.0        6.0417      1.3962      -0.03
      3          3.0        1.5750      1.0622       1.34
      4          2.0        0.3208      0.5433       3.09
      5          2.0        0.0538      0.2299       8.47
      6          2.0        0.0077      0.0874      22.79
      7          2.0        0.0010      0.0308      64.85
      8          2.0        0.0001      0.0102     195.70
      9          1.0        0.0000      0.0032     311.64
     10          1.0        0.0000      0.0010    1042.19

      RUNS DOWN

      STATISTIC = NUMBER OF RUNS DOWN
          OF LENGTH EXACTLY I

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1          3.0       10.4583      3.2170      -2.32
      2          0.0        4.4667      1.6539      -2.70
      3          3.0        1.2542      0.9997       1.75
      4          1.0        0.2671      0.5003       1.46
      5          1.0        0.0461      0.2132       4.47
      6          0.0        0.0067      0.0818      -0.08
      7          0.0        0.0008      0.0291      -0.03
      8          0.0        0.0001      0.0097      -0.01
      9          0.0        0.0000      0.0031       0.00
     10          0.0        0.0000      0.0009       0.00

      STATISTIC = NUMBER OF RUNS DOWN
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1          8.0       16.5000      2.0696      -4.11
      2          5.0        6.0417      1.3962      -0.75
      3          5.0        1.5750      1.0622       3.22
      4          2.0        0.3208      0.5433       3.09
      5          1.0        0.0538      0.2299       4.12
      6          0.0        0.0077      0.0874      -0.09
      7          0.0        0.0010      0.0308      -0.03
      8          0.0        0.0001      0.0102      -0.01
      9          0.0        0.0000      0.0032       0.00
     10          0.0        0.0000      0.0010       0.00

      RUNS TOTAL = RUNS UP + RUNS DOWN

      STATISTIC = NUMBER OF RUNS TOTAL
          OF LENGTH EXACTLY I

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1          4.0       20.9167      4.5496      -3.72
      2          3.0        8.9333      2.3389      -2.54
      3          4.0        2.5083      1.4138       1.06
      4          1.0        0.5341      0.7076       0.66
      5          1.0        0.0922      0.3015       3.01
      6          0.0        0.0134      0.1157      -0.12
      7          0.0        0.0017      0.0411      -0.04
      8          1.0        0.0002      0.0137      72.86
      9          0.0        0.0000      0.0043       0.00
     10          1.0        0.0000      0.0013     769.07

      STATISTIC = NUMBER OF RUNS TOTAL
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1         15.0       33.0000      2.9269      -6.15
      2         11.0       12.0833      1.9745      -0.55
      3          8.0        3.1500      1.5022       3.23
      4          4.0        0.6417      0.7684       4.37
      5          3.0        0.1075      0.3251       8.90
      6          2.0        0.0153      0.1236      16.05
      7          2.0        0.0019      0.0436      45.83
      8          2.0        0.0002      0.0145     138.37
      9          1.0        0.0000      0.0045     220.36
     10          1.0        0.0000      0.0014     736.94

      LENGTH OF THE LONGEST RUN UP         =   10
      LENGTH OF THE LONGEST RUN DOWN       =    5
      LENGTH OF THE LONGEST RUN UP OR DOWN =   10

      NUMBER OF POSITIVE DIFFERENCES =   23
      NUMBER OF NEGATIVE DIFFERENCES =   18
      NUMBER OF ZERO DIFFERENCES     =    8

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically significant
at the 5% level. Due to the number of values that are much larger than the 1.96 cut-off, we
conclude that the data are not random.

Distributional Analysis

Since we rejected the randomness assumption, the distributional tests are not meaningful.
Therefore, these quantitative tests are omitted. We also omit Grubbs' outlier test since it also
assumes the data are approximately normally distributed.

Univariate Report

It is sometimes useful and convenient to summarize the above results in a report.

      Analysis for filter transmittance data

      1: Sample Size                           = 50

      2: Location
         Mean                                  = 2.001857
         Standard Deviation of Mean            = 0.00006
         95% Confidence Interval for Mean      = (2.001735,2.001979)
         Drift with respect to location?       = NO

      3: Variation
         Standard Deviation                    = 0.00043
         95% Confidence Interval for SD        = (0.000359,0.000535)
         Change in variation?
         (based on Levene's test on quarters
          of the data)                         = NO

      4: Distribution
         Distributional tests omitted due to
         non-randomness of the data

      5: Randomness
         Lag One Autocorrelation               = 0.937998
         Data are Random?
         (as measured by autocorrelation)      = NO

      6: Statistical Control
         (i.e., no drift in location or scale,
         data are random, distribution is
         fixed, here we are testing only for
         normal)
         Data Set is in Statistical Control?   = NO

      7: Outliers?
         (Grubbs' test omitted)                = NO
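The 95% intervals quoted in items 2 and 3 of the report are the usual t interval
for the mean and chi-square interval for the standard deviation. As a sketch
only, not part of the handbook, they can be computed in Python as follows.

      import numpy as np
      from scipy import stats

      y = np.loadtxt("MAVRO.DAT", skiprows=25)
      n = len(y)
      mean, sd = y.mean(), y.std(ddof=1)

      t = stats.t.ppf(0.975, df=n - 1)
      mean_ci = (mean - t * sd / np.sqrt(n), mean + t * sd / np.sqrt(n))

      chi2_hi = stats.chi2.ppf(0.975, df=n - 1)
      chi2_lo = stats.chi2.ppf(0.025, df=n - 1)
      sd_ci = (sd * np.sqrt((n - 1) / chi2_hi), sd * np.sqrt((n - 1) / chi2_lo))

      print("mean:", mean, " 95% CI:", mean_ci)
      print("sd:  ", sd, " 95% CI:", sd_ci)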


1.4.2.6.4. Work This Example Yourself

View Dataplot Macro for this Case Study

This page allows you to repeat the analysis outlined in the case study
description on the previous page using Dataplot. It is required that you
have already downloaded and installed Dataplot and configured your
browser to run Dataplot. Output from each analysis step below will be
displayed in one or more of the Dataplot windows. The four main
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps / Results and Conclusions

Click on the links below to start Dataplot and run this case study
yourself. Each step may use results from previous steps, so please
be patient. Wait until the software verifies that the current step is
complete before clicking on the next step. The links in the Results and
Conclusions entries will connect you with more detailed information
about each analysis step from the case study description.

1. Invoke Dataplot and read data.

   1. Read in the data.
      Result: You have read 1 column of numbers into Dataplot, variable Y.

2. 4-plot of the data.

   1. 4-plot of Y.
      Result: Based on the 4-plot, there is a shift in location and the
      data are not random.

3. Generate the individual plots.

   1. Generate a run sequence plot.
      Result: The run sequence plot indicates that there is a shift in
      location.

   2. Generate a lag plot.
      Result: The strong linear pattern of the lag plot indicates
      significant non-randomness.

4. Generate summary statistics, quantitative analysis, and print a
   univariate report.

   1. Generate a table of summary statistics.
      Result: The summary statistics table displays 25+ statistics.

   2. Compute a linear fit based on quarters of the data to detect drift
      in location.
      Result: The linear fit indicates a slight drift in location since
      the slope parameter is statistically significant, but small.

   3. Compute Levene's test based on quarters of the data to detect
      changes in variation.
      Result: Levene's test indicates no significant drift in variation.

   4. Check for randomness by generating an autocorrelation plot and a
      runs test.
      Result: The lag 1 autocorrelation is 0.94. This is outside the 95%
      confidence interval bands, which indicates significant
      non-randomness.

   5. Print a univariate report (this assumes steps 2 thru 4 have already
      been run).
      Result: The results are summarized in a convenient report.


1.4.2.7. Standard Resistor

Standard Resistor

This example illustrates the univariate analysis of standard resistor data.

   1. Background and Data
   2. Graphical Output and Interpretation
   3. Quantitative Output and Interpretation
   4. Work This Example Yourself


1.4.2.7.1. Background and Data

Generation

This data set was collected by Ron Dziuba of NIST over a 5-year period
from 1980 to 1985. The response variable is resistor values.

The motivation for studying this data set is to illustrate data that violate
the assumptions of constant location and scale.

This file can be read by Dataplot with the following commands:

      SKIP 25
      COLUMN LIMITS 10 80
      READ DZIUBA1.DAT Y
      COLUMN LIMITS
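The COLUMN LIMITS 10 80 command restricts the read to character columns 10
through 80 of each record. A rough Python equivalent, an illustration only and
based on the assumption that the file keeps that fixed-width layout, slices each
line the same way after skipping the 25 header lines.

      # Sketch only: read the resistor values from character columns 10-80.
      with open("DZIUBA1.DAT") as f:
          lines = f.readlines()[25:]
      y = [float(line[9:80]) for line in lines if line[9:80].strip()]
      print(len(y), "observations; first value:", y[0])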

Resulting Data

The following are the data used for this case study.
27.8680
27.8929
27.8773
27.8530
27.8876
27.8725
27.8743
27.8879
27.8728
27.8746
27.8863
27.8716
27.8818
27.8872
27.8885
27.8945
27.8797
27.8627
27.8870

http://www.itl.nist.gov/div898/handbook/eda/section4/eda427.htm [11/13/2003 5:33:28 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4271.htm (1 of 23) [11/13/2003 5:33:28 PM]


1.4.2.7.1. Background and Data 1.4.2.7.1. Background and Data

27.8895 27.8953
27.9138 27.8970
27.8931 27.9190
27.8852 27.9180
27.8788 27.8997
27.8827 27.9204
27.8939 27.9234
27.8558 27.9072
27.8814 27.9152
27.8479 27.9091
27.8479 27.8882
27.8848 27.9035
27.8809 27.9267
27.8479 27.9138
27.8611 27.8955
27.8630 27.9203
27.8679 27.9239
27.8637 27.9199
27.8985 27.9646
27.8900 27.9411
27.8577 27.9345
27.8848 27.8712
27.8869 27.9145
27.8976 27.9259
27.8610 27.9317
27.8567 27.9239
27.8417 27.9247
27.8280 27.9150
27.8555 27.9444
27.8639 27.9457
27.8702 27.9166
27.8582 27.9066
27.8605 27.9088
27.8900 27.9255
27.8758 27.9312
27.8774 27.9439
27.9008 27.9210
27.8988 27.9102
27.8897 27.9083
27.8990 27.9121
27.8958 27.9113
27.8830 27.9091
27.8967 27.9235
27.9105 27.9291
27.9028 27.9253
27.8977 27.9092


27.9117 27.9368
27.9194 27.9403
27.9039 27.9529
27.9515 27.9263
27.9143 27.9347
27.9124 27.9371
27.9128 27.9129
27.9260 27.9549
27.9339 27.9422
27.9500 27.9423
27.9530 27.9750
27.9430 27.9339
27.9400 27.9629
27.8850 27.9587
27.9350 27.9503
27.9120 27.9573
27.9260 27.9518
27.9660 27.9527
27.9280 27.9589
27.9450 27.9300
27.9390 27.9629
27.9429 27.9630
27.9207 27.9660
27.9205 27.9730
27.9204 27.9660
27.9198 27.9630
27.9246 27.9570
27.9366 27.9650
27.9234 27.9520
27.9125 27.9820
27.9032 27.9560
27.9285 27.9670
27.9561 27.9520
27.9616 27.9470
27.9530 27.9720
27.9280 27.9610
27.9060 27.9437
27.9380 27.9660
27.9310 27.9580
27.9347 27.9660
27.9339 27.9700
27.9410 27.9600
27.9397 27.9660
27.9472 27.9770
27.9235 27.9110
27.9315 27.9690


27.9698 27.9836
27.9616 28.0030
27.9371 27.9678
27.9700 28.0146
27.9265 27.9945
27.9964 27.9805
27.9842 27.9785
27.9667 27.9791
27.9610 27.9817
27.9943 27.9805
27.9616 27.9782
27.9397 27.9753
27.9799 27.9792
28.0086 27.9704
27.9709 27.9794
27.9741 27.9814
27.9675 27.9794
27.9826 27.9795
27.9676 27.9881
27.9703 27.9772
27.9789 27.9796
27.9786 27.9736
27.9722 27.9772
27.9831 27.9960
28.0043 27.9795
27.9548 27.9779
27.9875 27.9829
27.9495 27.9829
27.9549 27.9815
27.9469 27.9811
27.9744 27.9773
27.9744 27.9778
27.9449 27.9724
27.9837 27.9756
27.9585 27.9699
28.0096 27.9724
27.9762 27.9666
27.9641 27.9666
27.9854 27.9739
27.9877 27.9684
27.9839 27.9861
27.9817 27.9901
27.9845 27.9879
27.9877 27.9865
27.9880 27.9876
27.9822 27.9814


27.9842 28.0065
27.9868 27.9959
27.9834 28.0073
27.9892 28.0017
27.9864 28.0042
27.9843 28.0036
27.9838 28.0055
27.9847 28.0007
27.9860 28.0066
27.9872 28.0011
27.9869 27.9960
27.9602 28.0083
27.9852 27.9978
27.9860 28.0108
27.9836 28.0088
27.9813 28.0088
27.9623 28.0139
27.9843 28.0092
27.9802 28.0092
27.9863 28.0049
27.9813 28.0111
27.9881 28.0120
27.9850 28.0093
27.9850 28.0116
27.9830 28.0102
27.9866 28.0139
27.9888 28.0113
27.9841 28.0158
27.9863 28.0156
27.9903 28.0137
27.9961 28.0236
27.9905 28.0171
27.9945 28.0224
27.9878 28.0184
27.9929 28.0199
27.9914 28.0190
27.9914 28.0204
27.9997 28.0170
28.0006 28.0183
27.9999 28.0201
28.0004 28.0182
28.0020 28.0183
28.0029 28.0175
28.0008 28.0127
28.0040 28.0211
28.0078 28.0057


28.0180 28.0169
28.0183 28.0105
28.0149 28.0136
28.0185 28.0138
28.0182 28.0114
28.0192 28.0122
28.0213 28.0122
28.0216 28.0116
28.0169 28.0025
28.0162 28.0097
28.0167 28.0066
28.0167 28.0072
28.0169 28.0066
28.0169 28.0068
28.0161 28.0067
28.0152 28.0130
28.0179 28.0091
28.0215 28.0088
28.0194 28.0091
28.0115 28.0091
28.0174 28.0115
28.0178 28.0087
28.0202 28.0128
28.0240 28.0139
28.0198 28.0095
28.0194 28.0115
28.0171 28.0101
28.0134 28.0121
28.0121 28.0114
28.0121 28.0121
28.0141 28.0122
28.0101 28.0121
28.0114 28.0168
28.0122 28.0212
28.0124 28.0219
28.0171 28.0221
28.0165 28.0204
28.0166 28.0169
28.0159 28.0141
28.0181 28.0142
28.0200 28.0147
28.0116 28.0159
28.0144 28.0165
28.0141 28.0144
28.0116 28.0182
28.0107 28.0155


28.0155 28.0390
28.0192 28.0376
28.0204 28.0376
28.0185 28.0377
28.0248 28.0345
28.0185 28.0333
28.0226 28.0429
28.0271 28.0379
28.0290 28.0401
28.0240 28.0401
28.0302 28.0423
28.0243 28.0393
28.0288 28.0382
28.0287 28.0424
28.0301 28.0386
28.0273 28.0386
28.0313 28.0373
28.0293 28.0397
28.0300 28.0412
28.0344 28.0565
28.0308 28.0419
28.0291 28.0456
28.0287 28.0426
28.0358 28.0423
28.0309 28.0391
28.0286 28.0403
28.0308 28.0388
28.0291 28.0408
28.0380 28.0457
28.0411 28.0455
28.0420 28.0460
28.0359 28.0456
28.0368 28.0464
28.0327 28.0442
28.0361 28.0416
28.0334 28.0451
28.0300 28.0432
28.0347 28.0434
28.0359 28.0448
28.0344 28.0448
28.0370 28.0373
28.0355 28.0429
28.0371 28.0392
28.0318 28.0469
28.0390 28.0443
28.0390 28.0356


28.0474 28.0486
28.0446 28.0427
28.0348 28.0548
28.0368 28.0616
28.0418 28.0298
28.0445 28.0726
28.0533 28.0695
28.0439 28.0629
28.0474 28.0503
28.0435 28.0493
28.0419 28.0537
28.0538 28.0613
28.0538 28.0643
28.0463 28.0678
28.0491 28.0564
28.0441 28.0703
28.0411 28.0647
28.0507 28.0579
28.0459 28.0630
28.0519 28.0716
28.0554 28.0586
28.0512 28.0607
28.0507 28.0601
28.0582 28.0611
28.0471 28.0606
28.0539 28.0611
28.0530 28.0066
28.0502 28.0412
28.0422 28.0558
28.0431 28.0590
28.0395 28.0750
28.0177 28.0483
28.0425 28.0599
28.0484 28.0490
28.0693 28.0499
28.0490 28.0565
28.0453 28.0612
28.0494 28.0634
28.0522 28.0627
28.0393 28.0519
28.0443 28.0551
28.0465 28.0696
28.0450 28.0581
28.0539 28.0568
28.0566 28.0572
28.0585 28.0529


28.0421 28.0684
28.0432 28.0646
28.0211 28.0590
28.0363 28.0465
28.0436 28.0594
28.0619 28.0303
28.0573 28.0533
28.0499 28.0561
28.0340 28.0585
28.0474 28.0497
28.0534 28.0582
28.0589 28.0507
28.0466 28.0562
28.0448 28.0715
28.0576 28.0468
28.0558 28.0411
28.0522 28.0587
28.0480 28.0456
28.0444 28.0705
28.0429 28.0534
28.0624 28.0558
28.0610 28.0536
28.0461 28.0552
28.0564 28.0461
28.0734 28.0598
28.0565 28.0598
28.0503 28.0650
28.0581 28.0423
28.0519 28.0442
28.0625 28.0449
28.0583 28.0660
28.0645 28.0506
28.0642 28.0655
28.0535 28.0512
28.0510 28.0407
28.0542 28.0475
28.0677 28.0411
28.0416 28.0512
28.0676 28.1036
28.0596 28.0641
28.0635 28.0572
28.0558 28.0700
28.0623 28.0577
28.0718 28.0637
28.0585 28.0534
28.0552 28.0461


28.0701 28.0682
28.0631 28.0756
28.0575 28.0857
28.0444 28.0739
28.0592 28.0840
28.0684 28.0862
28.0593 28.0724
28.0677 28.0727
28.0512 28.0752
28.0644 28.0732
28.0660 28.0703
28.0542 28.0849
28.0768 28.0795
28.0515 28.0902
28.0579 28.0874
28.0538 28.0971
28.0526 28.0638
28.0833 28.0877
28.0637 28.0751
28.0529 28.0904
28.0535 28.0971
28.0561 28.0661
28.0736 28.0711
28.0635 28.0754
28.0600 28.0516
28.0520 28.0961
28.0695 28.0689
28.0608 28.1110
28.0608 28.1062
28.0590 28.0726
28.0290 28.1141
28.0939 28.0913
28.0618 28.0982
28.0551 28.0703
28.0757 28.0654
28.0698 28.0760
28.0717 28.0727
28.0529 28.0850
28.0644 28.0877
28.0613 28.0967
28.0759 28.1185
28.0745 28.0945
28.0736 28.0834
28.0611 28.0764
28.0732 28.1129
28.0782 28.0797


28.0707 28.0885
28.1008 28.0940
28.0971 28.0856
28.0826 28.0849
28.0857 28.0955
28.0984 28.0955
28.0869 28.0846
28.0795 28.0871
28.0875 28.0872
28.1184 28.0917
28.0746 28.0931
28.0816 28.0865
28.0879 28.0900
28.0888 28.0915
28.0924 28.0963
28.0979 28.0917
28.0702 28.0950
28.0847 28.0898
28.0917 28.0902
28.0834 28.0867
28.0823 28.0843
28.0917 28.0939
28.0779 28.0902
28.0852 28.0911
28.0863 28.0909
28.0942 28.0949
28.0801 28.0867
28.0817 28.0932
28.0922 28.0891
28.0914 28.0932
28.0868 28.0887
28.0832 28.0925
28.0881 28.0928
28.0910 28.0883
28.0886 28.0946
28.0961 28.0977
28.0857 28.0914
28.0859 28.0959
28.1086 28.0926
28.0838 28.0923
28.0921 28.0950
28.0945 28.1006
28.0839 28.0924
28.0877 28.0963
28.0803 28.0893
28.0928 28.0956


28.0980 28.0981
28.0928 28.1045
28.0951 28.1047
28.0958 28.1042
28.0912 28.1146
28.0990 28.1113
28.0915 28.1051
28.0957 28.1065
28.0976 28.1065
28.0888 28.0985
28.0928 28.1000
28.0910 28.1066
28.0902 28.1041
28.0950 28.0954
28.0995 28.1090
28.0965
28.0972
28.0963
28.0946
28.0942
28.0998
28.0911
28.1043
28.1002
28.0991
28.0959
28.0996
28.0926
28.1002
28.0961
28.0983
28.0997
28.0959
28.0988
28.1029
28.0989
28.1000
28.0944
28.0979
28.1005
28.1012
28.1013
28.0999
28.0991
28.1059
28.0961

1.4.2.7.2. Graphical Output and Interpretation

Goal

The goal of this analysis is threefold:

   1. Determine if the univariate model
            Yi = C + Ei
      is appropriate and valid.
   2. Determine if the typical underlying assumptions for an "in control"
      measurement process are valid. These assumptions are:
         1. random drawings;
         2. from a fixed distribution;
         3. with the distribution having a fixed location; and
         4. the distribution having a fixed scale.
   3. Determine if the confidence interval
            Ybar +/- 2*s/sqrt(N)
      is appropriate and valid where s is the standard deviation of the
      original data.

4-Plot of Data

[4-plot of the resistor data: run sequence plot, lag plot, histogram, and
normal probability plot]

Interpretation

The assumptions are addressed by the graphics shown above:

   1. The run sequence plot (upper left) indicates significant shifts in
      both location and variation. Specifically, the location is
      increasing with time. The variability seems greater in the first
      and last third of the data than it does in the middle third.
   2. The lag plot (upper right) shows a significant non-random pattern
      in the data. Specifically, the strong linear appearance of this
      plot is indicative of a model that relates Yt to Yt-1.
   3. The distributional plots, the histogram (lower left) and the normal
      probability plot (lower right), are not interpreted since the
      randomness assumption is so clearly violated.

The serious violation of the non-randomness assumption means that the
univariate model Yi = C + Ei is not valid. Given the linear appearance of
the lag plot, the first step might be to consider a model of the type

      Yi = A0 + A1*Y(i-1) + Ei

However, discussions with the scientist revealed the following:

   1. the drift with respect to location was expected.
   2. the non-constant variability was not expected.

The scientist examined the data collection device and determined that
the non-constant variation was a seasonal effect.
The high variability data in the first and last thirds was collected in
winter while the more stable middle third was collected in the summer. The
seasonal effect was determined to be caused by the amount of humidity
affecting the measurement equipment. In this case, the solution was to
modify the test equipment to be less sensitive to environmental factors.

Simple graphical techniques can be quite effective in revealing unexpected
results in the data. When this occurs, it is important to investigate
whether the unexpected result is due to problems in the experiment and data
collection or is in fact indicative of an unexpected underlying structure
in the data. This determination cannot be made on the basis of statistics
alone. The role of the graphical and statistical analysis is to detect
problems or unexpected results in the data. Resolving the issues requires
the knowledge of the scientist or engineer.

Individual Plots

Although it is generally unnecessary, the plots can be generated
individually to give more detail. Since the lag plot indicates significant
non-randomness, we omit the distributional plots.

Run Sequence Plot

[run sequence plot of the resistor data]

Lag Plot

[lag plot of the resistor data]


1.4.2.7.3. Quantitative Output and Interpretation

Summary Statistics

As a first step in the analysis, a table of summary statistics is computed from the data. The
following table, generated by Dataplot, shows a typical set of statistics.

                               SUMMARY

                    NUMBER OF OBSERVATIONS =     1000

***********************************************************************
*        LOCATION MEASURES        *        DISPERSION MEASURES        *
***********************************************************************
*  MIDRANGE     =  0.2797325E+02  *  RANGE        =  0.2905006E+00    *
*  MEAN         =  0.2801634E+02  *  STAND. DEV.  =  0.6349404E-01    *
*  MIDMEAN      =  0.2802659E+02  *  AV. AB. DEV. =  0.5101655E-01    *
*  MEDIAN       =  0.2802910E+02  *  MINIMUM      =  0.2782800E+02    *
*               =                 *  LOWER QUART. =  0.2797905E+02    *
*               =                 *  LOWER HINGE  =  0.2797900E+02    *
*               =                 *  UPPER HINGE  =  0.2806295E+02    *
*               =                 *  UPPER QUART. =  0.2806293E+02    *
*               =                 *  MAXIMUM      =  0.2811850E+02    *
***********************************************************************
*       RANDOMNESS MEASURES       *      DISTRIBUTIONAL MEASURES      *
***********************************************************************
*  AUTOCO COEF  =  0.9721591E+00  *  ST. 3RD MOM. = -0.6936395E+00    *
*               =  0.0000000E+00  *  ST. 4TH MOM. =  0.2689681E+01    *
*               =  0.0000000E+00  *  ST. WILK-SHA = -0.4216419E+02    *
*               =                 *  UNIFORM PPCC =  0.9689648E+00    *
*               =                 *  NORMAL  PPCC =  0.9718416E+00    *
*               =                 *  TUK -.5 PPCC =  0.7334843E+00    *
*               =                 *  CAUCHY  PPCC =  0.3347875E+00    *
***********************************************************************

The autocorrelation coefficient of 0.972 is evidence of significant non-randomness.

Location

One way to quantify a change in location over time is to fit a straight line to the data set using
the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no
significant drift in the location, the slope parameter estimate should be zero. For this data set,
Dataplot generates the following output:

      LEAST SQUARES MULTILINEAR FIT
      SAMPLE SIZE N       =     1000
      NUMBER OF VARIABLES =        1
      NO REPLICATION CASE

             PARAMETER ESTIMATES           (APPROX. ST. DEV.)   T VALUE
      1  A0                 27.9114        (0.1209E-02)        0.2309E+05
      2  A1       X          0.209670E-03  (0.2092E-05)         100.2

      RESIDUAL    STANDARD DEVIATION =  0.1909796E-01
      RESIDUAL    DEGREES OF FREEDOM =  998

      COEF AND SD(COEF) WRITTEN OUT TO FILE DPST1F.DAT
      SD(PRED),95LOWER,95UPPER,99LOWER,99UPPER
                        WRITTEN OUT TO FILE DPST2F.DAT
      REGRESSION DIAGNOSTICS WRITTEN OUT TO FILE DPST3F.DAT
      PARAMETER VARIANCE-COVARIANCE MATRIX AND
      INVERSE OF X-TRANSPOSE X MATRIX
                        WRITTEN OUT TO FILE DPST4F.DAT

The slope parameter, A1, has a t value of 100 which is statistically significant. The value of the
slope parameter estimate is 0.00021. Although this number is nearly zero, we need to take into
account that the original scale of the data is from about 27.8 to 28.2. In this case, we conclude
that there is a drift in location.

Variation

One simple way to detect a change in variation is with a Bartlett test after dividing the data set
into several equal-sized intervals. However, the Bartlett test is not robust for non-normality.
Since the normality assumption is questionable for these data, we use the alternative Levene
test. In particular, we use the Levene test based on the median rather than the mean. The choice
of the number of intervals is somewhat arbitrary, although values of 4 or 8 are reasonable.
Dataplot generated the following output for the Levene test.

      LEVENE F-TEST FOR SHIFT IN VARIATION
      (ASSUMPTION: NORMALITY)

      1. STATISTICS
            NUMBER OF OBSERVATIONS    =     1000
            NUMBER OF GROUPS          =        4
            LEVENE F TEST STATISTIC   =   140.8509

         FOR LEVENE TEST STATISTIC
            0          % POINT    =   0.0000000E+00
            50         % POINT    =   0.7891988
            75         % POINT    =    1.371589
            90         % POINT    =    2.089303
            95         % POINT    =    2.613852
            99         % POINT    =    3.801369
            99.9       % POINT    =    5.463994

            100.0000   % Point:       140.8509
      3. CONCLUSION (AT THE 5% LEVEL):
            THERE IS A SHIFT IN VARIATION.
            THUS: NOT HOMOGENEOUS WITH RESPECT TO VARIATION.

In this case, since the Levene test statistic value of 140.9 is greater than the 5% significance
level critical value of 2.6, we conclude that there is significant evidence of nonconstant
variation.

Randomness

There are many ways in which data can be non-random. However, most common forms of
non-randomness can be detected with a few simple tests. The lag plot in the 4-plot in the
previous section is a simple graphical technique.

One check is an autocorrelation plot that shows the autocorrelations for various lags.
Confidence bands can be plotted at the 95% and 99% confidence levels. Points outside this
band indicate statistically significant values (lag 0 is always 1). Dataplot generated the
following autocorrelation plot.

[autocorrelation plot of the resistor data]

The lag 1 autocorrelation, which is generally the one of greatest interest, is 0.97. The critical
values at the 5% significance level are -0.062 and 0.062. This indicates that the lag 1
autocorrelation is statistically significant, so there is strong evidence of non-randomness.

A common test for randomness is the runs test.

      RUNS UP

      STATISTIC = NUMBER OF RUNS UP
          OF LENGTH EXACTLY I

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        178.0      208.3750     14.5453      -2.09
      2         90.0       91.5500      7.5002      -0.21
      3         29.0       26.3236      4.5727       0.59
      4         16.0        5.7333      2.3164       4.43
      5          2.0        1.0121      0.9987       0.99
      6          0.0        0.1507      0.3877      -0.39
      7          0.0        0.0194      0.1394      -0.14
      8          0.0        0.0022      0.0470      -0.05
      9          0.0        0.0002      0.0150      -0.02
     10          0.0        0.0000      0.0046       0.00

      STATISTIC = NUMBER OF RUNS UP
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        315.0      333.1667      9.4195      -1.93
      2        137.0      124.7917      6.2892       1.94
      3         47.0       33.2417      4.8619       2.83
      4         18.0        6.9181      2.5200       4.40
      5          2.0        1.1847      1.0787       0.76
      6          0.0        0.1726      0.4148      -0.42
      7          0.0        0.0219      0.1479      -0.15
      8          0.0        0.0025      0.0496      -0.05
      9          0.0        0.0002      0.0158      -0.02
     10          0.0        0.0000      0.0048       0.00

      RUNS DOWN

      STATISTIC = NUMBER OF RUNS DOWN
          OF LENGTH EXACTLY I

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        195.0      208.3750     14.5453      -0.92
      2         81.0       91.5500      7.5002      -1.41
      3         32.0       26.3236      4.5727       1.24
      4          4.0        5.7333      2.3164      -0.75
      5          1.0        1.0121      0.9987      -0.01
      6          1.0        0.1507      0.3877       2.19
      7          0.0        0.0194      0.1394      -0.14
      8          0.0        0.0022      0.0470      -0.05
      9          0.0        0.0002      0.0150      -0.02
     10          0.0        0.0000      0.0046       0.00

      STATISTIC = NUMBER OF RUNS DOWN
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        314.0      333.1667      9.4195      -2.03
      2        119.0      124.7917      6.2892      -0.92
      3         38.0       33.2417      4.8619       0.98
      4          6.0        6.9181      2.5200      -0.36
      5          2.0        1.1847      1.0787       0.76
      6          1.0        0.1726      0.4148       1.99
      7          0.0        0.0219      0.1479      -0.15
      8          0.0        0.0025      0.0496      -0.05
      9          0.0        0.0002      0.0158      -0.02
     10          0.0        0.0000      0.0048       0.00

      RUNS TOTAL = RUNS UP + RUNS DOWN

      STATISTIC = NUMBER OF RUNS TOTAL
          OF LENGTH EXACTLY I
      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        373.0      416.7500     20.5701      -2.13
      2        171.0      183.1000     10.6068      -1.14
      3         61.0       52.6472      6.4668       1.29
      4         20.0       11.4667      3.2759       2.60
      5          3.0        2.0243      1.4123       0.69
      6          1.0        0.3014      0.5483       1.27
      7          0.0        0.0389      0.1971      -0.20
      8          0.0        0.0044      0.0665      -0.07
      9          0.0        0.0005      0.0212      -0.02
     10          0.0        0.0000      0.0065      -0.01

      STATISTIC = NUMBER OF RUNS TOTAL
          OF LENGTH I OR MORE

      I         STAT     EXP(STAT)    SD(STAT)       Z

      1        629.0      666.3333     13.3212      -2.80
      2        256.0      249.5833      8.8942       0.72
      3         85.0       66.4833      6.8758       2.69
      4         24.0       13.8361      3.5639       2.85
      5          4.0        2.3694      1.5256       1.07
      6          1.0        0.3452      0.5866       1.12
      7          0.0        0.0438      0.2092      -0.21
      8          0.0        0.0049      0.0701      -0.07
      9          0.0        0.0005      0.0223      -0.02
     10          0.0        0.0000      0.0067      -0.01

      LENGTH OF THE LONGEST RUN UP         =    5
      LENGTH OF THE LONGEST RUN DOWN       =    6
      LENGTH OF THE LONGEST RUN UP OR DOWN =    6

      NUMBER OF POSITIVE DIFFERENCES =  505
      NUMBER OF NEGATIVE DIFFERENCES =  469
      NUMBER OF ZERO DIFFERENCES     =   25

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically significant
at the 5% level. Due to the number of values that are larger than the 1.96 cut-off, we conclude
that the data are not random. However, in this case the evidence from the runs test is not nearly
as strong as it is from the autocorrelation plot.

Distributional Analysis

Since we rejected the randomness assumption, the distributional tests are not meaningful.
Therefore, these quantitative tests are omitted. Since the Grubbs' test for outliers also assumes
the approximate normality of the data, we omit Grubbs' test as well.

Univariate Report

It is sometimes useful and convenient to summarize the above results in a report.

      Analysis for resistor case study

      1: Sample Size                           = 1000

      2: Location
         Mean                                  = 28.01635
         Standard Deviation of Mean            = 0.002008
         95% Confidence Interval for Mean      = (28.0124,28.02029)
         Drift with respect to location?       = NO

      3: Variation
         Standard Deviation                    = 0.063495
         95% Confidence Interval for SD        = (0.060829,0.066407)
         Change in variation?
         (based on Levene's test on quarters
          of the data)                         = YES

      4: Randomness
         Autocorrelation                       = 0.972158
         Data Are Random?
         (as measured by autocorrelation)      = NO

      5: Distribution
         Distributional test omitted due to
         non-randomness of the data

      6: Statistical Control
         (i.e., no drift in location or scale,
         data are random, distribution is
         fixed)
         Data Set is in Statistical Control?   = NO

      7: Outliers?
         (Grubbs' test omitted due to
         non-randomness of the data)
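The two randomness checks used in this section, the lag 1 autocorrelation with
its approximate 95% bands and the runs test, can be sketched in Python. This is
an illustration only, not part of the handbook: the bands use the usual
+/-1.96/sqrt(N) approximation, and the runs check uses the standard expected
value (2N-1)/3 and variance (16N-29)/90 for the total number of runs up and
down, with tied values dropped as an approximation.

      import numpy as np

      with open("DZIUBA1.DAT") as f:
          y = np.array([float(line[9:80]) for line in f.readlines()[25:] if line[9:80].strip()])

      # Lag 1 autocorrelation and approximate 95% bands (about +/-0.062 for N = 1000).
      d = y - y.mean()
      r1 = np.sum(d[1:] * d[:-1]) / np.sum(d * d)
      print("lag-1 autocorrelation =", round(r1, 3),
            " bands = +/-", round(1.96 / np.sqrt(len(y)), 3))

      # Runs up and down: compare the observed number of runs with its expected value.
      signs = np.sign(np.diff(y))
      signs = signs[signs != 0]                 # drop zero differences (ties)
      n_eff = len(signs) + 1
      runs = 1 + int(np.count_nonzero(signs[1:] != signs[:-1]))
      e_runs = (2 * n_eff - 1) / 3.0
      z = (runs - e_runs) / np.sqrt((16 * n_eff - 29) / 90.0)
      print("runs up/down =", runs, " expected =", round(e_runs, 1), " Z =", round(z, 2))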


1.4.2.7.4. Work This Example Yourself

View Dataplot Macro for this Case Study

This page allows you to repeat the analysis outlined in the case study
description on the previous page using Dataplot. It is required that you
have already downloaded and installed Dataplot and configured your
browser to run Dataplot. Output from each analysis step below will be
displayed in one or more of the Dataplot windows. The four main
windows are the Output window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps / Results and Conclusions

Click on the links below to start Dataplot and run this case study
yourself. Each step may use results from previous steps, so please be
patient. Wait until the software verifies that the current step is
complete before clicking on the next step. The links in the Results and
Conclusions entries will connect you with more detailed information
about each analysis step from the case study description.

NOTE: This case study has 1,000 points. For better performance, it
is highly recommended that you check the "No Update" box on the
Spreadsheet window for this case study. This will suppress
subsequent updating of the Spreadsheet window as the data are
created or modified.

1. Invoke Dataplot and read data.

   1. Read in the data.
      Result: You have read 1 column of numbers into Dataplot, variable Y.

2. 4-plot of the data.

   1. 4-plot of Y.
      Result: Based on the 4-plot, there are shifts in location and
      variation and the data are not random.

3. Generate the individual plots.

   1. Generate a run sequence plot.
      Result: The run sequence plot indicates that there are shifts of
      location and variation.

   2. Generate a lag plot.
      Result: The lag plot shows a strong linear pattern, which indicates
      significant non-randomness.

4. Generate summary statistics, quantitative analysis, and print a
   univariate report.

   1. Generate a table of summary statistics.
      Result: The summary statistics table displays 25+ statistics.

   2. Generate the sample mean, a confidence interval for the population
      mean, and compute a linear fit to detect drift in location.
      Result: The mean is 28.0163 and a 95% confidence interval is
      (28.0124,28.02029). The linear fit indicates drift in location since
      the slope parameter estimate is statistically significant.

   3. Generate the sample standard deviation, a confidence interval for
      the population standard deviation, and detect drift in variation by
      dividing the data into quarters and computing Levene's test for
      equal standard deviations.
      Result: The standard deviation is 0.0635 with a 95% confidence
      interval of (0.060829,0.066407). Levene's test indicates significant
      change in variation.

   4. Check for randomness by generating an autocorrelation plot and a
      runs test.
      Result: The lag 1 autocorrelation is 0.97. From the autocorrelation
      plot, this is outside the 95% confidence interval bands, indicating
      significant non-randomness.

   5. Print a univariate report (this assumes steps 2 thru 5 have already
      been run).
      Result: The results are summarized in a convenient report.


1.4.2.8. Heat Flow Meter 1

Heat Flow Meter Calibration and Stability

This example illustrates the univariate analysis of heat flow meter
calibration data.

   1. Background and Data
   2. Graphical Output and Interpretation
   3. Quantitative Output and Interpretation
   4. Work This Example Yourself


1.4.2.8.1. Background and Data

Generation

This data set was collected by Bob Zarr of NIST in January, 1990 from
a heat flow meter calibration and stability analysis. The response
variable is a calibration factor.

The motivation for studying this data set is to illustrate a well-behaved
process where the underlying assumptions hold and the process is in
statistical control.

This file can be read by Dataplot with the following commands:

      SKIP 25
      READ ZARR13.DAT Y

Resulting Data

The following are the data used for this case study.
9.206343
9.299992
9.277895
9.305795
9.275351
9.288729
9.287239
9.260973
9.303111
9.275674
9.272561
9.288454
9.255672
9.252141
9.297670
9.266534
9.256689
9.277542
9.248205

http://www.itl.nist.gov/div898/handbook/eda/section4/eda428.htm [11/13/2003 5:33:31 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4281.htm (1 of 5) [11/13/2003 5:33:31 PM]


1.4.2.8.1. Background and Data 1.4.2.8.1. Background and Data

9.252107 9.268955
9.276345 9.257269
9.278694 9.264979
9.267144 9.295500
9.246132 9.292883
9.238479 9.264188
9.269058 9.280731
9.248239 9.267336
9.257439 9.300566
9.268481 9.253089
9.288454 9.261376
9.258452 9.238409
9.286130 9.225073
9.251479 9.235526
9.257405 9.239510
9.268343 9.264487
9.291302 9.244242
9.219460 9.277542
9.270386 9.310506
9.218808 9.261594
9.241185 9.259791
9.269989 9.253089
9.226585 9.245735
9.258556 9.284058
9.286184 9.251122
9.320067 9.275385
9.327973 9.254619
9.262963 9.279526
9.248181 9.275065
9.238644 9.261952
9.225073 9.275351
9.220878 9.252433
9.271318 9.230263
9.252072 9.255150
9.281186 9.268780
9.270624 9.290389
9.294771 9.274161
9.301821 9.255707
9.278849 9.261663
9.236680 9.250455
9.233988 9.261952
9.244687 9.264041
9.221601 9.264509
9.207325 9.242114
9.258776 9.239674
9.275708 9.221553

http://www.itl.nist.gov/div898/handbook/eda/section4/eda4281.htm (2 of 5) [11/13/2003 5:33:31 PM] http://www.itl.nist.gov/div898/handbook/eda/section4/eda4281.htm (3 of 5) [11/13/2003 5:33:31 PM]


1.4.2.8.1. Background and Data 1.4.2.8.1. Background and Data

9.241935 9.237646
9.215265 9.248937
9.285930 9.256689
9.271559 9.265777
9.266046 9.299047
9.285299 9.244814
9.268989 9.287205
9.267987 9.300566
9.246166 9.256621
9.231304 9.271318
9.240768 9.275154
9.260506 9.281834
9.274355 9.253158
9.292376 9.269024
9.271170 9.282077
9.267018 9.277507
9.308838 9.284910
9.264153 9.239840
9.278822 9.268344
9.255244 9.247778
9.229221 9.225039
9.253158 9.230750
9.256292 9.270024
9.262602 9.265095
9.219793 9.284308
9.258452 9.280697
9.267987 9.263032
9.267987 9.291851
9.248903 9.252072
9.235153 9.244031
9.242933 9.283269
9.253453 9.196848
9.262671 9.231372
9.242536 9.232963
9.260803 9.234956
9.259825 9.216746
9.253123 9.274107
9.240803 9.273776
9.238712
9.263676
9.243002
9.246826
9.252107
9.261663
9.247311
9.306055



1.4.2.8.2. Graphical Output and Interpretation

Goal

The goal of this analysis is threefold:

   1. Determine if the univariate model
            Yi = C + Ei
      is appropriate and valid.
   2. Determine if the typical underlying assumptions for an "in control"
      measurement process are valid. These assumptions are:
         1. random drawings;
         2. from a fixed distribution;
         3. with the distribution having a fixed location; and
         4. the distribution having a fixed scale.
   3. Determine if the confidence interval
            Ybar +/- 2*s/sqrt(N)
      is appropriate and valid where s is the standard deviation of the
      original data.

4-Plot of Data

[4-plot of the calibration factor data: run sequence plot, lag plot,
histogram, and normal probability plot]

Interpretation

The assumptions are addressed by the graphics shown above:

   1. The run sequence plot (upper left) indicates that the data do not
      have any significant shifts in location or scale over time.
   2. The lag plot (upper right) does not indicate any non-random pattern
      in the data.
   3. The histogram (lower left) shows that the data are reasonably
      symmetric, there do not appear to be significant outliers in the
      tails, and it seems reasonable to assume that the data are from
      approximately a normal distribution.
   4. The normal probability plot (lower right) verifies that an
      assumption of normality is in fact reasonable.

Individual Plots

Although it is generally unnecessary, the plots can be generated
individually to give more detail.
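The 4-plot referred to above is a Dataplot graphic. For readers working in
Python, an approximate version can be drawn with matplotlib and scipy; this is a
sketch only, not part of the handbook, and it assumes the calibration factors
are read from ZARR13.DAT as in the Background and Data section.

      import numpy as np
      import matplotlib.pyplot as plt
      from scipy import stats

      y = np.loadtxt("ZARR13.DAT", skiprows=25)
      fig, ax = plt.subplots(2, 2, figsize=(8, 6))

      ax[0, 0].plot(np.arange(1, len(y) + 1), y)        # run sequence plot
      ax[0, 0].set_title("Run Sequence")
      ax[0, 1].plot(y[:-1], y[1:], ".")                 # lag plot: Y(i-1) versus Y(i)
      ax[0, 1].set_title("Lag Plot")
      ax[1, 0].hist(y, bins=20)                         # histogram
      ax[1, 0].set_title("Histogram")
      stats.probplot(y, dist="norm", plot=ax[1, 1])     # normal probability plot
      ax[1, 1].set_title("Normal Probability Plot")

      plt.tight_layout()
      plt.show()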


Run Sequence Plot

[run sequence plot of the data]

Lag Plot

[lag plot of the data]

Histogram (with overlaid Normal PDF)

[histogram of the data with an overlaid normal probability density function]

Normal Probability Plot

[normal probability plot of the data]


1.4.2.8.3. Quantitative Output and Interpretation 1.4.2.8.3. Quantitative Output and Interpretation

1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.8. Heat Flow Meter 1

1.4.2.8.3. Quantitative Output and Interpretation

Summary Statistics: As a first step in the analysis, a table of summary statistics is computed from the data. The following table, generated by Dataplot, shows a typical set of statistics.

SUMMARY

NUMBER OF OBSERVATIONS = 195

***********************************************************************
*        LOCATION MEASURES         *       DISPERSION MEASURES        *
***********************************************************************
*  MIDRANGE  =  0.9262411E+01      *  RANGE        =  0.1311255E+00   *
*  MEAN      =  0.9261460E+01      *  STAND. DEV.  =  0.2278881E-01   *
*  MIDMEAN   =  0.9259412E+01      *  AV. AB. DEV. =  0.1788945E-01   *
*  MEDIAN    =  0.9261952E+01      *  MINIMUM      =  0.9196848E+01   *
*            =                     *  LOWER QUART. =  0.9246826E+01   *
*            =                     *  LOWER HINGE  =  0.9246496E+01   *
*            =                     *  UPPER HINGE  =  0.9275530E+01   *
*            =                     *  UPPER QUART. =  0.9275708E+01   *
*            =                     *  MAXIMUM      =  0.9327973E+01   *
***********************************************************************
*       RANDOMNESS MEASURES        *     DISTRIBUTIONAL MEASURES      *
***********************************************************************
*  AUTOCO COEF =  0.2805789E+00    *  ST. 3RD MOM. = -0.8537455E-02   *
*              =  0.0000000E+00    *  ST. 4TH MOM. =  0.3049067E+01   *
*              =  0.0000000E+00    *  ST. WILK-SHA =  0.9458605E+01   *
*              =                   *  UNIFORM PPCC =  0.9735289E+00   *
*              =                   *  NORMAL PPCC  =  0.9989640E+00   *
*              =                   *  TUK -.5 PPCC =  0.8927904E+00   *
*              =                   *  CAUCHY PPCC  =  0.6360204E+00   *
***********************************************************************

Location: One way to quantify a change in location over time is to fit a straight line to the data set using the index variable X = 1, 2, ..., N, with N denoting the number of observations. If there is no significant drift in the location, the slope parameter should be zero. For this data set, Dataplot generates the following output:

LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N       =  195
NUMBER OF VARIABLES =    1
NO REPLICATION CASE

     PARAMETER ESTIMATES       (APPROX. ST. DEV.)    T VALUE
 1  A0      9.26699            (0.3253E-02)          2849.
 2  A1  X  -0.564115E-04       (0.2878E-04)         -1.960

RESIDUAL STANDARD DEVIATION = 0.2262372E-01
RESIDUAL DEGREES OF FREEDOM = 193

The slope parameter, A1, has a t value of -1.96 which is (barely) statistically significant since it is essentially equal to the 95% level cutoff of -1.96. However, notice that the value of the slope parameter estimate is -0.00056. This slope, even though statistically significant, can essentially be considered zero.

Variation: One simple way to detect a change in variation is with a Bartlett test after dividing the data set into several equal-sized intervals. The choice of the number of intervals is somewhat arbitrary, although values of 4 or 8 are reasonable. Dataplot generated the following output for the Bartlett test.

BARTLETT TEST
(STANDARD DEFINITION)
NULL HYPOTHESIS UNDER TEST--ALL SIGMA(I) ARE EQUAL

TEST:
   DEGREES OF FREEDOM        =  3.000000
   TEST STATISTIC VALUE      =  3.147338
   CUTOFF: 95% PERCENT POINT =  7.814727
   CUTOFF: 99% PERCENT POINT =  11.34487
   CHI-SQUARE CDF VALUE      =  0.630538

   NULL HYPOTHESIS      NULL HYPOTHESIS ACCEPTANCE INTERVAL   NULL HYPOTHESIS CONCLUSION
   ALL SIGMA EQUAL      (0.000,0.950)                         ACCEPT

In this case, since the Bartlett test statistic of 3.14 is less than the critical value at the 5% significance level of 7.81, we conclude that the standard deviations are not significantly different in the 4 intervals. That is, the assumption of constant scale is valid.
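The two checks above were run in Dataplot. Purely as a hedged illustration (not the handbook's code), a roughly equivalent calculation in Python with scipy, assuming the readings are available in an array y, might be:

import numpy as np
from scipy import stats

y = np.loadtxt("heatflow.dat")      # hypothetical file of the 195 readings
n = len(y)

# Drift in location: regress Y on the index 1..N and examine the slope t value.
x = np.arange(1, n + 1)
fit = stats.linregress(x, y)
print("slope =", fit.slope, "t value =", fit.slope / fit.stderr)

# Drift in variation: Bartlett test on four equal-sized quarters of the data.
quarters = np.array_split(y, 4)
bart_stat, bart_p = stats.bartlett(*quarters)
print("Bartlett statistic =", bart_stat, "p-value =", bart_p)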


Randomness: There are many ways in which data can be non-random. However, most common forms of non-randomness can be detected with a few simple tests. The lag plot in the previous section is a simple graphical technique.

Another check is an autocorrelation plot that shows the autocorrelations for various lags. Confidence bands can be plotted at the 95% and 99% confidence levels. Points outside this band indicate statistically significant values (lag 0 is always 1). Dataplot generated the following autocorrelation plot.

The lag 1 autocorrelation, which is generally the one of greatest interest, is 0.281. The critical values at the 5% significance level are -0.087 and 0.087. This indicates that the lag 1 autocorrelation is statistically significant, so there is evidence of non-randomness.

A common test for randomness is the runs test.

RUNS UP

STATISTIC = NUMBER OF RUNS UP OF LENGTH EXACTLY I

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1   35.0    40.6667     6.4079     -0.88
  2    8.0    17.7583     3.3021     -2.96
  3   12.0     5.0806     2.0096      3.44
  4    3.0     1.1014     1.0154      1.87
  5    0.0     0.1936     0.4367     -0.44
  6    0.0     0.0287     0.1692     -0.17
  7    0.0     0.0037     0.0607     -0.06
  8    0.0     0.0004     0.0204     -0.02
  9    0.0     0.0000     0.0065     -0.01
 10    0.0     0.0000     0.0020      0.00

STATISTIC = NUMBER OF RUNS UP OF LENGTH I OR MORE

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1   58.0    64.8333     4.1439     -1.65
  2   23.0    24.1667     2.7729     -0.42
  3   15.0     6.4083     2.1363      4.02
  4    3.0     1.3278     1.1043      1.51
  5    0.0     0.2264     0.4716     -0.48
  6    0.0     0.0328     0.1809     -0.18
  7    0.0     0.0041     0.0644     -0.06
  8    0.0     0.0005     0.0215     -0.02
  9    0.0     0.0000     0.0068     -0.01
 10    0.0     0.0000     0.0021      0.00

RUNS DOWN

STATISTIC = NUMBER OF RUNS DOWN OF LENGTH EXACTLY I

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1   33.0    40.6667     6.4079     -1.20
  2   18.0    17.7583     3.3021      0.07
  3    3.0     5.0806     2.0096     -1.04
  4    3.0     1.1014     1.0154      1.87
  5    1.0     0.1936     0.4367      1.85
  6    0.0     0.0287     0.1692     -0.17
  7    0.0     0.0037     0.0607     -0.06
  8    0.0     0.0004     0.0204     -0.02
  9    0.0     0.0000     0.0065     -0.01
 10    0.0     0.0000     0.0020      0.00

STATISTIC = NUMBER OF RUNS DOWN OF LENGTH I OR MORE

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1   58.0    64.8333     4.1439     -1.65
  2   25.0    24.1667     2.7729      0.30
  3    7.0     6.4083     2.1363      0.28
  4    4.0     1.3278     1.1043      2.42
  5    1.0     0.2264     0.4716      1.64
  6    0.0     0.0328     0.1809     -0.18
  7    0.0     0.0041     0.0644     -0.06
  8    0.0     0.0005     0.0215     -0.02
  9    0.0     0.0000     0.0068     -0.01
 10    0.0     0.0000     0.0021      0.00

RUNS TOTAL = RUNS UP + RUNS DOWN

STATISTIC = NUMBER OF RUNS TOTAL OF LENGTH EXACTLY I

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1   68.0    81.3333     9.0621     -1.47
  2   26.0    35.5167     4.6698     -2.04
  3   15.0    10.1611     2.8420      1.70
  4    6.0     2.2028     1.4360      2.64
  5    1.0     0.3871     0.6176      0.99
  6    0.0     0.0574     0.2392     -0.24
  7    0.0     0.0074     0.0858     -0.09
  8    0.0     0.0008     0.0289     -0.03
  9    0.0     0.0001     0.0092     -0.01
 10    0.0     0.0000     0.0028      0.00

STATISTIC = NUMBER OF RUNS TOTAL OF LENGTH I OR MORE

  I   STAT    EXP(STAT)   SD(STAT)     Z
  1  116.0   129.6667     5.8604     -2.33
  2   48.0    48.3333     3.9215     -0.09
  3   22.0    12.8167     3.0213      3.04
  4    7.0     2.6556     1.5617      2.78
  5    1.0     0.4528     0.6669      0.82
  6    0.0     0.0657     0.2559     -0.26
  7    0.0     0.0083     0.0911     -0.09
  8    0.0     0.0009     0.0305     -0.03
  9    0.0     0.0001     0.0097     -0.01
 10    0.0     0.0000     0.0029      0.00

LENGTH OF THE LONGEST RUN UP         =  4
LENGTH OF THE LONGEST RUN DOWN       =  5
LENGTH OF THE LONGEST RUN UP OR DOWN =  5

NUMBER OF POSITIVE DIFFERENCES =  98
NUMBER OF NEGATIVE DIFFERENCES =  95
NUMBER OF ZERO DIFFERENCES     =   1

Values in the column labeled "Z" greater than 1.96 or less than -1.96 are statistically significant at the 5% level. The runs test does indicate some non-randomness.

Although the autocorrelation plot and the runs test indicate some mild non-randomness, the violation of the randomness assumption is not serious enough to warrant developing a more sophisticated model. It is common in practice that some of the assumptions are mildly violated, and it is a judgement call as to whether or not the violations are serious enough to warrant developing a more sophisticated model for the data.

Distributional Analysis: Probability plots are a graphical test for assessing if a particular distribution provides an adequate fit to a data set.

A quantitative enhancement to the probability plot is the correlation coefficient of the points on the probability plot. For this data set the correlation coefficient is 0.996. Since this is greater than the critical value of 0.987 (this is a tabulated value), the normality assumption is not rejected.

Chi-square and Kolmogorov-Smirnov goodness-of-fit tests are alternative methods for assessing distributional adequacy. The Wilk-Shapiro and Anderson-Darling tests can be used to test for normality. Dataplot generates the following output for the Anderson-Darling normality test.

ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION

1. STATISTICS:
   NUMBER OF OBSERVATIONS                =  195
   MEAN                                  =  9.261460
   STANDARD DEVIATION                    =  0.2278881E-01

   ANDERSON-DARLING TEST STATISTIC VALUE =  0.1264954
   ADJUSTED TEST STATISTIC VALUE         =  0.1290070

2. CRITICAL VALUES:
   90 % POINT    =  0.6560000
   95 % POINT    =  0.7870000
   97.5 % POINT  =  0.9180000
   99 % POINT    =  1.092000

3. CONCLUSION (AT THE 5% LEVEL):
   THE DATA DO COME FROM A NORMAL DISTRIBUTION.

The Anderson-Darling test also does not reject the normality assumption because the test statistic, 0.129, is less than the critical value at the 5% significance level of 0.918.

Outlier Analysis: A test for outliers is the Grubbs' test. Dataplot generated the following output for Grubbs' test.

GRUBBS TEST FOR OUTLIERS
(ASSUMPTION: NORMALITY)

1. STATISTICS:
   NUMBER OF OBSERVATIONS =  195
   MINIMUM                =  9.196848
   MEAN                   =  9.261460
   MAXIMUM                =  9.327973
   STANDARD DEVIATION     =  0.2278881E-01

   GRUBBS TEST STATISTIC  =  2.918673

2. PERCENT POINTS OF THE REFERENCE DISTRIBUTION FOR GRUBBS TEST STATISTIC
   0 % POINT   =  0.0000000E+00
   50 % POINT  =  2.984294
   75 % POINT  =  3.181226
   90 % POINT  =  3.424672
   95 % POINT  =  3.597898
   99 % POINT  =  3.970215

3. CONCLUSION (AT THE 5% LEVEL):
   THERE ARE NO OUTLIERS.

For this data set, Grubbs' test does not detect any outliers at the 25%, 10%, 5%, and 1% significance levels.

Model: Since the underlying assumptions were validated both graphically and analytically, with a mild violation of the randomness assumption, we conclude that a reasonable model for the data is:

   Y(i) = C + E(i)

where the constant C is estimated by the mean of the data. We can express the uncertainty for C, here estimated by 9.26146, as the 95% confidence interval (9.258242,9.264679).
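All of these checks were carried out with Dataplot. As an illustration only (and not the handbook's code), roughly equivalent quantities can be sketched in Python with scipy; the file name is hypothetical and the lag-1 critical value uses the usual 1.96/sqrt(N) approximation.

import numpy as np
from scipy import stats

y = np.loadtxt("heatflow.dat")      # hypothetical file of the 195 readings
n = len(y)

# Lag 1 autocorrelation with approximate 95% critical values +/- 1.96/sqrt(N).
r1 = np.corrcoef(y[:-1], y[1:])[0, 1]
print("lag 1 autocorrelation =", r1, "critical value ~", 1.96 / np.sqrt(n))

# Anderson-Darling test for normality (critical_values[2] is the 5% point).
ad = stats.anderson(y, dist="norm")
print("A-D statistic =", ad.statistic, "5% critical value =", ad.critical_values[2])

# Grubbs' statistic: largest absolute deviation from the mean, in standard deviation units.
g = np.max(np.abs(y - y.mean())) / y.std(ddof=1)
print("Grubbs statistic =", g)

# 95% confidence interval for the constant C (the mean).
ci = stats.t.interval(0.95, df=n - 1, loc=y.mean(), scale=stats.sem(y))
print("95% CI for C:", ci)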


Univariate Report: It is sometimes useful and convenient to summarize the above results in a report. The report for the heat flow meter data follows.

Analysis for heat flow meter data

1: Sample Size                           = 195

2: Location
   Mean                                  = 9.26146
   Standard Deviation of Mean            = 0.001632
   95% Confidence Interval for Mean      = (9.258242,9.264679)
   Drift with respect to location?       = NO

3: Variation
   Standard Deviation                    = 0.022789
   95% Confidence Interval for SD        = (0.02073,0.025307)
   Drift with respect to variation?
   (based on Bartlett's test on quarters
   of the data)                          = NO

4: Randomness
   Autocorrelation                       = 0.280579
   Data are Random?
   (as measured by autocorrelation)      = NO

5: Distribution
   Normal PPCC                           = 0.998965
   Data are Normal?
   (as measured by Normal PPCC)          = YES

6: Statistical Control
   (i.e., no drift in location or scale,
   data are random, distribution is
   fixed, here we are testing only for
   fixed normal)
   Data Set is in Statistical Control?   = YES

7: Outliers?
   (as determined by Grubbs' test)       = NO


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.8. Heat Flow Meter 1

1.4.2.8.4. Work This Example Yourself

View Dataplot Macro for this Case Study: This page allows you to repeat the analysis outlined in the case study description on the previous page using Dataplot. It is required that you have already downloaded and installed Dataplot and configured your browser to run Dataplot. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output window, the Graphics window, the Command History window, and the data sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps and Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step. The results and conclusions shown with each step link to more detailed information about that analysis step in the case study description.

1. Invoke Dataplot and read data.
   1. Read in the data.
      Result: You have read 1 column of numbers into Dataplot, variable Y.

2. 4-plot of the data.
   1. 4-plot of Y.
      Result: Based on the 4-plot, there are no shifts in location or scale, and the data seem to follow a normal distribution.

3. Generate the individual plots.
   1. Generate a run sequence plot.
      Result: The run sequence plot indicates that there are no shifts of location or scale.
   2. Generate a lag plot.
      Result: The lag plot does not indicate any significant patterns (which would show the data were not random).
   3. Generate a histogram with an overlaid normal pdf.
      Result: The histogram indicates that a normal distribution is a good distribution for these data.


   4. Generate a normal probability plot.
      Result: The normal probability plot verifies that the normal distribution is a reasonable distribution for these data.

4. Generate summary statistics, quantitative analysis, and print a univariate report.
   1. Generate a table of summary statistics.
      Result: The summary statistics table displays 25+ statistics.
   2. Generate the mean, a confidence interval for the mean, and compute a linear fit to detect drift in location.
      Result: The mean is 9.261 and a 95% confidence interval is (9.258,9.265). The linear fit indicates no drift in location since the slope parameter estimate is essentially zero.
   3. Generate the standard deviation, a confidence interval for the standard deviation, and detect drift in variation by dividing the data into quarters and computing Bartlett's test for equal standard deviations.
      Result: The standard deviation is 0.023 with a 95% confidence interval of (0.0207,0.0253). Bartlett's test indicates no significant change in variation.
   4. Check for randomness by generating an autocorrelation plot and a runs test.
      Result: The lag 1 autocorrelation is 0.28. From the autocorrelation plot, this is statistically significant at the 95% level.
   5. Check for normality by computing the normal probability plot correlation coefficient.
      Result: The normal probability plot correlation coefficient is 0.999. At the 5% level, we cannot reject the normality assumption.
   6. Check for outliers using Grubbs' test.
      Result: Grubbs' test detects no outliers at the 5% level.
   7. Print a univariate report (this assumes steps 2 thru 6 have already been run).
      Result: The results are summarized in a convenient report.


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies

1.4.2.9. Airplane Glass Failure Time

Airplane Glass Failure Time: This example illustrates the univariate analysis of airplane glass failure time data.

1. Background and Data
2. Graphical Output and Interpretation
3. Weibull Analysis
4. Lognormal Analysis
5. Gamma Analysis
6. Power Normal Analysis
7. Power Lognormal Analysis
8. Work This Example Yourself


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.1. Background and Data

Generation: This data set was collected by Ed Fuller of NIST in December, 1993. The response variable is time to failure for airplane glass under test.

Purpose of Analysis: The goal of this case study is to find a good distributional model for the data. Once a good distributional model has been determined, various percent points for glass failure will be computed.

Since the data are failure times, this case study is a form of reliability analysis. The assessing product reliability chapter contains a more complete discussion of reliability methods. This case study is meant to complement that chapter by showing the use of graphical techniques in one aspect of reliability modeling.

Failure times are basically extreme values that do not follow a normal distribution; non-parametric methods (techniques that do not rely on a specific distribution) are frequently recommended for developing confidence intervals for failure data. One problem with this approach is that sample sizes are often small due to the expense involved in collecting the data, and non-parametric methods do not work well for small sample sizes. For this reason, a parametric method based on a specific distributional model of the data is preferred if the data can be shown to follow a specific distribution. Parametric models typically have greater efficiency at the cost of more specific assumptions about the data, but it is important to verify that the distributional assumption is indeed valid. If the distributional assumption is not justified, then the conclusions drawn from the model may not be valid.

This file can be read by Dataplot with the following commands:

SKIP 25
READ FULLER2.DAT Y

Resulting Data: The following are the data used for this case study.

18.830
20.800
21.657
23.030
23.230
24.050
24.321
25.500
25.520
25.800
26.690
26.770
26.780
27.050
27.670
29.900
31.110
33.200
33.730
33.760
33.890
34.760
35.750
35.910
36.980
37.080
37.090
39.580
44.045
45.290
45.381


1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.2. Graphical Output and Interpretation


Goal: The goal of this analysis is to determine a good distributional model for these failure time data. A secondary goal is to provide estimates for various percent points of the data. Percent points provide an answer to questions of the type "At what time do we expect 5% of the airplane glass to have failed?".

Initial Plots of the Data: The first step is to generate a histogram to get an overall feel for the data.

The histogram shows the following:
● The failure times range from slightly greater than 15 to slightly less than 50.
● There are modes at approximately 28 and 38 with a gap in-between.
● The data are somewhat symmetric, but with a gap in the middle.

We next generate a normal probability plot.

The normal probability plot has a correlation coefficient of 0.980. We can use this number as a reference baseline when comparing the performance of other distributional fits.

Other Potential Distributions: There is a large number of distributions that would be distributional model candidates for the data. However, we will restrict ourselves to consideration of the following distributional models because these have proven to be useful in reliability studies.
1. Normal distribution
2. Exponential distribution
3. Weibull distribution
4. Lognormal distribution
5. Gamma distribution
6. Power normal distribution
7. Power lognormal distribution
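The 0.980 baseline above comes from the normal probability plot. The handbook computes it with Dataplot; as an illustration only, the same correlation coefficient can be obtained in Python with scipy (the file name is a placeholder):

import numpy as np
from scipy import stats

y = np.loadtxt("fuller2.dat")       # hypothetical file of the 31 failure times

# probplot returns the points of the normal probability plot plus the
# correlation coefficient r of the fitted line (the PPCC value).
(osm, osr), (slope, intercept, r) = stats.probplot(y, dist="norm")
print("normal probability plot correlation coefficient =", r)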


Approach: There are two basic questions that need to be addressed.
1. Does a given distributional model provide an adequate fit to the data?
2. Of the candidate distributional models, is there one distribution that fits the data better than the other candidate distributional models?
The use of probability plots and probability plot correlation coefficient (PPCC) plots provides answers to both of these questions.

If the distribution does not have a shape parameter, we simply generate a probability plot.
1. If we fit a straight line to the points on the probability plot, the intercept and slope of that line provide estimates of the location and scale parameters, respectively.
2. Our criterion for the "best fit" distribution is the one with the most linear probability plot. The correlation coefficient of the fitted line of the points on the probability plot, referred to as the PPCC value, provides a measure of the linearity of the probability plot, and thus a measure of how well the distribution fits the data. The PPCC values for multiple distributions can be compared to address the second question above.

If the distribution does have a shape parameter, then we are actually addressing a family of distributions rather than a single distribution. We first need to find the optimal value of the shape parameter. The PPCC plot can be used to determine the optimal parameter. We will use the PPCC plots in two stages. The first stage will be over a broad range of parameter values while the second stage will be in the neighborhood of the largest values. Although we could go further than two stages, for practical purposes two stages is sufficient. After determining an optimal value for the shape parameter, we use the probability plot as above to obtain estimates of the location and scale parameters and to determine the PPCC value. This PPCC value can be compared to the PPCC values obtained from other distributional models.

Analyses for Specific Distributions: We analyzed the data using the approach described above for the following distributional models:
1. Normal distribution - from the 4-plot above, the PPCC value was 0.980.
2. Exponential distribution - the exponential distribution is a special case of the Weibull with shape parameter equal to 1. If the Weibull analysis yields a shape parameter close to 1, then we would consider using the simpler exponential model.
3. Weibull distribution
4. Lognormal distribution
5. Gamma distribution
6. Power normal distribution
7. Power lognormal distribution

Summary of Results: The results are summarized below.

Normal Distribution
   Max PPCC = 0.980
   Estimate of location = 30.81
   Estimate of scale = 7.38
Weibull Distribution
   Max PPCC = 0.988
   Estimate of shape = 2.13
   Estimate of location = 15.9
   Estimate of scale = 16.92
Lognormal Distribution
   Max PPCC = 0.986
   Estimate of shape = 0.18
   Estimate of location = -9.96
   Estimate of scale = 40.17
Gamma Distribution
   Max PPCC = 0.987
   Estimate of shape = 11.8
   Estimate of location = 5.19
   Estimate of scale = 2.17
Power Normal Distribution
   Max PPCC = 0.987
   Estimate of shape = 0.11
   Estimate of location = 20.9
   Estimate of scale = 3.3
Power Lognormal Distribution
   Max PPCC = 0.988
   Estimate of shape = 50
   Estimate of location = 13.5
   Estimate of scale = 150.8

These results indicate that several of these distributions provide an adequate distributional model for the data. We choose the 3-parameter Weibull distribution as the most appropriate model because it provides the best balance between simplicity and best fit.
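The handbook carries out the two-stage PPCC search with Dataplot PPCC plots. As a hedged illustration only, the same idea for the Weibull family can be sketched in Python with scipy; the search ranges and file name below are assumptions, not values from the handbook.

import numpy as np
from scipy import stats

y = np.loadtxt("fuller2.dat")       # hypothetical file of the 31 failure times

def ppcc(shape):
    # Probability plot correlation coefficient for a candidate Weibull shape value.
    (_, _), (_, _, r) = stats.probplot(y, sparams=(shape,), dist="weibull_min")
    return r

# Stage 1: broad scan of the shape parameter.
broad = np.linspace(0.5, 10, 50)
best = broad[np.argmax([ppcc(c) for c in broad])]

# Stage 2: finer scan in the neighborhood of the stage 1 maximum.
fine = np.linspace(max(best - 1, 0.1), best + 1, 50)
best = fine[np.argmax([ppcc(c) for c in fine])]

# At the optimal shape, the probability plot's slope and intercept estimate
# the scale and location parameters, respectively.
(_, _), (slope, intercept, r) = stats.probplot(y, sparams=(best,), dist="weibull_min")
print("optimal shape =", best, "PPCC =", r)
print("location estimate =", intercept, "scale estimate =", slope)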


Percent Point Estimates: The final step in this analysis is to compute percent point estimates for the 1%, 2.5%, 5%, 95%, 97.5%, and 99% percent points. A percent point estimate is an estimate of the time by which a given percentage of the units will have failed. For example, the 5% point is the time at which we estimate 5% of the units will have failed.

To calculate these values, we use the Weibull percent point function with the appropriate estimates of the shape, location, and scale parameters. The Weibull percent point function can be computed in many general purpose statistical software programs, including Dataplot.

Dataplot generated the following estimates for the percent points:

Estimated percent points using Weibull Distribution

PERCENT POINT    FAILURE TIME
    0.01             17.86
    0.02             18.92
    0.05             20.10
    0.95             44.21
    0.97             47.11
    0.99             50.53

Quantitative Measures of Goodness of Fit: Although it is generally unnecessary, we can include quantitative measures of distributional goodness-of-fit. Three of the commonly used measures are:
1. Chi-square goodness-of-fit.
2. Kolmogorov-Smirnov goodness-of-fit.
3. Anderson-Darling goodness-of-fit.

In this case, the sample size of 31 precludes the use of the chi-square test since the chi-square approximation is not valid for small sample sizes. Specifically, the smallest expected frequency should be at least 5. Although we could combine classes, we will instead use one of the other tests. The Kolmogorov-Smirnov test requires a fully specified distribution. Since we need to use the data to estimate the shape, location, and scale parameters, we do not use this test here. The Anderson-Darling test is a refinement of the Kolmogorov-Smirnov test. We run this test for the normal, lognormal, and Weibull distributions.

Normal Anderson-Darling Output:

**************************************
** Anderson-Darling normal test y   **
**************************************

ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A NORMAL DISTRIBUTION

1. STATISTICS:
   NUMBER OF OBSERVATIONS                =  31
   MEAN                                  =  30.81142
   STANDARD DEVIATION                    =  7.253381

   ANDERSON-DARLING TEST STATISTIC VALUE =  0.5321903
   ADJUSTED TEST STATISTIC VALUE         =  0.5870153

2. CRITICAL VALUES:
   90 % POINT    =  0.6160000
   95 % POINT    =  0.7350000
   97.5 % POINT  =  0.8610000
   99 % POINT    =  1.021000

3. CONCLUSION (AT THE 5% LEVEL):
   THE DATA DO COME FROM A NORMAL DISTRIBUTION.

Lognormal Anderson-Darling Output:

*****************************************
** Anderson-Darling lognormal test y   **
*****************************************

ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A LOGNORMAL DISTRIBUTION

1. STATISTICS:
   NUMBER OF OBSERVATIONS                =  31
   MEAN                                  =  3.401242
   STANDARD DEVIATION                    =  0.2349026

   ANDERSON-DARLING TEST STATISTIC VALUE =  0.3888340
   ADJUSTED TEST STATISTIC VALUE         =  0.4288908

2. CRITICAL VALUES:
   90 % POINT    =  0.6160000
   95 % POINT    =  0.7350000
   97.5 % POINT  =  0.8610000
   99 % POINT    =  1.021000

3. CONCLUSION (AT THE 5% LEVEL):
   THE DATA DO COME FROM A LOGNORMAL DISTRIBUTION.
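The percent point calculation itself is a direct application of the Weibull percent point (quantile) function. As an illustrative sketch only, using the shape, location, and scale estimates quoted in this case study (2.13, 15.9, and 16.92), the equivalent call in Python with scipy would be:

from scipy import stats

# Estimated 3-parameter Weibull from the PPCC analysis above.
shape, loc, scale = 2.13, 15.9, 16.92

# Percent points: the estimated time by which each fraction p of units has failed.
for p in [0.01, 0.02, 0.05, 0.95, 0.975, 0.99]:
    t = stats.weibull_min.ppf(p, shape, loc=loc, scale=scale)
    print(p, round(t, 2))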


Weibull Anderson-Darling Output:

***************************************
** Anderson-Darling Weibull test y   **
***************************************

ANDERSON-DARLING 1-SAMPLE TEST
THAT THE DATA CAME FROM A WEIBULL DISTRIBUTION

1. STATISTICS:
   NUMBER OF OBSERVATIONS                =  31
   MEAN                                  =  30.81142
   STANDARD DEVIATION                    =  7.253381
   SHAPE PARAMETER                       =  4.635379
   SCALE PARAMETER                       =  33.67423

   ANDERSON-DARLING TEST STATISTIC VALUE =  0.5973396
   ADJUSTED TEST STATISTIC VALUE         =  0.6187967

2. CRITICAL VALUES:
   90 % POINT    =  0.6370000
   95 % POINT    =  0.7570000
   97.5 % POINT  =  0.8770000
   99 % POINT    =  1.038000

3. CONCLUSION (AT THE 5% LEVEL):
   THE DATA DO COME FROM A WEIBULL DISTRIBUTION.

Note that for the Weibull distribution, the Anderson-Darling test is actually testing the 2-parameter Weibull distribution (based on maximum likelihood estimates), not the 3-parameter Weibull distribution. However, passing the 2-parameter Weibull distribution does give evidence that the Weibull is an appropriate distributional model even though we used a different parameter estimation method.

Conclusions: The Anderson-Darling test passes all three of these distributions.


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.3. Weibull Analysis

Plots for Weibull Distribution: The following plots were generated for a Weibull distribution.

Conclusions: We can make the following conclusions from these plots.
1. The optimal value, in the sense of having the most linear probability plot, of the shape parameter gamma is 2.13.
2. At the optimal value of the shape parameter, the PPCC value is 0.988.
3. At the optimal value of the shape parameter, the estimate of the location parameter is 15.90 and the estimate of the scale parameter is 16.92.
4. Fine tuning the estimate of gamma (from 2 to 2.13) has minimal impact on the PPCC value.


Alternative Plots: The Weibull plot and the Weibull hazard plot are alternative graphical analysis procedures to the PPCC plots and probability plots. These two procedures, especially the Weibull plot, are very commonly employed. That notwithstanding, the disadvantage of these two procedures is that they both assume that the location parameter (i.e., the lower bound) is zero and that we are fitting a 2-parameter Weibull instead of a 3-parameter Weibull. The advantage is that there is an extensive literature on these methods and they have been designed to work with either censored or uncensored data.

Weibull Plot

This Weibull plot shows the following:
1. The Weibull plot is approximately linear, indicating that the 2-parameter Weibull provides an adequate fit to the data.
2. The estimate of the shape parameter is 5.28 and the estimate of the scale parameter is 33.32.

Weibull Hazard Plot

The construction and interpretation of the Weibull hazard plot is discussed in the Assessing Product Reliability chapter.
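As an aside (not part of the handbook), the 2-parameter Weibull plot can be sketched directly: plot ln(-ln(1-F)) against ln(t) using a median-rank estimate of F. An approximately straight line supports the 2-parameter Weibull, its slope estimates the shape parameter, and the intercept gives the scale. The file name below is a placeholder.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

t = np.sort(np.loadtxt("fuller2.dat"))          # hypothetical file of failure times
n = len(t)

# Median-rank estimate of the failure probability for each ordered time.
F = (np.arange(1, n + 1) - 0.3) / (n + 0.4)

# Weibull plot coordinates: a 2-parameter Weibull is linear on these axes.
x = np.log(t)
w = np.log(-np.log(1.0 - F))

fit = stats.linregress(x, w)
print("shape estimate =", fit.slope,
      "scale estimate =", np.exp(-fit.intercept / fit.slope))

plt.plot(x, w, "o", x, fit.intercept + fit.slope * x, "-")
plt.xlabel("ln(failure time)")
plt.ylabel("ln(-ln(1 - F))")
plt.title("Weibull Plot")
plt.show()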


1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.4. Lognormal Analysis


Plots for Lognormal Distribution: The following plots were generated for a lognormal distribution.

Conclusions: We can make the following conclusions from these plots.


1. The optimal value, in the sense of having the most linear
probability plot, of the shape parameter is 0.18.
2. At the optimal value of the shape parameter, the PPCC value is
0.986.
3. At the optimal value of the shape parameter, the estimate of the
location parameter is -9.96 and the estimate of the scale parameter
is 40.17.
4. Fine tuning the estimate of the shape parameter (from 0.2 to 0.18)
has minimal impact on the PPCC value.


1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.5. Gamma Analysis


Plots for Gamma Distribution: The following plots were generated for a gamma distribution.

Conclusions: We can make the following conclusions from these plots.


1. The optimal value, in the sense of having the most linear
probability plot, of the shape parameter is 11.8.
2. At the optimal value of the shape parameter, the PPCC value is
0.987.
3. At the optimal value of the shape parameter, the estimate of the
location parameter is 5.19 and the estimate of the scale parameter
is 2.17.
4. Fine tuning the estimate of the shape parameter (from 12 to 11.8) has some impact on the PPCC value (from 0.978 to 0.987).


1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.6. Power Normal Analysis


Plots for Power Normal Distribution: The following plots were generated for a power normal distribution.

Conclusions: We can make the following conclusions from these plots.


1. A reasonable value, in the sense of having the most linear
probability plot, of the shape parameter p is 0.11.
2. At this value of the shape parameter, the PPCC value is 0.987.
3. At the optimal value of the shape parameter, the estimate of the
location parameter is 20.9 and the estimate of the scale parameter
is 3.3.
4. Fine tuning the estimate of p (from 1 to 0.11) results in a slight improvement of the computed PPCC value (from 0.980 to 0.987).


1. Exploratory Data Analysis


1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.7. Power Lognormal Analysis


Plots for Power Lognormal Distribution: The following plots were generated for a power lognormal distribution.

Conclusions: We can make the following conclusions from these plots.


1. A reasonable value, in the sense of having the most linear
probability plot, of the shape parameter p is 100 (i.e., p is
asymptotically increasing).
2. At this value of the shape parameter, the PPCC value is 0.987.
3. At this value of the shape parameter, the estimate of the location
parameter is 12.01 and the estimate of the scale parameter is
212.92.
4. Fine tuning the estimate of p (from 50 to 100) has minimal impact
on the PPCC value.

1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.9. Airplane Glass Failure Time

1.4.2.9.8. Work This Example Yourself

View Dataplot Macro for this Case Study: This page allows you to repeat the analysis outlined in the case study description on the previous page using Dataplot. It is required that you have already downloaded and installed Dataplot and configured your browser to run Dataplot. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output window, the Graphics window, the Command History window, and the data sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps and Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step. The results and conclusions shown with each step link to more detailed information about that analysis step in the case study description.

1. Invoke Dataplot and read data.
   1. Read in the data.
      Result: You have read 1 column of numbers into Dataplot, variable Y.

2. 4-plot of the data.
   1. 4-plot of Y.
      Result: The failure times are in the range 15 to 50. The histogram and normal probability plot indicate a normal distribution fits the data reasonably well, but we can probably do better.

3. Generate the Weibull analysis.
   1. Generate 2 iterations of the Weibull PPCC plot, a Weibull probability plot, and estimate some percent points.
      Result: The Weibull analysis results in a maximum PPCC value of 0.988.
   2. Generate a Weibull plot.
      Result: The Weibull plot permits the estimation of a 2-parameter Weibull model.
   3. Generate a Weibull hazard plot.
      Result: The Weibull hazard plot is approximately linear, indicating that the Weibull provides a good distributional model for these data.

4. Generate the lognormal analysis.
   1. Generate 2 iterations of the lognormal PPCC plot and a lognormal probability plot.
      Result: The lognormal analysis results in a maximum PPCC value of 0.986.

5. Generate the gamma analysis.
   1. Generate 2 iterations of the gamma PPCC plot and a gamma probability plot.
      Result: The gamma analysis results in a maximum PPCC value of 0.987.

6. Generate the power normal analysis.
   1. Generate 2 iterations of the power normal PPCC plot and a power normal probability plot.
      Result: The power normal analysis results in a maximum PPCC value of 0.988.

7. Generate the power lognormal analysis.
   1. Generate 2 iterations of the power lognormal PPCC plot and a power lognormal probability plot.
      Result: The power lognormal analysis results in a maximum PPCC value of 0.987.

8. Generate quantitative goodness of fit tests.
   1. Generate Anderson-Darling test for normality.
      Result: The Anderson-Darling normality test indicates the normal distribution provides an adequate fit to the data.
   2. Generate Anderson-Darling test for lognormal distribution.
      Result: The Anderson-Darling lognormal test indicates the lognormal distribution provides an adequate fit to the data.
   3. Generate Anderson-Darling test for Weibull distribution.
      Result: The Anderson-Darling Weibull test indicates the Weibull distribution provides an adequate fit to the data.


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies

1.4.2.10. Ceramic Strength

Ceramic Strength: This case study analyzes the effect of machining factors on the strength of ceramics.

1. Background and Data
2. Analysis of the Response Variable
3. Analysis of Batch Effect
4. Analysis of Lab Effect
5. Analysis of Primary Factors
6. Work This Example Yourself


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.1. Background and Data

Generation: The data for this case study were collected by Said Jahanmir of the NIST Ceramics Division in 1996 in connection with a NIST/industry ceramics consortium for strength optimization of ceramics.

The motivation for studying this data set is to illustrate the analysis of multiple factors from a designed experiment.

This case study will utilize only a subset of a full study that was conducted by Lisa Gill and James Filliben of the NIST Statistical Engineering Division.

The response variable is a measure of the strength of the ceramic material (bonded Si nitrate). The complete data set contains the following variables:
1. Factor 1 = Observation ID, i.e., run number (1 to 960)
2. Factor 2 = Lab (1 to 8)
3. Factor 3 = Bar ID within lab (1 to 30)
4. Factor 4 = Test number (1 to 4)
5. Response Variable = Strength of Ceramic
6. Factor 5 = Table speed (2 levels: 0.025 and 0.125)
7. Factor 6 = Down feed rate (2 levels: 0.050 and 0.125)
8. Factor 7 = Wheel grit size (2 levels: 150 and 80)
9. Factor 8 = Direction (2 levels: longitudinal and transverse)
10. Factor 9 = Treatment (1 to 16)
11. Factor 10 = Set of 15 within lab (2 levels: 1 and 2)
12. Factor 11 = Replication (2 levels: 1 and 2)
13. Factor 12 = Bar Batch (1 and 2)

The four primary factors of interest are:
1. Table speed (X1)
2. Down feed rate (X2)
3. Wheel grit size (X3)


4. Direction (X4)

For this case study, we are using only half the data. Specifically, we are using the data with the direction longitudinal. Therefore, we have only three primary factors.

In addition, we are interested in the nuisance factors
1. Lab
2. Batch

The complete file can be read into Dataplot with the following commands:

DIMENSION 20 VARIABLES
SKIP 50
READ JAHANMI2.DAT RUN RUN LAB BAR SET Y X1 TO X8 BATCH

Purpose of Analysis: The goals of this case study are:
1. Determine which of the four primary factors has the strongest effect on the strength of the ceramic material
2. Estimate the magnitude of the effects
3. Determine the optimal settings for the primary factors
4. Determine if the nuisance factors (lab and batch) have an effect on the ceramic strength

This case study is an example of a designed experiment. The Process Improvement chapter contains a detailed discussion of the construction and analysis of designed experiments. This case study is meant to complement the material in that chapter by showing how an EDA approach (emphasizing the use of graphical techniques) can be used in the analysis of designed experiments.

Resulting Data: The following are the data used for this case study.

Run Lab Batch Y X1 X2 X3
13 1 1 740.447 -1 -1 -1
14 1 2 588.375 -1 -1 -1
15 1 1 666.830 -1 -1 -1
16 1 2 531.384 -1 -1 -1
17 1 1 710.272 -1 -1 -1
18 1 2 633.417 -1 -1 -1
19 1 1 751.669 -1 -1 -1
20 1 2 619.060 -1 -1 -1
21 1 1 697.979 -1 -1 -1
22 1 2 632.447 -1 -1 -1
23 1 1 708.583 -1 -1 -1
24 1 2 624.256 -1 -1 -1
25 1 1 624.972 -1 -1 -1
26 1 2 575.143 -1 -1 -1
27 1 1 695.070 -1 -1 -1
28 1 2 549.278 -1 -1 -1
29 1 1 769.391 -1 -1 -1
30 1 2 624.972 -1 -1 -1
61 1 1 720.186 -1 1 1
62 1 2 587.695 -1 1 1
63 1 1 723.657 -1 1 1
64 1 2 569.207 -1 1 1
65 1 1 703.700 -1 1 1
66 1 2 613.257 -1 1 1
67 1 1 697.626 -1 1 1
68 1 2 565.737 -1 1 1
69 1 1 714.980 -1 1 1
70 1 2 662.131 -1 1 1
71 1 1 657.712 -1 1 1
72 1 2 543.177 -1 1 1
73 1 1 609.989 -1 1 1
74 1 2 512.394 -1 1 1
75 1 1 650.771 -1 1 1
1 1 1 608.781 -1 -1 -1 76 1 2 611.190 -1 1 1
2 1 2 569.670 -1 -1 -1 77 1 1 707.977 -1 1 1
3 1 1 689.556 -1 -1 -1 78 1 2 659.982 -1 1 1
4 1 2 747.541 -1 -1 -1 79 1 1 712.199 -1 1 1
5 1 1 618.134 -1 -1 -1 80 1 2 569.245 -1 1 1
6 1 2 612.182 -1 -1 -1 81 1 1 709.631 -1 1 1
7 1 1 680.203 -1 -1 -1 82 1 2 725.792 -1 1 1
8 1 2 607.766 -1 -1 -1 83 1 1 703.160 -1 1 1
9 1 1 726.232 -1 -1 -1 84 1 2 608.960 -1 1 1
10 1 2 605.380 -1 -1 -1 85 1 1 744.822 -1 1 1
11 1 1 518.655 -1 -1 -1 86 1 2 586.060 -1 1 1
12 1 2 589.226 -1 -1 -1 87 1 1 719.217 -1 1 1
88 1 2 617.441 -1 1 1


89 1 1 619.137 -1 1 1 195 2 1 728.499 1 -1 -1


90 1 2 592.845 -1 1 1 196 2 2 724.175 1 -1 -1
151 2 1 753.333 1 1 1 197 2 1 797.662 1 -1 -1
152 2 2 631.754 1 1 1 198 2 2 583.034 1 -1 -1
153 2 1 677.933 1 1 1 199 2 1 668.530 1 -1 -1
154 2 2 588.113 1 1 1 200 2 2 620.227 1 -1 -1
155 2 1 735.919 1 1 1 201 2 1 815.754 1 -1 -1
156 2 2 555.724 1 1 1 202 2 2 584.861 1 -1 -1
157 2 1 695.274 1 1 1 203 2 1 777.392 1 -1 -1
158 2 2 702.411 1 1 1 204 2 2 565.391 1 -1 -1
159 2 1 504.167 1 1 1 205 2 1 712.140 1 -1 -1
160 2 2 631.754 1 1 1 206 2 2 622.506 1 -1 -1
161 2 1 693.333 1 1 1 207 2 1 663.622 1 -1 -1
162 2 2 698.254 1 1 1 208 2 2 628.336 1 -1 -1
163 2 1 625.000 1 1 1 209 2 1 684.181 1 -1 -1
164 2 2 616.791 1 1 1 210 2 2 587.145 1 -1 -1
165 2 1 596.667 1 1 1 271 3 1 629.012 1 -1 1
166 2 2 551.953 1 1 1 272 3 2 584.319 1 -1 1
167 2 1 640.898 1 1 1 273 3 1 640.193 1 -1 1
168 2 2 636.738 1 1 1 274 3 2 538.239 1 -1 1
169 2 1 720.506 1 1 1 275 3 1 644.156 1 -1 1
170 2 2 571.551 1 1 1 276 3 2 538.097 1 -1 1
171 2 1 700.748 1 1 1 277 3 1 642.469 1 -1 1
172 2 2 521.667 1 1 1 278 3 2 595.686 1 -1 1
173 2 1 691.604 1 1 1 279 3 1 639.090 1 -1 1
174 2 2 587.451 1 1 1 280 3 2 648.935 1 -1 1
175 2 1 636.738 1 1 1 281 3 1 439.418 1 -1 1
176 2 2 700.422 1 1 1 282 3 2 583.827 1 -1 1
177 2 1 731.667 1 1 1 283 3 1 614.664 1 -1 1
178 2 2 595.819 1 1 1 284 3 2 534.905 1 -1 1
179 2 1 635.079 1 1 1 285 3 1 537.161 1 -1 1
180 2 2 534.236 1 1 1 286 3 2 569.858 1 -1 1
181 2 1 716.926 1 -1 -1 287 3 1 656.773 1 -1 1
182 2 2 606.188 1 -1 -1 288 3 2 617.246 1 -1 1
183 2 1 759.581 1 -1 -1 289 3 1 659.534 1 -1 1
184 2 2 575.303 1 -1 -1 290 3 2 610.337 1 -1 1
185 2 1 673.903 1 -1 -1 291 3 1 695.278 1 -1 1
186 2 2 590.628 1 -1 -1 292 3 2 584.192 1 -1 1
187 2 1 736.648 1 -1 -1 293 3 1 734.040 1 -1 1
188 2 2 729.314 1 -1 -1 294 3 2 598.853 1 -1 1
189 2 1 675.957 1 -1 -1 295 3 1 687.665 1 -1 1
190 2 2 619.313 1 -1 -1 296 3 2 554.774 1 -1 1
191 2 1 729.230 1 -1 -1 297 3 1 710.858 1 -1 1
192 2 2 624.234 1 -1 -1 298 3 2 605.694 1 -1 1
193 2 1 697.239 1 -1 -1 299 3 1 701.716 1 -1 1
194 2 2 651.304 1 -1 -1 300 3 2 627.516 1 -1 1


301 3 1 382.133 1 1 -1 377 4 1 755.864 -1 -1 1


302 3 2 574.522 1 1 -1 378 4 2 643.449 -1 -1 1
303 3 1 719.744 1 1 -1 379 4 1 692.945 -1 -1 1
304 3 2 582.682 1 1 -1 380 4 2 581.593 -1 -1 1
305 3 1 756.820 1 1 -1 381 4 1 766.532 -1 -1 1
306 3 2 563.872 1 1 -1 382 4 2 494.122 -1 -1 1
307 3 1 690.978 1 1 -1 383 4 1 725.663 -1 -1 1
308 3 2 715.962 1 1 -1 384 4 2 620.948 -1 -1 1
309 3 1 670.864 1 1 -1 385 4 1 698.818 -1 -1 1
310 3 2 616.430 1 1 -1 386 4 2 615.903 -1 -1 1
311 3 1 670.308 1 1 -1 387 4 1 760.000 -1 -1 1
312 3 2 778.011 1 1 -1 388 4 2 606.667 -1 -1 1
313 3 1 660.062 1 1 -1 389 4 1 775.272 -1 -1 1
314 3 2 604.255 1 1 -1 390 4 2 579.167 -1 -1 1
315 3 1 790.382 1 1 -1 421 4 1 708.885 -1 1 -1
316 3 2 571.906 1 1 -1 422 4 2 662.510 -1 1 -1
317 3 1 714.750 1 1 -1 423 4 1 727.201 -1 1 -1
318 3 2 625.925 1 1 -1 424 4 2 436.237 -1 1 -1
319 3 1 716.959 1 1 -1 425 4 1 642.560 -1 1 -1
320 3 2 682.426 1 1 -1 426 4 2 644.223 -1 1 -1
321 3 1 603.363 1 1 -1 427 4 1 690.773 -1 1 -1
322 3 2 707.604 1 1 -1 428 4 2 586.035 -1 1 -1
323 3 1 713.796 1 1 -1 429 4 1 688.333 -1 1 -1
324 3 2 617.400 1 1 -1 430 4 2 620.833 -1 1 -1
325 3 1 444.963 1 1 -1 431 4 1 743.973 -1 1 -1
326 3 2 689.576 1 1 -1 432 4 2 652.535 -1 1 -1
327 3 1 723.276 1 1 -1 433 4 1 682.461 -1 1 -1
328 3 2 676.678 1 1 -1 434 4 2 593.516 -1 1 -1
329 3 1 745.527 1 1 -1 435 4 1 761.430 -1 1 -1
330 3 2 563.290 1 1 -1 436 4 2 587.451 -1 1 -1
361 4 1 778.333 -1 -1 1 437 4 1 691.542 -1 1 -1
362 4 2 581.879 -1 -1 1 438 4 2 570.964 -1 1 -1
363 4 1 723.349 -1 -1 1 439 4 1 643.392 -1 1 -1
364 4 2 447.701 -1 -1 1 440 4 2 645.192 -1 1 -1
365 4 1 708.229 -1 -1 1 441 4 1 697.075 -1 1 -1
366 4 2 557.772 -1 -1 1 442 4 2 540.079 -1 1 -1
367 4 1 681.667 -1 -1 1 443 4 1 708.229 -1 1 -1
368 4 2 593.537 -1 -1 1 444 4 2 707.117 -1 1 -1
369 4 1 566.085 -1 -1 1 445 4 1 746.467 -1 1 -1
370 4 2 632.585 -1 -1 1 446 4 2 621.779 -1 1 -1
371 4 1 687.448 -1 -1 1 447 4 1 744.819 -1 1 -1
372 4 2 671.350 -1 -1 1 448 4 2 585.777 -1 1 -1
373 4 1 597.500 -1 -1 1 449 4 1 655.029 -1 1 -1
374 4 2 569.530 -1 -1 1 450 4 2 703.980 -1 1 -1
375 4 1 637.410 -1 -1 1 541 5 1 715.224 -1 -1 -1
376 4 2 581.667 -1 -1 1 542 5 2 698.237 -1 -1 -1


543 5 1 614.417 -1 -1 -1 589 5 1 684.812 1 -1 1


544 5 2 757.120 -1 -1 -1 590 5 2 621.471 1 -1 1
545 5 1 761.363 -1 -1 -1 591 5 1 738.161 1 -1 1
546 5 2 621.751 -1 -1 -1 592 5 2 612.727 1 -1 1
547 5 1 716.106 -1 -1 -1 593 5 1 671.492 1 -1 1
548 5 2 472.125 -1 -1 -1 594 5 2 606.460 1 -1 1
549 5 1 659.502 -1 -1 -1 595 5 1 709.771 1 -1 1
550 5 2 612.700 -1 -1 -1 596 5 2 571.760 1 -1 1
551 5 1 730.781 -1 -1 -1 597 5 1 685.199 1 -1 1
552 5 2 583.170 -1 -1 -1 598 5 2 599.304 1 -1 1
553 5 1 546.928 -1 -1 -1 599 5 1 624.973 1 -1 1
554 5 2 599.771 -1 -1 -1 600 5 2 579.459 1 -1 1
555 5 1 734.203 -1 -1 -1 601 6 1 757.363 1 1 1
556 5 2 549.227 -1 -1 -1 602 6 2 761.511 1 1 1
557 5 1 682.051 -1 -1 -1 603 6 1 633.417 1 1 1
558 5 2 605.453 -1 -1 -1 604 6 2 566.969 1 1 1
559 5 1 701.341 -1 -1 -1 605 6 1 658.754 1 1 1
560 5 2 569.599 -1 -1 -1 606 6 2 654.397 1 1 1
561 5 1 759.729 -1 -1 -1 607 6 1 664.666 1 1 1
562 5 2 637.233 -1 -1 -1 608 6 2 611.719 1 1 1
563 5 1 689.942 -1 -1 -1 609 6 1 663.009 1 1 1
564 5 2 621.774 -1 -1 -1 610 6 2 577.409 1 1 1
565 5 1 769.424 -1 -1 -1 611 6 1 773.226 1 1 1
566 5 2 558.041 -1 -1 -1 612 6 2 576.731 1 1 1
567 5 1 715.286 -1 -1 -1 613 6 1 708.261 1 1 1
568 5 2 583.170 -1 -1 -1 614 6 2 617.441 1 1 1
569 5 1 776.197 -1 -1 -1 615 6 1 739.086 1 1 1
570 5 2 345.294 -1 -1 -1 616 6 2 577.409 1 1 1
571 5 1 547.099 1 -1 1 617 6 1 667.786 1 1 1
572 5 2 570.999 1 -1 1 618 6 2 548.957 1 1 1
573 5 1 619.942 1 -1 1 619 6 1 674.481 1 1 1
574 5 2 603.232 1 -1 1 620 6 2 623.315 1 1 1
575 5 1 696.046 1 -1 1 621 6 1 695.688 1 1 1
576 5 2 595.335 1 -1 1 622 6 2 621.761 1 1 1
577 5 1 573.109 1 -1 1 623 6 1 588.288 1 1 1
578 5 2 581.047 1 -1 1 624 6 2 553.978 1 1 1
579 5 1 638.794 1 -1 1 625 6 1 545.610 1 1 1
580 5 2 455.878 1 -1 1 626 6 2 657.157 1 1 1
581 5 1 708.193 1 -1 1 627 6 1 752.305 1 1 1
582 5 2 627.880 1 -1 1 628 6 2 610.882 1 1 1
583 5 1 502.825 1 -1 1 629 6 1 684.523 1 1 1
584 5 2 464.085 1 -1 1 630 6 2 552.304 1 1 1
585 5 1 632.633 1 -1 1 631 6 1 717.159 -1 1 -1
586 5 2 596.129 1 -1 1 632 6 2 545.303 -1 1 -1
587 5 1 683.382 1 -1 1 633 6 1 721.343 -1 1 -1
588 5 2 640.371 1 -1 1 634 6 2 651.934 -1 1 -1


635 6 1 750.623 -1 1 -1 741 7 1 732.039 1 1 -1


636 6 2 635.240 -1 1 -1 742 7 2 603.883 1 1 -1
637 6 1 776.488 -1 1 -1 743 7 1 751.832 1 1 -1
638 6 2 641.083 -1 1 -1 744 7 2 608.643 1 1 -1
639 6 1 750.623 -1 1 -1 745 7 1 618.663 1 1 -1
640 6 2 645.321 -1 1 -1 746 7 2 630.778 1 1 -1
641 6 1 600.840 -1 1 -1 747 7 1 744.845 1 1 -1
642 6 2 566.127 -1 1 -1 748 7 2 623.063 1 1 -1
643 6 1 686.196 -1 1 -1 749 7 1 690.826 1 1 -1
644 6 2 647.844 -1 1 -1 750 7 2 472.463 1 1 -1
645 6 1 687.870 -1 1 -1 811 7 1 666.893 -1 1 1
646 6 2 554.815 -1 1 -1 812 7 2 645.932 -1 1 1
647 6 1 725.527 -1 1 -1 813 7 1 759.860 -1 1 1
648 6 2 620.087 -1 1 -1 814 7 2 577.176 -1 1 1
649 6 1 658.796 -1 1 -1 815 7 1 683.752 -1 1 1
650 6 2 711.301 -1 1 -1 816 7 2 567.530 -1 1 1
651 6 1 690.380 -1 1 -1 817 7 1 729.591 -1 1 1
652 6 2 644.355 -1 1 -1 818 7 2 821.654 -1 1 1
653 6 1 737.144 -1 1 -1 819 7 1 730.706 -1 1 1
654 6 2 713.812 -1 1 -1 820 7 2 684.490 -1 1 1
655 6 1 663.851 -1 1 -1 821 7 1 763.124 -1 1 1
656 6 2 696.707 -1 1 -1 822 7 2 600.427 -1 1 1
657 6 1 766.630 -1 1 -1 823 7 1 724.193 -1 1 1
658 6 2 589.453 -1 1 -1 824 7 2 686.023 -1 1 1
659 6 1 625.922 -1 1 -1 825 7 1 630.352 -1 1 1
660 6 2 634.468 -1 1 -1 826 7 2 628.109 -1 1 1
721 7 1 694.430 1 1 -1 827 7 1 750.338 -1 1 1
722 7 2 599.751 1 1 -1 828 7 2 605.214 -1 1 1
723 7 1 730.217 1 1 -1 829 7 1 752.417 -1 1 1
724 7 2 624.542 1 1 -1 830 7 2 640.260 -1 1 1
725 7 1 700.770 1 1 -1 831 7 1 707.899 -1 1 1
726 7 2 723.505 1 1 -1 832 7 2 700.767 -1 1 1
727 7 1 722.242 1 1 -1 833 7 1 715.582 -1 1 1
728 7 2 674.717 1 1 -1 834 7 2 665.924 -1 1 1
729 7 1 763.828 1 1 -1 835 7 1 728.746 -1 1 1
730 7 2 608.539 1 1 -1 836 7 2 555.926 -1 1 1
731 7 1 695.668 1 1 -1 837 7 1 591.193 -1 1 1
732 7 2 612.135 1 1 -1 838 7 2 543.299 -1 1 1
733 7 1 688.887 1 1 -1 839 7 1 592.252 -1 1 1
734 7 2 591.935 1 1 -1 840 7 2 511.030 -1 1 1
735 7 1 531.021 1 1 -1 901 8 1 740.833 -1 -1 1
736 7 2 676.656 1 1 -1 902 8 2 583.994 -1 -1 1
737 7 1 698.915 1 1 -1 903 8 1 786.367 -1 -1 1
738 7 2 647.323 1 1 -1 904 8 2 611.048 -1 -1 1
739 7 1 735.905 1 1 -1 905 8 1 712.386 -1 -1 1
740 7 2 811.970 1 1 -1 906 8 2 623.338 -1 -1 1


907 8 1 738.333 -1 -1 1 953 8 1 677.933 1 -1 -1


908 8 2 679.585 -1 -1 1 954 8 2 611.874 1 -1 -1
909 8 1 741.480 -1 -1 1 955 8 1 674.600 1 -1 -1
910 8 2 665.004 -1 -1 1 956 8 2 698.254 1 -1 -1
911 8 1 729.167 -1 -1 1 957 8 1 611.999 1 -1 -1
912 8 2 655.860 -1 -1 1 958 8 2 748.130 1 -1 -1
913 8 1 795.833 -1 -1 1 959 8 1 530.680 1 -1 -1
914 8 2 715.711 -1 -1 1 960 8 2 689.942 1 -1 -1
915 8 1 723.502 -1 -1 1
916 8 2 611.999 -1 -1 1
917 8 1 718.333 -1 -1 1
918 8 2 577.722 -1 -1 1
919 8 1 768.080 -1 -1 1
920 8 2 615.129 -1 -1 1
921 8 1 747.500 -1 -1 1
922 8 2 540.316 -1 -1 1
923 8 1 775.000 -1 -1 1
924 8 2 711.667 -1 -1 1
925 8 1 760.599 -1 -1 1
926 8 2 639.167 -1 -1 1
927 8 1 758.333 -1 -1 1
928 8 2 549.491 -1 -1 1
929 8 1 682.500 -1 -1 1
930 8 2 684.167 -1 -1 1
931 8 1 658.116 1 -1 -1
932 8 2 672.153 1 -1 -1
933 8 1 738.213 1 -1 -1
934 8 2 594.534 1 -1 -1
935 8 1 681.236 1 -1 -1
936 8 2 627.650 1 -1 -1
937 8 1 704.904 1 -1 -1
938 8 2 551.870 1 -1 -1
939 8 1 693.623 1 -1 -1
940 8 2 594.534 1 -1 -1
941 8 1 624.993 1 -1 -1
942 8 2 602.660 1 -1 -1
943 8 1 700.228 1 -1 -1
944 8 2 585.450 1 -1 -1
945 8 1 611.874 1 -1 -1
946 8 2 555.724 1 -1 -1
947 8 1 579.167 1 -1 -1
948 8 2 574.934 1 -1 -1
949 8 1 720.872 1 -1 -1
950 8 2 584.625 1 -1 -1
951 8 1 690.320 1 -1 -1
952 8 2 555.724 1 -1 -1


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.2. Analysis of the Response Variable

Numerical Summary: As a first step in the analysis, a table of summary statistics is computed for the response variable. The following table, generated by Dataplot, shows a typical set of statistics.

SUMMARY

NUMBER OF OBSERVATIONS = 480

***********************************************************************
*        LOCATION MEASURES         *       DISPERSION MEASURES        *
***********************************************************************
*  MIDRANGE  =  0.5834740E+03      *  RANGE        =  0.4763600E+03   *
*  MEAN      =  0.6500773E+03      *  STAND. DEV.  =  0.7463826E+02   *
*  MIDMEAN   =  0.6426155E+03      *  AV. AB. DEV. =  0.6184948E+02   *
*  MEDIAN    =  0.6466275E+03      *  MINIMUM      =  0.3452940E+03   *
*            =                     *  LOWER QUART. =  0.5960515E+03   *
*            =                     *  LOWER HINGE  =  0.5959740E+03   *
*            =                     *  UPPER HINGE  =  0.7084220E+03   *
*            =                     *  UPPER QUART. =  0.7083415E+03   *
*            =                     *  MAXIMUM      =  0.8216540E+03   *
***********************************************************************
*       RANDOMNESS MEASURES        *     DISTRIBUTIONAL MEASURES      *
***********************************************************************
*  AUTOCO COEF = -0.2290508E+00    *  ST. 3RD MOM. = -0.3682922E+00   *
*              =  0.0000000E+00    *  ST. 4TH MOM. =  0.3220554E+01   *
*              =  0.0000000E+00    *  ST. WILK-SHA =  0.3877698E+01   *
*              =                   *  UNIFORM PPCC =  0.9756916E+00   *
*              =                   *  NORMAL PPCC  =  0.9906310E+00   *
*              =                   *  TUK -.5 PPCC =  0.8357126E+00   *
*              =                   *  CAUCHY PPCC  =  0.5063868E+00   *
***********************************************************************

From the above output, the mean strength is 650.08 and the standard deviation of the strength is 74.64.

4-Plot: The next step is to generate a 4-plot of the response variable.

This 4-plot shows:
1. The run sequence plot (upper left corner) shows that the location and scale are relatively constant. It also shows a few outliers on the low side. Most of the points are in the range 500 to 750. However, there are about half a dozen points in the 300 to 450 range that may require special attention.
   A run sequence plot is useful for designed experiments in that it can reveal time effects. Time is normally a nuisance factor. That is, the time order on which runs are made should not have a significant effect on the response. If a time effect does appear to exist, this means that there is a potential bias in the experiment that needs to be investigated and resolved.
2. The lag plot (the upper right corner) does not show any significant structure. This is another tool for detecting any potential time effect.
3. The histogram (the lower left corner) shows the response appears to be reasonably symmetric, but with a bimodal distribution.
4. The normal probability plot (the lower right corner) shows some curvature indicating that distributions other than the normal may provide a better fit.


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.3. Analysis of the Batch Effect

Batch is a Nuisance Factor
The two nuisance factors in this experiment are the batch number and the lab. There are 2 batches and 8 labs. Ideally, these factors will have minimal effect on the response variable. We will investigate the batch factor first.

Bihistogram
This bihistogram shows the following.
1. There does appear to be a batch effect.
2. The batch 1 responses are centered at 700 while the batch 2 responses are centered at 625. That is, the batch effect is approximately 75 units.
3. The variability is comparable for the 2 batches.
4. Batch 1 has some skewness in the lower tail. Batch 2 has some skewness in the center of the distribution, but not as much in the tails compared to batch 1.
5. Both batches have a few low-lying points.
Although we could stop with the bihistogram, we will show a few other commonly used two-sample graphical techniques for comparison.

Quantile-Quantile Plot
This q-q plot shows the following.
1. Except for a few points in the right tail, the batch 1 values have higher quantiles than the batch 2 values. This implies that batch 1 has a greater location value than batch 2.
2. The q-q plot is not linear. This implies that the difference between the batches is not explained simply by a shift in location. That is, the variation and/or skewness varies as well. From the bihistogram, it appears that the skewness in batch 2 is the most likely explanation for the non-linearity in the q-q plot.
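A bihistogram and a two-sample q-q plot of this kind are easy to sketch with NumPy and matplotlib. This is an illustration rather than the handbook's Dataplot commands; the arrays y1 and y2 below are synthetic stand-ins built from the batch summary statistics reported later on this page and would be replaced by the actual per-batch strength readings:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    y1 = rng.normal(689.0, 65.5, 240)   # stand-in for batch 1 strengths
    y2 = rng.normal(611.2, 61.9, 240)   # stand-in for batch 2 strengths

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))

    # Bihistogram: batch 1 plotted upward, batch 2 plotted downward on a shared axis.
    bins = np.linspace(min(y1.min(), y2.min()), max(y1.max(), y2.max()), 25)
    counts1, _ = np.histogram(y1, bins=bins)
    counts2, _ = np.histogram(y2, bins=bins)
    ax1.bar(bins[:-1], counts1, width=np.diff(bins), align="edge")
    ax1.bar(bins[:-1], -counts2, width=np.diff(bins), align="edge")
    ax1.axhline(0, color="black")
    ax1.set_title("Bihistogram (batch 1 up, batch 2 down)")

    # Two-sample q-q plot: matching quantiles of batch 1 against batch 2.
    q = np.linspace(0.01, 0.99, 99)
    ax2.plot(np.quantile(y2, q), np.quantile(y1, q), "o")
    lims = [min(y1.min(), y2.min()), max(y1.max(), y2.max())]
    ax2.plot(lims, lims, "--")          # 45-degree reference line
    ax2.set_xlabel("batch 2 quantiles")
    ax2.set_ylabel("batch 1 quantiles")
    ax2.set_title("Two-sample q-q plot")

    plt.tight_layout()
    plt.show()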


Box Plot
This box plot shows the following.
1. The median for batch 1 is approximately 700 while the median for batch 2 is approximately 600.
2. The spread is reasonably similar for both batches, maybe slightly larger for batch 1.
3. Both batches have a number of outliers on the low side. Batch 2 also has a few outliers on the high side. Box plots are a particularly effective method for identifying the presence of outliers.

Block Plots
A block plot is generated for each of the eight labs, with "1" and "2" denoting the batch numbers. In the first plot, we do not include any of the primary factors. The next 3 block plots include one of the primary factors. Note that each of the 3 primary factors (table speed = X1, down feed rate = X2, wheel grit size = X3) has 2 levels. With 8 labs and 2 levels for the primary factor, we would expect 16 separate blocks on these plots. The fact that some of these blocks are missing indicates that some of the combinations of lab and primary factor are empty.

These block plots show the following.
1. The mean for batch 1 is greater than the mean for batch 2 in all of the cases above. This is strong evidence that the batch effect is real and consistent across labs and primary factors.

Quantitative Techniques
We can confirm some of the conclusions drawn from the above graphics by using quantitative techniques. The two sample t-test can be used to test whether or not the means from the two batches are equal and the F-test can be used to test whether or not the standard deviations from the two batches are equal.

Two Sample T-Test
The following is the Dataplot output from the two sample t-test.

                  T-TEST
                 (2-SAMPLE)
   NULL HYPOTHESIS UNDER TEST--POPULATION MEANS MU1 = MU2

   SAMPLE 1:
      NUMBER OF OBSERVATIONS      =      240
      MEAN                        =      688.9987
      STANDARD DEVIATION          =      65.54909
      STANDARD DEVIATION OF MEAN  =      4.231175

   SAMPLE 2:
      NUMBER OF OBSERVATIONS      =      240
      MEAN                        =      611.1559
      STANDARD DEVIATION          =      61.85425
      STANDARD DEVIATION OF MEAN  =      3.992675

   IF ASSUME SIGMA1 = SIGMA2:
      POOLED STANDARD DEVIATION   =      63.72845
      DIFFERENCE (DELTA) IN MEANS =      77.84271
      STANDARD DEVIATION OF DELTA =      5.817585
      T-TEST STATISTIC VALUE      =      13.38059
      DEGREES OF FREEDOM          =      478.0000
      T-TEST STATISTIC CDF VALUE  =      1.000000

   IF NOT ASSUME SIGMA1 = SIGMA2:
      STANDARD DEVIATION SAMPLE 1 =      65.54909
      STANDARD DEVIATION SAMPLE 2 =      61.85425
      BARTLETT CDF VALUE          =      0.629618
      DIFFERENCE (DELTA) IN MEANS =      77.84271
      STANDARD DEVIATION OF DELTA =      5.817585
      T-TEST STATISTIC VALUE      =      13.38059
      EQUIVALENT DEG. OF FREEDOM  =      476.3999
      T-TEST STATISTIC CDF VALUE  =      1.000000

      ALTERNATIVE-       ALTERNATIVE-           ALTERNATIVE-
      HYPOTHESIS         HYPOTHESIS             HYPOTHESIS
                         ACCEPTANCE INTERVAL    CONCLUSION
      MU1 <> MU2         (0,0.025) (0.975,1)    ACCEPT
      MU1 < MU2          (0,0.05)               REJECT
      MU1 > MU2          (0.95,1)               ACCEPT

The t-test indicates that the mean for batch 1 is larger than the mean for batch 2 (at the 5% significance level).
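The same comparison of means can be sketched with SciPy. This is an illustration rather than the handbook's Dataplot macro; the two arrays are synthetic stand-ins built from the summary statistics above and would be replaced by the actual 240 strength values per batch:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    batch1 = rng.normal(loc=689.0, scale=65.5, size=240)   # stand-in data
    batch2 = rng.normal(loc=611.2, scale=61.9, size=240)   # stand-in data

    # Pooled (equal-variance) two sample t-test, as in the "IF ASSUME SIGMA1 = SIGMA2" block.
    t_pooled, p_pooled = stats.ttest_ind(batch1, batch2, equal_var=True)

    # Welch's t-test, analogous to the "IF NOT ASSUME SIGMA1 = SIGMA2" block.
    t_welch, p_welch = stats.ttest_ind(batch1, batch2, equal_var=False)

    print(f"pooled t = {t_pooled:.2f}, two-sided p = {p_pooled:.3g}")
    print(f"Welch  t = {t_welch:.2f}, two-sided p = {p_welch:.3g}")
    # A two-sided p-value below 0.05 leads to rejecting MU1 = MU2 at the 5% significance level.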


F-Test
The following is the Dataplot output from the F-test.

                  F-TEST
   NULL HYPOTHESIS UNDER TEST--SIGMA1 = SIGMA2
   ALTERNATIVE HYPOTHESIS UNDER TEST--SIGMA1 NOT EQUAL SIGMA2

   SAMPLE 1:
      NUMBER OF OBSERVATIONS      =      240
      MEAN                        =      688.9987
      STANDARD DEVIATION          =      65.54909

   SAMPLE 2:
      NUMBER OF OBSERVATIONS      =      240
      MEAN                        =      611.1559
      STANDARD DEVIATION          =      61.85425

   TEST:
      STANDARD DEV. (NUMERATOR)   =      65.54909
      STANDARD DEV. (DENOMINATOR) =      61.85425
      F-TEST STATISTIC VALUE      =      1.123037
      DEG. OF FREEDOM (NUMER.)    =      239.0000
      DEG. OF FREEDOM (DENOM.)    =      239.0000
      F-TEST STATISTIC CDF VALUE  =      0.814808

      NULL                NULL HYPOTHESIS        NULL HYPOTHESIS
      HYPOTHESIS          ACCEPTANCE INTERVAL    CONCLUSION
      SIGMA1 = SIGMA2     (0.000,0.950)          ACCEPT

The F-test indicates that the standard deviations for the two batches are not significantly different at the 5% significance level.
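A corresponding test for equal standard deviations can be sketched as follows. SciPy has no ready-made two-sample F-test function, so the statistic is formed directly; the arrays batch1 and batch2 are the same synthetic stand-ins used in the t-test sketch above:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    batch1 = rng.normal(689.0, 65.5, 240)   # stand-ins for the real batch data
    batch2 = rng.normal(611.2, 61.9, 240)

    def f_test_equal_sd(x, y):
        """Two-sided F-test of SIGMA1 = SIGMA2 for two independent samples."""
        f = np.var(x, ddof=1) / np.var(y, ddof=1)   # ratio of sample variances
        dfx, dfy = len(x) - 1, len(y) - 1
        cdf = stats.f.cdf(f, dfx, dfy)              # analogous to the F-TEST STATISTIC CDF VALUE
        p_two_sided = 2 * min(cdf, 1 - cdf)
        return f, cdf, p_two_sided

    f, cdf, p = f_test_equal_sd(batch1, batch2)
    print(f"F = {f:.4f}, CDF = {cdf:.4f}, two-sided p = {p:.3g}")
    # For the ceramic data the output above reports F = 1.123 with CDF = 0.815,
    # so equality of the standard deviations is not rejected at the 5% level.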

Conclusions
We can draw the following conclusions from the above analysis.
1. There is in fact a significant batch effect. This batch effect is consistent across labs and primary factors.
2. The magnitude of the difference is on the order of 75 to 100 (with batch 2 being smaller than batch 1). The standard deviations do not appear to be significantly different.
3. There is some skewness in the batches.
This batch effect was completely unexpected by the scientific investigators in this study.
Note that although the quantitative techniques support the conclusions of unequal means and equal standard deviations, they do not show the more subtle features of the data such as the presence of outliers and the skewness of the batch 2 data.

1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.4. Analysis of the Lab Effect

Box Plot
The next matter is to determine if there is a lab effect. The first step is to generate a box plot for the ceramic strength based on the lab.

This box plot shows the following.
1. There is minor variation in the medians for the 8 labs.
2. The scales are relatively constant for the labs.
3. Two of the labs (3 and 5) have outliers on the low side.
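A grouped box plot of this kind can be sketched with matplotlib. This is an illustration only; the per-lab arrays below are synthetic stand-ins and would be replaced by the real strength readings grouped on the lab identifier:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    labs = range(1, 9)
    # Stand-in data: one array of strength values per lab.
    strength_by_lab = [rng.normal(650, 75, 60) for _ in labs]

    plt.boxplot(strength_by_lab, labels=[str(lab) for lab in labs])
    plt.xlabel("Lab")
    plt.ylabel("Ceramic strength")
    plt.title("Box plot of strength by lab")
    plt.show()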


Box Plot for Batch 1
Given that the previous section showed a distinct batch effect, the next step is to generate the box plots for the two batches separately.

This box plot shows the following.
1. Each of the labs has a median in the 650 to 700 range.
2. The variability is relatively constant across the labs.
3. Each of the labs has at least one outlier on the low side.

Box Plot for Batch 2
This box plot shows the following.
1. The medians are in the range 550 to 600.
2. There is a bit more variability, across the labs, for batch 2 compared to batch 1.
3. Six of the eight labs show outliers on the high side. Three of the labs show outliers on the low side.

Conclusions
We can draw the following conclusions about a possible lab effect from the above box plots.
1. The batch effect (of approximately 75 to 100 units) on location dominates any lab effects.
2. It is reasonable to treat the labs as homogeneous.


1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.5. Analysis of Primary Factors

Main effects
The first step in analyzing the primary factors is to determine which factors are the most significant. The dex scatter plot, dex mean plot, and the dex standard deviation plots will be the primary tools, with "dex" being short for "design of experiments".
Since the previous pages showed a significant batch effect but a minimal lab effect, we will generate separate plots for batch 1 and batch 2. However, the labs will be treated as equivalent.

Dex Scatter Plot for Batch 1
This dex scatter plot shows the following for batch 1.
1. Most of the points are between 500 and 800.
2. There are about a dozen or so points between 300 and 500.
3. Except for the outliers on the low side (i.e., the points between 300 and 500), the distribution of the points is comparable for the 3 primary factors in terms of location and spread.

Dex Mean Plot for Batch 1
This dex mean plot shows the following for batch 1.
1. The table speed factor (X1) is the most significant factor with an effect, the difference between the two points, of approximately 35 units.
2. The wheel grit factor (X3) is the next most significant factor with an effect of approximately 10 units.
3. The feed rate factor (X2) has minimal effect.
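The effects read off a dex mean plot are simply differences of level means. The sketch below is illustrative only; the data-frame layout with a response column Y and coded factor columns X1, X2, X3 is an assumption, and the stand-in data would be replaced by the actual batch 1 runs:

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    n = 240
    # Stand-in design: coded factors X1 = table speed, X2 = down feed rate, X3 = wheel grit.
    df = pd.DataFrame({f: rng.choice([-1, 1], n) for f in ["X1", "X2", "X3"]})
    df["Y"] = 700 - 15 * df["X1"] - 4 * df["X3"] + rng.normal(0, 60, n)

    # A dex mean plot displays the mean response at each level of each factor;
    # the estimated main effect is the difference between the two level means.
    for factor in ["X1", "X2", "X3"]:
        level_means = df.groupby(factor)["Y"].mean()
        effect = level_means.loc[1] - level_means.loc[-1]
        print(f"{factor}: mean at -1 = {level_means.loc[-1]:.1f}, "
              f"mean at +1 = {level_means.loc[1]:.1f}, effect = {effect:.1f}")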


Dex SD Plot for Batch 1
This dex standard deviation plot shows the following for batch 1.
1. The table speed factor (X1) has a significant difference in variability between the levels of the factor. The difference is approximately 20 units.
2. The wheel grit factor (X3) and the feed rate factor (X2) have minimal differences in variability.

Dex Scatter Plot for Batch 2
This dex scatter plot shows the following for batch 2.
1. Most of the points are between 450 and 750.
2. There are a few outliers on both the low side and the high side.
3. Except for the outliers (i.e., the points less than 450 or greater than 750), the distribution of the points is comparable for the 3 primary factors in terms of location and spread.

Dex Mean Plot for Batch 2
This dex mean plot shows the following for batch 2.
1. The feed rate (X2) and wheel grit (X3) factors have an approximately equal effect of about 15 or 20 units.
2. The table speed factor (X1) has a minimal effect.

Dex SD Plot for Batch 2


This dex standard deviation plot shows the following for batch 2.
1. The difference in the standard deviations is roughly comparable for the three factors (slightly less for the feed rate factor).

Interaction Effects
The above plots graphically show the main effects. An additional concern is whether or not there are any significant interaction effects.
Main effects and 2-term interaction effects are discussed in the chapter on Process Improvement.
In the following dex interaction plots, the labels on the plot give the variables and the estimated effect. For example, factor 1 is TABLE SPEED and it has an estimated effect of 30.77 (it is actually -30.77 if the direction is taken into account).

DEX Interaction Plot for Batch 1
The ranked list of factors for batch 1 is:
1. Table speed (X1) with an estimated effect of -30.77.
2. The interaction of table speed (X1) and wheel grit (X3) with an estimated effect of -20.25.
3. The interaction of table speed (X1) and feed rate (X2) with an estimated effect of 9.7.
4. Wheel grit (X3) with an estimated effect of -7.18.
5. Down feed (X2) and the down feed interaction with wheel grit (X3) are essentially zero.
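Main effects and two-factor interaction effects of this kind can be estimated directly from the coded factors by treating each interaction as the product of the two coded columns. The sketch below is an illustration under the same assumed data-frame layout as before (columns Y, X1, X2, X3 with coded levels -1 and +1); it is not the handbook's Dataplot macro:

    import itertools
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(3)
    n = 240
    df = pd.DataFrame({f: rng.choice([-1, 1], n) for f in ["X1", "X2", "X3"]})
    df["Y"] = 700 - 15 * df["X1"] - 10 * df["X1"] * df["X3"] + rng.normal(0, 60, n)

    factors = ["X1", "X2", "X3"]
    effects = {}

    # Main effects: mean response at +1 minus mean response at -1.
    for f in factors:
        effects[f] = df.loc[df[f] == 1, "Y"].mean() - df.loc[df[f] == -1, "Y"].mean()

    # Two-factor interaction effects: the same computation on the product column.
    for f1, f2 in itertools.combinations(factors, 2):
        prod = df[f1] * df[f2]
        effects[f"{f1}*{f2}"] = df.loc[prod == 1, "Y"].mean() - df.loc[prod == -1, "Y"].mean()

    # Ranked list of factors, largest absolute effect first.
    for name, eff in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
        print(f"{name:6s} estimated effect = {eff:7.2f}")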


DEX Interaction Plot for Batch 2
The ranked list of factors for batch 2 is:
1. Down feed (X2) with an estimated effect of 18.22.
2. The interaction of table speed (X1) and wheel grit (X3) with an estimated effect of -16.71.
3. Wheel grit (X3) with an estimated effect of -14.71.
4. Remaining main effect and 2-factor interaction effects are essentially zero.

Conclusions
From the above plots, we can draw the following overall conclusions.
1. The batch effect (of approximately 75 units) is the dominant primary factor.
2. The most important factors differ from batch to batch. See the above text for the ranked list of factors with the estimated effects.

1. Exploratory Data Analysis
1.4. EDA Case Studies
1.4.2. Case Studies
1.4.2.10. Ceramic Strength

1.4.2.10.6. Work This Example Yourself

View Dataplot Macro for this Case Study
This page allows you to use Dataplot to repeat the analysis outlined in the case study description on the previous page. It is required that you have already downloaded and installed Dataplot and configured your browser to run Dataplot. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output window, the Graphics window, the Command History window, and the data sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps and Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step. The links in the results column will connect you with more detailed information about each analysis step from the case study description.

1. Invoke Dataplot and read data.

2. Plot of the response variable

   1. Numerical summary of Y.
      1. The summary shows the mean strength is 650.08 and the standard deviation of the strength is 74.64.
   2. 4-plot of Y.
      2. The 4-plot shows no drift in the location and scale and a bimodal distribution.

3. Determine if there is a batch effect.

   1. Generate a bihistogram based on the 2 batches.
      1. The bihistogram shows a distinct batch effect of approximately 75 units.
   2. Generate a q-q plot.
      2. The q-q plot shows that batch 1 and batch 2 do not come from a common distribution.
   3. Generate a box plot.
      3. The box plot shows that there is a batch effect of approximately 75 to 100 units and there are some outliers.
   4. Generate block plots.
      4. The block plot shows that the batch effect is consistent across labs and levels of the primary factor.
   5. Perform a 2-sample t-test for equal means.
      5. The t-test confirms the batch effect with respect to the means.
   6. Perform an F-test for equal standard deviations.
      6. The F-test does not indicate any significant batch effect with respect to the standard deviations.

4. Determine if there is a lab effect.

   1. Generate a box plot for the labs with the 2 batches combined.
      1. The box plot does not show a significant lab effect.
   2. Generate a box plot for the labs for batch 1 only.
      2. The box plot does not show a significant lab effect for batch 1.
   3. Generate a box plot for the labs for batch 2 only.
      3. The box plot does not show a significant lab effect for batch 2.

5. Analysis of primary factors.

   1. Generate a dex scatter plot for batch 1.
      1. The dex scatter plot shows the range of the points and the presence of outliers.
   2. Generate a dex mean plot for batch 1.
      2. The dex mean plot shows that table speed is the most significant factor for batch 1.
   3. Generate a dex sd plot for batch 1.
      3. The dex sd plot shows that table speed has the most variability for batch 1.
   4. Generate a dex scatter plot for batch 2.
      4. The dex scatter plot shows the range of the points and the presence of outliers.
   5. Generate a dex mean plot for batch 2.
      5. The dex mean plot shows that feed rate and wheel grit are the most significant factors for batch 2.
   6. Generate a dex sd plot for batch 2.
      6. The dex sd plot shows that the variability is comparable for all 3 factors for batch 2.
   7. Generate a dex interaction effects matrix plot for batch 1.
      7. The dex interaction effects matrix plot provides a ranked list of factors with the estimated effects.
   8. Generate a dex interaction effects matrix plot for batch 2.
      8. The dex interaction effects matrix plot provides a ranked list of factors with the estimated effects.


1. Exploratory Data Analysis
1.4. EDA Case Studies

1.4.3. References For Chapter 1: Exploratory Data Analysis

Anscombe, Francis (1973), Graphs in Statistical Analysis, The American Statistician, pp. 195-199.

Anscombe, Francis and Tukey, J. W. (1963), The Examination and Analysis of Residuals, Technometrics, pp. 141-160.

Bloomfield, Peter (1976), Fourier Analysis of Time Series, John Wiley and Sons.

Box, G. E. P. and Cox, D. R. (1964), An Analysis of Transformations, Journal of the Royal Statistical Society, 211-243, discussion 244-252.

Box, G. E. P., Hunter, W. G., and Hunter, J. S. (1978), Statistics for Experimenters: An Introduction to Design, Data Analysis, and Model Building, John Wiley and Sons.

Box, G. E. P., and Jenkins, G. (1976), Time Series Analysis: Forecasting and Control, Holden-Day.

Bradley (1968), Distribution-Free Statistical Tests, Chapter 12.

Brown, M. B. and Forsythe, A. B. (1974), Journal of the American Statistical Association, 69, 364-367.

Chakravarti, Laha, and Roy (1967), Handbook of Methods of Applied Statistics, Volume I, John Wiley and Sons, pp. 392-394.

Chambers, John, William Cleveland, Beat Kleiner, and Paul Tukey (1983), Graphical Methods for Data Analysis, Wadsworth.

Cleveland, William (1985), Elements of Graphing Data, Wadsworth.

Cleveland, William and Marylyn McGill, Editors (1988), Dynamic Graphics for Statistics, Wadsworth.

Cleveland, William (1993), Visualizing Data, Hobart Press.

Devaney, Judy (1997), Equation Discovery Through Global Self-Referenced Geometric Intervals and Machine Learning, Ph.D. thesis, George Mason University, Fairfax, VA.

Draper and Smith (1981), Applied Regression Analysis, 2nd ed., John Wiley and Sons.

du Toit, Steyn, and Stumpf (1986), Graphical Exploratory Data Analysis, Springer-Verlag.

Evans, Hastings, and Peacock (2000), Statistical Distributions, 3rd ed., John Wiley and Sons.

Everitt, Brian (1978), Multivariate Techniques for Multivariate Data, North-Holland.

Efron and Gong (February 1983), A Leisurely Look at the Bootstrap, the Jackknife, and Cross Validation, The American Statistician.

Filliben, J. J. (February 1975), The Probability Plot Correlation Coefficient Test for Normality, Technometrics, pp. 111-117.

Gill, Lisa (April 1997), Summary Analysis: High Performance Ceramics Experiment to Characterize the Effect of Grinding Parameters on Sintered Reaction Bonded Silicon Nitride, Reaction Bonded Silicon Nitride, and Sintered Silicon Nitride, presented at the NIST - Ceramic Machining Consortium, 10th Program Review Meeting, April 10, 1997.

Granger and Hatanaka (1964), Spectral Analysis of Economic Time Series, Princeton University Press.

Grubbs, Frank (February 1969), Procedures for Detecting Outlying Observations in Samples, Technometrics, Vol. 11, No. 1, pp. 1-21.

Harris, Robert L. (1996), Information Graphics, Management Graphics.

Jenkins and Watts (1968), Spectral Analysis and Its Applications, Holden-Day.

Johnson, Kotz, and Balakrishnan (1994), Continuous Univariate Distributions, Volumes I and II, 2nd ed., John Wiley and Sons.

Johnson, Kotz, and Kemp (1992), Univariate Discrete Distributions, 2nd ed., John Wiley and Sons.

Kuo, Way and Pierson, Marcia Martens, Eds. (1993), Quality Through Engineering Design, specifically, the article Filliben, Cetinkunt, Yu, and Dommenz (1993), Exploratory Data Analysis Techniques as Applied to a High-Precision Turning Machine, Elsevier, New York, pp. 199-223.

Levene, H. (1960), In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, I. Olkin et al. eds., Stanford University Press, pp. 278-292.

McNeil, Donald (1977), Interactive Data Analysis, John Wiley and Sons.

Mosteller, Frederick and Tukey, John (1977), Data Analysis and Regression, Addison-Wesley.

Nelson, Wayne (1982), Applied Life Data Analysis, Addison-Wesley.

Nelson, Wayne and Doganaksoy, Necip (1992), A Computer Program POWNOR for Fitting the Power-Normal and -Lognormal Models to Life or Strength Data from Specimens of Various Sizes, NISTIR 4760, U.S. Department of Commerce, National Institute of Standards and Technology.

Neter, Wasserman, and Kutner (1990), Applied Linear Statistical Models, 3rd ed., Irwin.

The RAND Corporation (1955), A Million Random Digits with 100,000 Normal Deviates, Free Press.

Ryan, Thomas (1997), Modern Regression Methods, John Wiley.

Scott, David (1992), Multivariate Density Estimation: Theory, Practice, and Visualization, John Wiley and Sons.

Snedecor, George W. and Cochran, William G. (1989), Statistical Methods, Eighth Edition, Iowa State University Press.

Stefansky, W. (1972), Rejecting Outliers in Factorial Designs, Technometrics, Vol. 14, pp. 469-479.

Stephens, M. A. (1974), EDF Statistics for Goodness of Fit and Some Comparisons, Journal of the American Statistical Association, Vol. 69, pp. 730-737.

Stephens, M. A. (1976), Asymptotic Results for Goodness-of-Fit Statistics with Unknown Parameters, Annals of Statistics, Vol. 4, pp. 357-369.

Stephens, M. A. (1977), Goodness of Fit for the Extreme Value Distribution, Biometrika, Vol. 64, pp. 583-588.

Stephens, M. A. (1977), Goodness of Fit with Special Reference to Tests for Exponentiality, Technical Report No. 262, Department of Statistics, Stanford University, Stanford, CA.

Stephens, M. A. (1979), Tests of Fit for the Logistic Distribution Based on the Empirical Distribution Function, Biometrika, Vol. 66, pp. 591-595.

Tukey, John (1977), Exploratory Data Analysis, Addison-Wesley.

Tufte, Edward (1983), The Visual Display of Quantitative Information, Graphics Press.

Velleman, Paul and Hoaglin, David (1981), The ABC's of EDA: Applications, Basics, and Computing of Exploratory Data Analysis, Duxbury.

Wainer, Howard (1981), Visual Revelations, Copernicus.



2. Measurement Process Characterization


1. Characterization 2. Control
1. Issues 1. Issues
2. Check standards 2. Bias and long-term variability
3. Short-term variability

3. Calibration 4. Gauge R & R studies


1. Issues 1. Issues
2. Artifacts 2. Design
3. Designs 3. Data collection
4. Catalog of designs 4. Variability
5. Artifact control 5. Bias
6. Instruments 6. Uncertainty
7. Instrument control

5. Uncertainty analysis 6. Case Studies


1. Issues 1. Gauge study
2. Approach 2. Check standard
3. Type A evaluations 3. Type A uncertainty
4. Type B evaluations 4. Type B uncertainty
5. Propagation of error
6. Error budget
7. Expanded uncertainties
8. Uncorrected bias

Detailed table of contents

References for Chapter 2


3. Calibration [2.3.]
1. Issues in calibration [2.3.1.]
1. Reference base [2.3.1.1.]
2. Reference standards [2.3.1.2.]

2. Measurement Process Characterization - 2. What is artifact (single-point) calibration? [2.3.2.]


3. What are calibration designs? [2.3.3.]
Detailed Table of Contents 1. Elimination of special types of bias [2.3.3.1.]
1. Left-right (constant instrument) bias [2.3.3.1.1.]
2. Bias caused by instrument drift [2.3.3.1.2.]
1. Characterization [2.1.]
2. Solutions to calibration designs [2.3.3.2.]
1. What are the issues for characterization? [2.1.1.] 1. General matrix solutions to calibration designs [2.3.3.2.1.]
1. Purpose [2.1.1.1.] 3. Uncertainties of calibrated values [2.3.3.3.]
2. Reference base [2.1.1.2.] 1. Type A evaluations for calibration designs [2.3.3.3.1.]
3. Bias and Accuracy [2.1.1.3.] 2. Repeatability and level-2 standard deviations [2.3.3.3.2.]
4. Variability [2.1.1.4.]
3. Combination of repeatability and level-2 standard
2. What is a check standard? [2.1.2.] deviations [2.3.3.3.3.]
1. Assumptions [2.1.2.1.] 4. Calculation of standard deviations for 1,1,1,1 design [2.3.3.3.4.]
2. Data collection [2.1.2.2.] 5. Type B uncertainty [2.3.3.3.5.]
3. Analysis [2.1.2.3.] 6. Expanded uncertainties [2.3.3.3.6.]
4. Catalog of calibration designs [2.3.4.]
2. Statistical control of a measurement process [2.2.]
1. Mass weights [2.3.4.1.]
1. What are the issues in controlling the measurement process? [2.2.1.]
1. Design for 1,1,1 [2.3.4.1.1.]
2. How are bias and variability controlled? [2.2.2.]
2. Design for 1,1,1,1 [2.3.4.1.2.]
1. Shewhart control chart [2.2.2.1.]
3. Design for 1,1,1,1,1 [2.3.4.1.3.]
1. EWMA control chart [2.2.2.1.1.]
4. Design for 1,1,1,1,1,1 [2.3.4.1.4.]
2. Data collection [2.2.2.2.]
5. Design for 2,1,1,1 [2.3.4.1.5.]
3. Monitoring bias and long-term variability [2.2.2.3.]
6. Design for 2,2,1,1,1 [2.3.4.1.6.]
4. Remedial actions [2.2.2.4.]
7. Design for 2,2,2,1,1 [2.3.4.1.7.]
3. How is short-term variability controlled? [2.2.3.]
8. Design for 5,2,2,1,1,1 [2.3.4.1.8.]
1. Control chart for standard deviations [2.2.3.1.]
9. Design for 5,2,2,1,1,1,1 [2.3.4.1.9.]
2. Data collection [2.2.3.2.]
10. Design for 5,3,2,1,1,1 [2.3.4.1.10.]
3. Monitoring short-term precision [2.2.3.3.]
11. Design for 5,3,2,1,1,1,1 [2.3.4.1.11.]
4. Remedial actions [2.2.3.4.]




12. Design for 5,3,2,2,1,1,1 [2.3.4.1.12.] 5. Designs for angle blocks [2.3.4.5.]
13. Design for 5,4,4,3,2,2,1,1 [2.3.4.1.13.] 1. Design for 4 angle blocks [2.3.4.5.1.]
14. Design for 5,5,2,2,1,1,1,1 [2.3.4.1.14.] 2. Design for 5 angle blocks [2.3.4.5.2.]
15. Design for 5,5,3,2,1,1,1 [2.3.4.1.15.] 3. Design for 6 angle blocks [2.3.4.5.3.]
16. Design for 1,1,1,1,1,1,1,1 weights [2.3.4.1.16.] 6. Thermometers in a bath [2.3.4.6.]
17. Design for 3,2,1,1,1 weights [2.3.4.1.17.] 7. Humidity standards [2.3.4.7.]
18. Design for 10 and 20 pound weights [2.3.4.1.18.] 1. Drift-elimination design for 2 reference weights and 3
2. Drift-elimination designs for gage blocks [2.3.4.2.] cylinders [2.3.4.7.1.]
1. Doiron 3-6 Design [2.3.4.2.1.] 5. Control of artifact calibration [2.3.5.]
2. Doiron 3-9 Design [2.3.4.2.2.] 1. Control of precision [2.3.5.1.]
3. Doiron 4-8 Design [2.3.4.2.3.] 1. Example of control chart for precision [2.3.5.1.1.]
4. Doiron 4-12 Design [2.3.4.2.4.] 2. Control of bias and long-term variability [2.3.5.2.]
5. Doiron 5-10 Design [2.3.4.2.5.] 1. Example of Shewhart control chart for mass calibrations [2.3.5.2.1.]
6. Doiron 6-12 Design [2.3.4.2.6.] 2. Example of EWMA control chart for mass calibrations [2.3.5.2.2.]
7. Doiron 7-14 Design [2.3.4.2.7.] 6. Instrument calibration over a regime [2.3.6.]
8. Doiron 8-16 Design [2.3.4.2.8.] 1. Models for instrument calibration [2.3.6.1.]
9. Doiron 9-18 Design [2.3.4.2.9.] 2. Data collection [2.3.6.2.]
10. Doiron 10-20 Design [2.3.4.2.10.] 3. Assumptions for instrument calibration [2.3.6.3.]
11. Doiron 11-22 Design [2.3.4.2.11.] 4. What can go wrong with the calibration procedure [2.3.6.4.]
3. Designs for electrical quantities [2.3.4.3.] 1. Example of day-to-day changes in calibration [2.3.6.4.1.]
1. Left-right balanced design for 3 standard cells [2.3.4.3.1.] 5. Data analysis and model validation [2.3.6.5.]
2. Left-right balanced design for 4 standard cells [2.3.4.3.2.] 1. Data on load cell #32066 [2.3.6.5.1.]
3. Left-right balanced design for 5 standard cells [2.3.4.3.3.] 6. Calibration of future measurements [2.3.6.6.]
4. Left-right balanced design for 6 standard cells [2.3.4.3.4.] 7. Uncertainties of calibrated values [2.3.6.7.]
5. Left-right balanced design for 4 references and 4 test items [2.3.4.3.5.] 1. Uncertainty for quadratic calibration using propagation of
error [2.3.6.7.1.]
6. Design for 8 references and 8 test items [2.3.4.3.6.]
2. Uncertainty for linear calibration using check standards [2.3.6.7.2.]
7. Design for 4 reference zeners and 2 test zeners [2.3.4.3.7.]
3. Comparison of check standard analysis and propagation of
8. Design for 4 reference zeners and 3 test zeners [2.3.4.3.8.]
error [2.3.6.7.3.]
9. Design for 3 references and 1 test resistor [2.3.4.3.9.]
7. Instrument control for linear calibration [2.3.7.]
10. Design for 4 references and 1 test resistor [2.3.4.3.10.]
1. Control chart for a linear calibration line [2.3.7.1.]
4. Roundness measurements [2.3.4.4.]
1. Single trace roundness design [2.3.4.4.1.] 4. Gauge R & R studies [2.4.]
2. Multiple trace roundness designs [2.3.4.4.2.] 1. What are the important issues? [2.4.1.]




2. Design considerations [2.4.2.] 4. Type B evaluations [2.5.4.]


3. Data collection for time-related sources of variability [2.4.3.] 1. Standard deviations from assumed distributions [2.5.4.1.]
1. Simple design [2.4.3.1.] 5. Propagation of error considerations [2.5.5.]
2. 2-level nested design [2.4.3.2.] 1. Formulas for functions of one variable [2.5.5.1.]
3. 3-level nested design [2.4.3.3.] 2. Formulas for functions of two variables [2.5.5.2.]
4. Analysis of variability [2.4.4.] 3. Propagation of error for many variables [2.5.5.3.]
1. Analysis of repeatability [2.4.4.1.] 6. Uncertainty budgets and sensitivity coefficients [2.5.6.]
2. Analysis of reproducibility [2.4.4.2.] 1. Sensitivity coefficients for measurements on the test item [2.5.6.1.]
3. Analysis of stability [2.4.4.3.] 2. Sensitivity coefficients for measurements on a check standard [2.5.6.2.]
1. Example of calculations [2.4.4.4.4.] 3. Sensitivity coefficients for measurements from a 2-level design [2.5.6.3.]
5. Analysis of bias [2.4.5.] 4. Sensitivity coefficients for measurements from a 3-level design [2.5.6.4.]
1. Resolution [2.4.5.1.] 5. Example of uncertainty budget [2.5.6.5.]
2. Linearity of the gauge [2.4.5.2.] 7. Standard and expanded uncertainties [2.5.7.]
3. Drift [2.4.5.3.] 1. Degrees of freedom [2.5.7.1.]
4. Differences among gauges [2.4.5.4.] 8. Treatment of uncorrected bias [2.5.8.]
5. Geometry/configuration differences [2.4.5.5.] 1. Computation of revised uncertainty [2.5.8.1.]
6. Remedial actions and strategies [2.4.5.6.]
6. Quantifying uncertainties from a gauge study [2.4.6.] 6. Case studies [2.6.]
1. Gauge study of resistivity probes [2.6.1.]
5. Uncertainty analysis [2.5.] 1. Background and data [2.6.1.1.]
1. Issues [2.5.1.] 1. Database of resistivity measurements [2.6.1.1.1.]
2. Approach [2.5.2.] 2. Analysis and interpretation [2.6.1.2.]
1. Steps [2.5.2.1.] 3. Repeatability standard deviations [2.6.1.3.]
3. Type A evaluations [2.5.3.] 4. Effects of days and long-term stability [2.6.1.4.]
1. Type A evaluations of random components [2.5.3.1.] 5. Differences among 5 probes [2.6.1.5.]
1. Type A evaluations of time-dependent effects [2.5.3.1.1.] 6. Run gauge study example using Dataplot™ [2.6.1.6.]
2. Measurement configuration within the laboratory [2.5.3.1.2.] 7. Dataplot™ macros [2.6.1.7.]
2. Material inhomogeneity [2.5.3.2.] 2. Check standard for resistivity measurements [2.6.2.]
1. Data collection and analysis [2.5.3.2.1.] 1. Background and data [2.6.2.1.]
3. Type A evaluations of bias [2.5.3.3.] 1. Database for resistivity check standard [2.6.2.1.1.]
1. Inconsistent bias [2.5.3.3.1.] 2. Analysis and interpretation [2.6.2.2.]
2. Consistent bias [2.5.3.3.2.] 1. Repeatability and level-2 standard deviations [2.6.2.2.1.]
3. Bias with sparse data [2.5.3.3.3.] 3. Control chart for probe precision [2.6.2.3.]




4. Control chart for bias and long-term variability [2.6.2.4.]


5. Run check standard example yourself [2.6.2.5.]
6. Dataplot™ macros [2.6.2.6.]
3. Evaluation of type A uncertainty [2.6.3.]
1. Background and data [2.6.3.1.]
1. Database of resistivity measurements [2.6.3.1.1.] 2. Measurement Process Characterization
2. Measurements on wiring configurations [2.6.3.1.2.]
2. Analysis and interpretation [2.6.3.2.] 2.1. Characterization
1. Difference between 2 wiring configurations [2.6.3.2.1.]
3. Run the type A uncertainty analysis using Dataplot™ [2.6.3.3.] The primary goal of this section is to lay the groundwork for
understanding the measurement process in terms of the errors that affect
4. Dataplot™ macros [2.6.3.4.] the process.
4. Evaluation of type B uncertainty and propagation of error [2.6.4.]
What are the issues for characterization?

7. References [2.7.] 1. Purpose


2. Reference base
3. Bias and Accuracy
4. Variability

What is a check standard?


1. Assumptions
2. Data collection
3. Analysis


2. Measurement Process Characterization
2.1. Characterization

2.1.1. What are the issues for characterization?

'Goodness' of measurements
A measurement process can be thought of as a well-run production process in which measurements are the output. The 'goodness' of measurements is the issue, and goodness is characterized in terms of the errors that affect the measurements.

Bias, variability and uncertainty
The goodness of measurements is quantified in terms of
● Bias
● Short-term variability or instrument precision
● Day-to-day or long-term variability
● Uncertainty

Requires ongoing statistical control program
The continuation of goodness is guaranteed by a statistical control program that controls both
● Short-term variability or instrument precision
● Long-term variability which controls bias and day-to-day variability of the process

Scope is limited to ongoing processes
The techniques in this chapter are intended primarily for ongoing processes. One-time tests and special tests or destructive tests are difficult to characterize. Examples of ongoing processes are:
● Calibration where similar test items are measured on a regular basis
● Certification where materials are characterized on a regular basis
● Production where the metrology (tool) errors may be significant
● Special studies where data can be collected over the life of the study

Application to production processes
The material in this chapter is pertinent to the study of production processes for which the size of the metrology (tool) error may be an important consideration. More specific guidance on assessing metrology errors can be found in the section on gauge studies.



2. Measurement Process Characterization
2.1. Characterization
2.1.1. What are the issues for characterization?

2.1.1.1. Purpose

Purpose is to understand and quantify the effect of error on reported values
The purpose of characterization is to develop an understanding of the sources of error in the measurement process and how they affect specific measurement results. This section provides the background for:
● identifying sources of error in the measurement process
● understanding and quantifying errors in the measurement process
● codifying the effects of these errors on a specific reported value in a statement of uncertainty

Important concepts
Characterization relies upon the understanding of certain underlying concepts of measurement systems; namely,
● reference base (authority) for the measurement
● bias
● variability
● check standard

Reported value is a generic term that identifies the result that is transmitted to the customer
The reported value is the measurement result for a particular test item. It can be:
● a single measurement
● an average of several measurements
● a least-squares prediction from a model
● a combination of several measurement results that are related by a physical model

2.1.1.2. Reference base

Ultimate authority
The most critical element of any measurement process is the relationship between a single measurement and the reference base for the unit of measurement. The reference base is the ultimate source of authority for the measurement unit.

For fundamental units
Reference bases for fundamental units of measurement (length, mass, temperature, voltage, and time) and some derived units (such as pressure, force, flow rate, etc.) are maintained by national and regional standards laboratories. Consensus values from interlaboratory tests or instrumentation/standards as maintained in specific environments may serve as reference bases for other units of measurement.

For comparison purposes
A reference base, for comparison purposes, may be based on an agreement among participating laboratories or organizations and derived from
● measurements made with a standard test method
● measurements derived from an interlaboratory test



2. Measurement Process Characterization
2.1. Characterization
2.1.1. What are the issues for characterization?

2.1.1.3. Bias and Accuracy

Definition of Accuracy and Bias
Accuracy is a qualitative term referring to whether there is agreement between a measurement made on an object and its true (target or reference) value. Bias is a quantitative term describing the difference between the average of measurements made on the same object and its true value. In particular, for a measurement laboratory, bias is the difference (generally unknown) between a laboratory's average value (over time) for a test item and the average that would be achieved by the reference laboratory if it undertook the same measurements on the same test item.

Depiction of bias and unbiased measurements
Unbiased measurements relative to the target
Biased measurements relative to the target

Identification of bias
Bias in a measurement process can be identified by:
1. Calibration of standards and/or instruments by a reference laboratory, where a value is assigned to the client's standard based on comparisons with the reference laboratory's standards.
2. Check standards, where violations of the control limits on a control chart for the check standard suggest that re-calibration of standards or instruments is needed.
3. Measurement assurance programs, where artifacts from a reference laboratory or other qualified agency are sent to a client and measured in the client's environment as a 'blind' sample.
4. Interlaboratory comparisons, where reference standards or materials are circulated among several laboratories.

Reduction of bias
Bias can be eliminated or reduced by calibration of standards and/or instruments. Because of costs and time constraints, the majority of calibrations are performed by secondary or tertiary laboratories and are related to the reference base via a chain of intercomparisons that start at the reference laboratory.
Bias can also be reduced by corrections to in-house measurements based on comparisons with artifacts or instruments circulated for that purpose (reference materials).

Caution
Errors that contribute to bias can be present even where all equipment and standards are properly calibrated and under control. Temperature probably has the most potential for introducing this type of bias into the measurements. For example, a constant heat source will introduce serious errors in dimensional measurements of metal objects. Temperature affects chemical and electrical measurements as well.
Generally speaking, errors of this type can be identified only by those who are thoroughly familiar with the measurement technology. The reader is advised to consult the technical literature and experts in the field for guidance.




Short-term Short-term errors affect the precision of the instrument. Even very precise instruments
variability exhibit small changes caused by random errors. It is useful to think in terms of
measurements performed with a single instrument over minutes or hours; this is to be
2. Measurement Process Characterization understood, normally, as the time that it takes to complete a measurement sequence.
2.1. Characterization
2.1.1. What are the issues for characterization? Terminology Four terms are in common usage to describe short-term phenomena. They are
interchangeable.
1. precision
2.1.1.4. Variability 2. repeatability
3. within-time variability
Sources of Variability is the tendency of the measurement process to produce slightly different
time-dependent measurements on the same test item, where conditions of measurement are either stable 4. short-term variability
variability or vary over time, temperature, operators, etc. In this chapter we consider two sources of
time-dependent variability: Precision is
The measure of precision is a standard deviation. Good precision implies a small standard
● Short-term variability ascribed to the precision of the instrument
quantified by a
deviation. This standard deviation is called the short-term standard deviation of the
standard
● Long-term variability related to changes in environment and handling techniques process or the repeatability standard deviation.
deviation

Depiction of Caution -- With very precise instrumentation, it is not unusual to find that the variability exhibited
two Process 1 Process 2 long-term by the measurement process from day-to-day often exceeds the precision of the
measurement Large between-day variability Small between-day variability variability may instrument because of small changes in environmental conditions and handling
processes with be dominant techniques which cannot be controlled or corrected in the measurement process. The
the same measurement process is not completely characterized until this source of variability is
short-term quantified.
variability over
six days where
Terminology Three terms are in common usage to describe long-term phenomena. They are
process 1 has
interchangeable.
large
between-day 1. day-to-day variability
variability and 2. long-term variability
process 2 has 3. reproducibility
negligible
between-day Caution -- The term 'reproducibility' is given very specific definitions in some national and
variability regarding term international standards. However, the definitions are not always in agreement. Therefore,
'reproducibility' it is used here only in a generic sense to indicate variability across days.

Definitions in We adopt precise definitions and provide data collection and analysis techniques in the
this Handbook sections on check standards and measurement control for estimating:
● Level-1 standard deviation for short-term variability
● Level-2 standard deviation for day-to-day variability

In the section on gauge studies, the concept of variability is extended to include very
long-term measurement variability:
● Level-1 standard deviation for short-term variability

● Level-2 standard deviation for day-to-day variability


Distributions of short-term measurements over 6 days where ● Level-3 standard deviation for very long-term variability
distances from the centerlines illustrate between-day variability We refer to the standard deviations associated with these three kinds of uncertainty as




"Level 1, 2, and 3 standard deviations", respectively.

Long-term The measure of long-term variability is the standard deviation of measurements taken
variability is over several days, weeks or months.
quantified by a
standard The simplest method for doing this assessment is by analysis of a check standard 2. Measurement Process Characterization
deviation database. The measurements on the check standards are structured to cover a long time 2.1. Characterization
interval and to capture all sources of variation in the measurement process.

2.1.2. What is a check standard?


A check Check standard methodology is a tool for collecting data on the
standard is measurement process to expose errors that afflict the process over
useful for time. Time-dependent sources of error are evaluated and quantified
gathering from the database of check standard measurements. It is a device for
data on the controlling the bias and long-term variability of the process once a
process baseline for these quantities has been established from historical data
on the check standard.

Think in The check standard should be thought of in terms of a database of


terms of data measurements. It can be defined as an artifact or as a characteristic of
the measurement process whose value can be replicated from
A check measurements taken over the life of the process. Examples are:
standard can
● measurements on a stable artifact
be an artifact
or defined ● differences between values of two reference standards as
quantity estimated from a calibration experiment
● values of a process characteristic, such as a bias term, which is
estimated from measurements on reference standards and/or test
items.
An artifact check standard must be close in material content and
geometry to the test items that are measured in the workload. If
possible, it should be one of the test items from the workload.
Obviously, it should be a stable artifact and should be available to the
measurement process at all times.

Solves the Measurement processes are similar to production processes in that they
difficulty of are continual and are expected to produce identical results (within
sampling the acceptable limits) over time, instruments, operators, and environmental
process conditions. However, it is difficult to sample the output of the
measurement process because, normally, test items change with each
measurement sequence.




Surrogate for Measurements on the check standard, spaced over time at regular
unseen intervals, act as surrogates for measurements that could be made on
measurements test items if sufficient time and resources were available.
2. Measurement Process Characterization
2.1. Characterization
2.1.2. What is a check standard?

2.1.2.1. Assumptions
Case study: Before applying the quality control procedures recommended in
this chapter to check standard data, basic assumptions should be
Resistivity check examined. The basic assumptions underlying the quality control
standard procedures are:
1. The data come from a single statistical distribution.
2. The distribution is a normal distribution.
3. The errors are uncorrelated over time.

An easy method for checking the assumption of a single normal


distribution is to construct a histogram of the check standard data.
The histogram should follow a bell-shaped pattern with a single
hump. Types of anomalies that indicate a problem with the
measurement system are:
1. a double hump indicating that errors are being drawn from
two or more distributions;
2. long tails indicating outliers in the process;
3. flat pattern or one with humps at either end indicating that
the measurement process is not in control or not properly
specified.

Another graphical method for testing the normality assumption is a


probability plot. The points are expected to fall approximately on a
straight line if the data come from a normal distribution. Outliers,
or data from other distributions, will produce an S-shaped curve.
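These checks are easy to script. The following is a minimal sketch (an illustration only, not part of the handbook's procedures); the file name check_standard.dat is a hypothetical stand-in for the check standard database, and the third panel anticipates the lag plot check described in the next paragraph:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    # Check standard values in time order (hypothetical file name).
    y = np.loadtxt("check_standard.dat")

    fig, axes = plt.subplots(1, 3, figsize=(12, 4))

    # Histogram: look for a single, roughly bell-shaped hump.
    axes[0].hist(y, bins=20)
    axes[0].set_title("Histogram")

    # Normal probability plot: points should fall close to a straight line.
    stats.probplot(y, dist="norm", plot=axes[1])
    axes[1].set_title("Normal probability plot")

    # Lag plot: visible structure suggests correlation between successive measurements.
    axes[2].plot(y[:-1], y[1:], "o")
    axes[2].set_xlabel("Y[i]")
    axes[2].set_ylabel("Y[i+1]")
    axes[2].set_title("Lag plot")

    plt.tight_layout()
    plt.show()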




A graphical method for testing for correlation among


measurements is a time-lag plot. Correlation will frequently not be
a problem if measurements are properly structured over time.
Correlation problems generally occur when measurements are 2. Measurement Process Characterization
taken so close together in time that the instrument cannot properly 2.1. Characterization
recover from one measurement to the next. Correlations over time 2.1.2. What is a check standard?
are usually present but are often negligible.
2.1.2.2. Data collection
Schedule for A schedule for making check standard measurements over time (once a day, twice a
making week, or whatever is appropriate for sampling all conditions of measurement) should
measurements be set up and adhered to. The check standard measurements should be structured in
the same way as values reported on the test items. For example, if the reported values
are averages of two repetitions made within 5 minutes of each other, the check
standard values should be averages of the two measurements made in the same
manner.

Exception One exception to this rule is that there should be at least J = 2 repetitions per day.
Without this redundancy, there is no way to check on the short-term precision of the
measurement system.

Depiction of
schedule for
making check
standard
measurements
with four
repetitions
per day over
K days on the
surface of a
silicon wafer
with the
repetitions
randomized
at various K days - 4 repetitions
positions on
the wafer 2-level design for measurement process




Case study: The values for the check standard should be recorded along with pertinent
Resistivity environmental readings and identifications for all other significant factors. The best
check way to record this information is in one file with one line or row (on a spreadsheet)
standard for of information in fixed fields for each check standard measurement. A list of typical
2. Measurement Process Characterization
measurements entries follows.
2.1. Characterization
on silicon 1. Identification for check standard 2.1.2. What is a check standard?
wafers 2. Date
3. Identification for the measurement design (if applicable)
4. Identification for the instrument
2.1.2.3. Analysis
5. Check standard value
Short-term An analysis of the check standard data is the basis for quantifying
6. Short-term standard deviation from J repetitions
or level-1 random errors in the measurement process -- particularly
7. Degrees of freedom standard time-dependent errors.
8. Operator identification deviations
from J Given that we have a database of check standard measurements as
9. Environmental readings (if pertinent)
repetitions described in data collection where

represents the jth repetition on the kth day, the mean for the kth day is

and the short-term (level-1) standard deviation with v = J - 1 degrees of


freedom is




Drawback An individual short-term standard deviation will not be a reliable


of estimate of precision if the degrees of freedom is less than ten, but the
short-term individual estimates can be pooled over the K days to obtain a more
standard reliable estimate. The pooled level-1 standard deviation estimate with v
deviations = K(J - 1) degrees of freedom is

.
This standard deviation can be interpreted as quantifying the basic
precision of the instrumentation used in the measurement process.

Process The level-2 standard deviation of the check standard is appropriate for
(level-2) representing the process variability. It is computed with v = K - 1
standard degrees of freedom as:
deviation

where

is the grand mean of the KJ check standard measurements.

Use in The check standard data and standard deviations that are described in
quality this section are used for controlling two aspects of a measurement
control process:
1. Control of short-term variability
2. Control of bias and long-term variability
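For concreteness, the level-1 and level-2 computations described above reduce to a few lines of code. This is an illustrative sketch only: the K-by-J array of check standard values is a synthetic stand-in, the level-1 (repeatability) standard deviation pools the within-day variances over the K days with K(J - 1) degrees of freedom, and the level-2 standard deviation is the standard deviation of the K daily means with K - 1 degrees of freedom:

    import numpy as np

    rng = np.random.default_rng(4)
    K, J = 25, 6      # K days, J repetitions per day (stand-in sizes)
    # Rows are days, columns are repetitions; replace with the real check standard data.
    y = rng.normal(100.0, 0.05, size=(K, J)) + rng.normal(0, 0.03, size=(K, 1))

    day_means = y.mean(axis=1)

    # Pooled level-1 (repeatability) standard deviation, K(J-1) degrees of freedom.
    within_day_var = y.var(axis=1, ddof=1)
    s1 = np.sqrt(within_day_var.mean())

    # Level-2 standard deviation of the daily means about the grand mean, K-1 degrees of freedom.
    s2 = day_means.std(ddof=1)

    print(f"level-1 (repeatability) sd = {s1:.4f}  with {K*(J-1)} degrees of freedom")
    print(f"level-2 (day-to-day)    sd = {s2:.4f}  with {K-1} degrees of freedom")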

Case study: For an example, see the case study for resistivity where several check
Resistivity standards were measured J = 6 times per day over several days.
check
standard



2.2. Statistical control of a measurement process 2.2.1. What are the issues in controlling the measurement process?

2. Measurement Process Characterization

2.2. Statistical control of a measurement process

The purpose of this section is to outline the steps that can be taken to exercise statistical control over the measurement process and demonstrate the validity of the uncertainty statement. Measurement processes can change both with respect to bias and variability. A change in instrument precision may be readily noted as measurements are being recorded, but changes in bias or long-term variability are difficult to catch when the process is looking at a multitude of artifacts over time.

What are the issues for control of a measurement process?
1. Purpose
2. Assumptions
3. Role of the check standard

How are bias and long-term variability controlled?
1. Shewhart control chart
2. Exponentially weighted moving average control chart
3. Data collection and analysis
4. Control procedure
5. Remedial actions & strategies

How is short-term variability controlled?
1. Control chart for standard deviations
2. Data collection and analysis
3. Control procedure
4. Remedial actions and strategies

2. Measurement Process Characterization
2.2. Statistical control of a measurement process

2.2.1. What are the issues in controlling the measurement process?

Purpose is to guarantee the 'goodness' of measurement results
The purpose of statistical control is to guarantee the 'goodness' of measurement results within predictable limits and to validate the statement of uncertainty of the measurement result.
Statistical control methods can be used to test the measurement process for change with respect to bias and variability from its historical levels. However, if the measurement process is improperly specified or calibrated, then the control procedures can only guarantee comparability among measurements.

Assumption of normality is not stringent
The assumptions that relate to measurement processes apply to statistical control; namely, that the errors of measurement are uncorrelated over time and come from a population with a single distribution. The tests for control depend on the assumption that the underlying distribution is normal (Gaussian), but the test procedures are robust to slight departures from normality. Practically speaking, all that is required is that the distribution of measurements be bell-shaped and symmetric.

Check standard is mechanism for controlling the process
Measurements on a check standard provide the mechanism for controlling the measurement process.
Measurements on the check standard should produce identical results except for the effect of random errors, and tests for control are basically tests of whether or not the random errors from the process continue to be drawn from the same statistical distribution as the historical data on the check standard.
Changes that can be monitored and tested with the check standard database are:
1. Changes in bias and long-term variability
2. Changes in instrument precision or short-term variability


2. Measurement Process Characterization


2.2. Statistical control of a measurement process

2.2.2. How are bias and variability controlled?


Bias and variability are controlled by monitoring measurements on a check standard over time
Bias and long-term variability are controlled by monitoring measurements on a check standard over time. A change in the measurement on the check standard that persists at a constant level over several measurement sequences indicates possible:
1. Change or damage to the reference standards
2. Change or damage to the check standard artifact
3. Procedural change that vitiates the assumptions of the measurement process
A change in the variability of the measurements on the check standard can be due to one of many causes such as:
1. Loss of environmental controls
2. Change in handling techniques
3. Severe degradation in instrumentation.

The control procedure monitors the progress of measurements on the check


standard over time and signals when a significant change occurs. There are
two control chart procedures that are suitable for this purpose.

Shewhart The Shewhart control chart has the advantage of being intuitive and easy to
Chart is easy implement. It is characterized by a center line and symmetric upper and
to implement lower control limits. The chart is good for detecting large changes but not
for quickly detecting small changes (of the order of one-half to one standard
deviation) in the process.


Depiction of Shewhart control chart
In the simplistic illustration of a Shewhart control chart shown below, the measurements are within the control limits with the exception of one measurement which exceeds the upper control limit.

EWMA Chart is better for detecting small changes
The EWMA control chart (exponentially weighted moving average) is more difficult to implement but should be considered if the goal is quick detection of small changes. The decision process for the EWMA chart is based on an exponentially decreasing (over time) function of prior measurements on the check standard while the decision process for the Shewhart chart is based on the current measurement only.

Example of EWMA Chart
In the EWMA control chart below, the red dots represent the measurements. Control is exercised via the exponentially weighted moving average (shown as the curved line) which, in this case, is approaching its upper control limit.

Artifacts for process control must be stable and available
Case study: Resistivity
The check standard artifacts for controlling the bias or long-term variability of the process must be of the same type and geometry as items that are measured in the workload. The artifacts must be stable and available to the measurement process on a continuing basis. Usually, one artifact is sufficient. It can be:
1. An individual item drawn at random from the workload
2. A specific item reserved by the laboratory for the purpose.

Topics covered in this section
The topics covered in this section include:
1. Shewhart control chart methodology
2. EWMA control chart methodology
3. Data collection & analysis
4. Monitoring
5. Remedies and strategies for dealing with out-of-control signals.


2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.2. How are bias and variability controlled?

2.2.2.1. Shewhart control chart

Example of Shewhart control chart for mass calibrations
The Shewhart control chart has a baseline and upper and lower limits, shown as dashed lines, that are symmetric about the baseline. Measurements are plotted on the chart versus a time line. Measurements that are outside the limits are considered to be out of control.

Baseline is the average from historical data
The baseline for the control chart is the accepted value, an average of the historical check standard values. A minimum of 100 check standard values is required to establish an accepted value.

Caution - control limits are computed from the process standard deviation, not from rational subsets
The upper (UCL) and lower (LCL) control limits are:
UCL = Accepted value + k*process standard deviation
LCL = Accepted value - k*process standard deviation
where the process standard deviation is the standard deviation computed from the check standard database.

Individual measurements cannot be assessed using the standard deviation from short-term repetitions
This procedure is an individual observations control chart. The previously described control charts depended on rational subsets, which use the standard deviations computed from the rational subsets to calculate the control limits. For a measurement process, the subgroups would consist of short-term repetitions which can characterize the precision of the instrument but not the long-term variability of the process. In measurement science, the interest is in assessing individual measurements (or averages of short-term repetitions). Thus, the standard deviation over time is the appropriate measure of variability.

Choice of k depends on number of measurements we are willing to reject
To achieve tight control of the measurement process, set
k = 2
in which case approximately 5% of the measurements from a process that is in control will produce out-of-control signals. This assumes that there is a sufficiently large number of degrees of freedom (>100) for estimating the process standard deviation.
To flag only those measurements that are egregiously out of control, set
k = 3
in which case approximately 1% of the measurements from an in-control process will produce out-of-control signals.
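A minimal Python/numpy sketch of these limits is shown below; the historical and new check standard values are hypothetical, and in practice the accepted value would be based on at least 100 historical values.

    import numpy as np

    # Hypothetical check standard history used to set the baseline and limits.
    history = np.array([10.0012, 10.0008, 10.0011, 10.0009, 10.0010, 10.0013])
    accepted_value = history.mean()
    process_sd = history.std(ddof=1)

    k = 3                      # use k = 2 for tighter control
    ucl = accepted_value + k * process_sd
    lcl = accepted_value - k * process_sd

    # Flag new check standard measurements that fall outside the limits.
    new_measurements = np.array([10.0011, 10.0035, 10.0007])
    out_of_control = (new_measurements > ucl) | (new_measurements < lcl)
    for y, flag in zip(new_measurements, out_of_control):
        print(f"{y:.4f}  {'OUT OF CONTROL' if flag else 'in control'}")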


2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.2. How are bias and variability controlled?
2.2.2.1. Shewhart control chart

2.2.2.1.1. EWMA control chart

Small changes only become obvious over time
Because it takes time for the patterns in the data to emerge, a permanent shift in the process may not immediately cause individual violations of the control limits on a Shewhart control chart. The Shewhart control chart is not powerful for detecting small changes, say of the order of one-half to one standard deviation. The EWMA (exponentially weighted moving average) control chart is better suited to this purpose.

Example of EWMA control chart for mass calibrations
The exponentially weighted moving average (EWMA) is a statistic for monitoring the process that averages the data in a way that gives less and less weight to data as they are further removed in time from the current measurement. The data
Y1, Y2, ... , Yt
are the check standard measurements ordered in time. The EWMA statistic at time t is computed recursively from individual data points, with the first EWMA statistic, EWMA1, being the arithmetic average of historical data.

Control mechanism for EWMA
The EWMA control chart can be made sensitive to small changes or a gradual drift in the process by the choice of the weighting factor, \(\lambda\). A weighting factor of 0.2 - 0.3 is usually suggested for this purpose (Hunter), and 0.15 is also a popular choice.

Limits for the control chart
The target or center line for the control chart is the average of historical data. The upper (UCL) and lower (LCL) limits are

\[ UCL = \mathit{target} + k \, s \sqrt{\frac{\lambda}{2-\lambda}} \qquad\qquad LCL = \mathit{target} - k \, s \sqrt{\frac{\lambda}{2-\lambda}} \]

where s times the radical expression is a good approximation to the standard deviation of the EWMA statistic and the factor k is chosen in the same way as for the Shewhart control chart -- generally to be 2 or 3.

Procedure for implementing the EWMA control chart
The implementation of the EWMA control chart is the same as for any other type of control procedure. The procedure is built on the assumption that the "good" historical data are representative of the in-control process, with future data from the same process tested for agreement with the historical data. To start the procedure, a target (average) and process standard deviation are computed from historical check standard data. Then the procedure enters the monitoring stage with the EWMA statistics computed and tested against the control limits. The EWMA statistics are weighted averages, and thus their standard deviations are smaller than the standard deviations of the raw data and the corresponding control limits are narrower than the control limits for the Shewhart individual observations chart.
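A minimal Python sketch of the recursion and the limits just described follows; the data, the weighting factor of 0.2, and k = 3 are illustrative assumptions.

    import numpy as np

    lam = 0.2   # weighting factor (0.2 - 0.3 is typical)
    k = 3       # multiplier for the control limits

    # Hypothetical historical check standard data and new measurements.
    history = np.array([10.0012, 10.0008, 10.0011, 10.0009, 10.0010, 10.0013])
    new_data = np.array([10.0011, 10.0016, 10.0019, 10.0021])

    target = history.mean()
    s = history.std(ddof=1)

    # s*sqrt(lam/(2-lam)) approximates the standard deviation of the EWMA
    # statistic, so the limits are narrower than Shewhart limits.
    half_width = k * s * np.sqrt(lam / (2.0 - lam))
    ucl, lcl = target + half_width, target - half_width

    # EWMA recursion, started at the average of the historical data.
    ewma = target
    for t, y in enumerate(new_data, start=1):
        ewma = lam * y + (1.0 - lam) * ewma
        status = "OUT OF CONTROL" if (ewma > ucl or ewma < lcl) else "in control"
        print(f"t={t}  y={y:.4f}  EWMA={ewma:.5f}  {status}")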


2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.2. How are bias and variability controlled?

2.2.2.2. Data collection

Measurements should cover a sufficiently long time period to cover all environmental conditions
A schedule should be set up for making measurements on the artifact (check standard) chosen for control purposes. The measurements are structured to sample all environmental conditions in the laboratory and all other sources of influence on the measurement result, such as operators and instruments.
For high-precision processes where the uncertainty of the result must be guaranteed, a measurement on the check standard should be included with every measurement sequence, if possible, and at least once a day.
For each occasion, J measurements are made on the check standard. If there is no interest in controlling the short-term variability or precision of the instrument, then one measurement is sufficient. However, a dual purpose is served by making two or three measurements that track both the bias and the short-term variability of the process with the same database.

Depiction of check standard measurements with J = 4 repetitions per day on the surface of a silicon wafer over K days where the repetitions are randomized over position on the wafer
[Figure: 2-level design for measurements on a check standard -- K days, 4 repetitions per day]

Notation
For J measurements on each of K days, the measurements are denoted by

\[ Y_{kj} \qquad (k = 1, \ldots, K; \;\; j = 1, \ldots, J) \; . \]

The check standard value is defined as an average of short-term repetitions
The check standard value for the kth day is

\[ \bar{Y}_{k\,\cdot} = \frac{1}{J}\sum_{j=1}^{J} Y_{kj} \; . \]

Accepted value of check standard
The accepted value, or baseline for the control chart, is

\[ \bar{\bar{Y}} = \frac{1}{K}\sum_{k=1}^{K} \bar{Y}_{k\,\cdot} \; . \]

Process standard deviation
The process standard deviation is

\[ s_2 = \sqrt{\frac{1}{K-1}\sum_{k=1}^{K}\left(\bar{Y}_{k\,\cdot} - \bar{\bar{Y}}\right)^2} \; . \]

Caution
Check standard measurements should be structured in the same way as values reported on the test items. For example, if the reported values are averages of two measurements made within 5 minutes of each other, the check standard values should be averages of the two measurements made in the same manner.

Database
Case study: Resistivity
Averages and short-term standard deviations computed from J repetitions should be recorded in a file along with identifications for all significant factors. The best way to record this information is to use one file with one line (row in a spreadsheet) of information in fixed fields for each group. A list of typical entries follows:
1. Month
2. Day
3. Year
4. Check standard identification
5. Identification for the measurement design (if applicable)
6. Instrument identification
7. Check standard value
8. Repeatability (short-term) standard deviation from J repetitions
9. Degrees of freedom
10. Operator identification
11. Environmental readings (if pertinent)
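One possible way to record a group as a single row of fixed fields is sketched below in Python; the identifiers and readings are hypothetical placeholders, and only the list of fields above is taken from the text.

    import csv
    from statistics import mean, stdev

    # Hypothetical repetitions for one group (one day) on the check standard.
    reps = [0.5012, 0.5009, 0.5011, 0.5013]          # J = 4 repetitions
    record = {
        "month": 11, "day": 13, "year": 2003,
        "check_std_id": "CS-137-5",                  # illustrative identifier
        "design_id": "NA",
        "instrument_id": "PROBE-2",                  # illustrative identifier
        "check_std_value": round(mean(reps), 6),     # field 7: daily average
        "repeatability_sd": round(stdev(reps), 6),   # field 8: J - 1 = 3 df
        "df": len(reps) - 1,
        "operator_id": "JMS",
        "temperature_C": 22.9,
    }

    # Append the group as one line (row) of the check standard database.
    with open("check_standard_db.csv", "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=record.keys())
        if f.tell() == 0:
            writer.writeheader()
        writer.writerow(record)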


2. Measurement Process Characterization


2.2. Statistical control of a measurement process
2.2.2. How are bias and variability controlled?

2.2.2.3. Monitoring bias and long-term variability


Monitoring Once the baseline and control limits for the control chart have been determined from historical data,
stage and any bad observations removed and the control limits recomputed, the measurement process enters
the monitoring stage. A Shewhart control chart and EWMA control chart for monitoring a mass
calibration process are shown below. For the purpose of comparing the two techniques, the two
control charts are based on the same data where the baseline and control limits are computed from the
data taken prior to 1985. The monitoring stage begins at the start of 1985. Similarly, the control limits
for both charts are 3-standard deviation limits. The check standard data and analysis are explained
more fully in another section.

Shewhart
control chart
of
measurements
of kilogram
check
standard
showing
outliers and a
shift in the
process that
occurred after
1985


EWMA chart In the EWMA control chart below, the control data after 1985 are shown in green, and the EWMA
for statistics are shown as black dots superimposed on the raw data. The EWMA statistics, and not the
measurements raw data, are of interest in looking for out-of-control signals. Because the EWMA statistic is a
on kilogram weighted average, it has a smaller standard deviation than a single control measurement, and,
therefore, the EWMA control limits are narrower than the limits for the Shewhart control chart shown
check
above.
standard
showing
multiple
violations of
the control
limits for the
EWMA
statistics

Measurements The control strategy is based on the predictability of future measurements from historical data. Each
that exceed new check standard measurement is plotted on the control chart in real time. These values are
the control expected to fall within the control limits if the process has not changed. Measurements that exceed the
limits require control limits are probably out-of-control and require remedial action. Possible causes of
action out-of-control signals need to be understood when developing strategies for dealing with outliers.

Signs of The control chart should be viewed in its entirety on a regular basis to identify drift or shift in the
significant process. In the Shewhart control chart shown above, only a few points exceed the control limits. The
trends or small, but significant, shift in the process that occurred after 1985 can only be identified by examining
shifts the plot of control measurements over time. A re-analysis of the kilogram check standard data shows
that the control limits for the Shewhart control chart should be updated based on the data after
1985. In the EWMA control chart, multiple violations of the control limits occur after 1986. In the
calibration environment, the incidence of several violations should alert the control engineer that a
shift in the process has occurred, possibly because of damage or change in the value of a reference
standard, and the process requires review.


2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.2. How are bias and variability controlled?

2.2.2.4. Remedial actions

Consider possible causes for out-of-control signals and take corrective long-term actions
There are many possible causes of out-of-control signals.
A. Causes that do not warrant corrective action for the process (but which do require that the current measurement be discarded) are:
1. Chance failure where the process is actually in-control
2. Glitch in setting up or operating the measurement process
3. Error in recording of data
B. Changes in bias can be due to:
1. Damage to artifacts
2. Degradation in artifacts (wear or build-up of dirt and mineral deposits)
C. Changes in long-term variability can be due to:
1. Degradation in the instrumentation
2. Changes in environmental conditions
3. Effect of a new or inexperienced operator

4-step strategy for short-term
An immediate strategy for dealing with out-of-control signals associated with high precision measurement processes should be pursued as follows:

Repeat measurements
1. Repeat the measurement sequence to establish whether or not the out-of-control signal was simply a chance occurrence, glitch, or whether it flagged a permanent change or trend in the process.

Discard measurements on test items
2. With high precision processes, for which a check standard is measured along with the test items, new values should be assigned to the test items based on new measurement data.

Check for drift
3. Examine the patterns of recent data. If the process is gradually drifting out of control because of degradation in instrumentation or artifacts, then:
❍ Instruments may need to be repaired
❍ Reference artifacts may need to be recalibrated.

Reevaluate
4. Reestablish the process value and control limits from more recent data if the measurement process cannot be brought back into control.


2. Measurement Process Characterization
2.2. Statistical control of a measurement process

2.2.3. How is short-term variability controlled?

Emphasis on instruments
Short-term variability or instrument precision is controlled by monitoring standard deviations from repeated measurements on the instrument(s) of interest. The database can come from measurements on a single artifact or a representative set of artifacts.

Artifacts - Case study: Resistivity
The artifacts must be of the same type and geometry as items that are measured in the workload, such as:
1. Items from the workload
2. A single check standard chosen for this purpose
3. A collection of artifacts set aside for this specific purpose

Concepts covered in this section
The concepts that are covered in this section include:
1. Control chart methodology for standard deviations
2. Data collection and analysis
3. Monitoring
4. Remedies and strategies for dealing with out-of-control signals

2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.3. How is short-term variability controlled?

2.2.3.1. Control chart for standard deviations

Degradation of instrument or anomalous behavior on one occasion
Changes in the precision of the instrument, particularly anomalies and degradation, must be addressed. Changes in precision can be detected by a statistical control procedure based on the F-distribution where the short-term standard deviations are plotted on the control chart.
The base line for this type of control chart is the pooled standard deviation, s1, as defined in Data collection and analysis.

Example of control chart for a mass balance
Only the upper control limit, UCL, is of interest for detecting degradation in the instrument. As long as the short-term standard deviations fall within the upper control limit established from historical data, there is reason for confidence that the precision of the instrument has not degraded (i.e., common cause variations).

The control limit is based on the F-distribution
The control limit is

\[ UCL = s_1 \sqrt{F_{\alpha}(J-1, \; K(J-1))} \]

where the quantity under the radical is the upper critical value from the F-table with degrees of freedom (J - 1) and K(J - 1). The numerator degrees of freedom, v1 = (J - 1), refers to the standard deviation computed from the current measurements, and the denominator degrees of freedom, v2 = K(J - 1), refers to the pooled standard deviation of the historical data. The probability \(\alpha\) is chosen to be small, say 0.05.
The justification for this control limit, as opposed to the more conventional standard deviation control limit, is that we are essentially performing the following hypothesis test:
H0: \(\sigma_1 = \sigma_2\)
Ha: \(\sigma_2 > \sigma_1\)


where \(\sigma_1\) is the population value for the s1 defined above and \(\sigma_2\) is the population value for the standard deviation of the current values being tested. Generally, s1 is based on sufficient historical data that it is reasonable to make the assumption that \(\sigma_1\) is a "known" value.
The upper control limit above is then derived based on the standard F-test for equal standard deviations. Justification and details of this derivation are given in Cameron and Hailes (1974).

Run software macro for computing the F factor
Dataplot can compute the value of the F-statistic. For the case where alpha = 0.05; J = 6; K = 6, the commands

    let alpha = 0.05
    let alphau = 1 - alpha
    let j = 6
    let k = 6
    let v1 = j-1
    let v2 = k*(v1)
    let F = fppf(alphau, v1, v2)

return the following value:
THE COMPUTED VALUE OF THE CONSTANT F = 0.2533555E+01

2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.3. How is short-term variability controlled?

2.2.3.2. Data collection

Case study: Resistivity
A schedule should be set up for making measurements with a single instrument (once a day, twice a week, or whatever is appropriate for sampling all conditions of measurement).

Short-term standard deviations
The measurements are denoted

\[ Y_{kj} \qquad (j = 1, \ldots, J; \;\; k = 1, \ldots, K) \]

where there are J measurements on each of K occasions. The average for the kth occasion is:

\[ \bar{Y}_{k\,\cdot} = \frac{1}{J}\sum_{j=1}^{J} Y_{kj} \; . \]

The short-term (repeatability) standard deviation for the kth occasion is:

\[ s_k = \sqrt{\frac{1}{J-1}\sum_{j=1}^{J}\left(Y_{kj} - \bar{Y}_{k\,\cdot}\right)^2} \]

with (J-1) degrees of freedom.
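A Python/scipy sketch that ties these per-occasion standard deviations to the F-based control limit of the previous subsection is shown below; the simulated data, J = 6, K = 6, and the new standard deviation are hypothetical.

    import numpy as np
    from scipy.stats import f

    # y[k, j] = jth measurement on occasion k (hypothetical data)
    rng = np.random.default_rng(0)
    y = 0.06 * rng.standard_normal((6, 6)) + 10.0
    K, J = y.shape
    alpha = 0.05

    # Per-occasion (repeatability) standard deviations pooled into s1.
    s_k = y.std(axis=1, ddof=1)
    s1 = np.sqrt(np.mean(s_k**2))

    # Upper control limit from the F-distribution with (J-1) and K(J-1) df.
    ucl = s1 * np.sqrt(f.ppf(1 - alpha, J - 1, K * (J - 1)))
    print(f"s1 = {s1:.4f}, UCL = {ucl:.4f}")

    # A new short-term standard deviation is in control if it stays below the
    # UCL.  (If the new value is based on J' != J measurements, use J' - 1 as
    # the numerator degrees of freedom instead; see the monitoring section.)
    s_new = 0.11
    print("in control" if s_new <= ucl else "OUT OF CONTROL")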


Pooled standard deviation
The repeatability standard deviations are pooled over the K occasions to obtain an estimate with K(J - 1) degrees of freedom of the level-1 standard deviation

\[ s_1 = \sqrt{\frac{1}{K}\sum_{k=1}^{K} s_k^2} \; . \]

Note: The same notation is used for the repeatability standard deviation whether it is based on one set of measurements or pooled over several sets.

Database
The individual short-term standard deviations along with identifications for all significant factors are recorded in a file. The best way to record this information is by using one file with one line (row in a spreadsheet) of information in fixed fields for each group. A list of typical entries follows.
1. Identification of test item or check standard
2. Date
3. Short-term standard deviation
4. Degrees of freedom
5. Instrument
6. Operator

2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.3. How is short-term variability controlled?

2.2.3.3. Monitoring short-term precision

Monitoring future precision
Once the base line and control limit for the control chart have been determined from historical data, the measurement process enters the monitoring stage. In the control chart shown below, the control limit is based on the data taken prior to 1985.

Each new standard deviation is monitored on the control chart
Each new short-term standard deviation based on J measurements is plotted on the control chart; points that exceed the control limits probably indicate lack of statistical control. Drift over time indicates degradation of the instrument. Points out of control require remedial action, and possible causes of out of control signals need to be understood when developing strategies for dealing with outliers.

Control chart for precision for a mass balance from historical standard deviations for the balance with 3 degrees of freedom each. The control chart identifies two outliers and slight degradation over time in the precision of the balance
[Figure: control chart of short-term standard deviations; x-axis: time in years]

Monitoring where the number of measurements are different from J


There is no requirement that future standard deviations be based on J, the number of measurements in the historical database. However, a change in the number of measurements leads to a change in the test for control, and it may not be convenient to draw a control chart where the control limits are changing with each new measurement sequence.
For a new standard deviation based on J' measurements, the precision of the instrument is in control if

\[ s_{J'} \le s_1 \sqrt{F_{\alpha}(J'-1, \; K(J-1))} \; . \]

Notice that the numerator degrees of freedom, v1 = J' - 1, changes but the denominator degrees of freedom, v2 = K(J - 1), remains the same.

2. Measurement Process Characterization
2.2. Statistical control of a measurement process
2.2.3. How is short-term variability controlled?

2.2.3.4. Remedial actions

Examine possible causes
A. Causes that do not warrant corrective action (but which do require that the current measurement be discarded) are:
1. Chance failure where the precision is actually in control
2. Glitch in setting up or operating the measurement process
3. Error in recording of data
B. Changes in instrument performance can be due to:
1. Degradation in electronics or mechanical components
2. Changes in environmental conditions
3. Effect of a new or inexperienced operator

Repeat Repeat the measurement sequence to establish whether or not the


measurements out-of-control signal was simply a chance occurrence, glitch, or
whether it flagged a permanent change or trend in the process.

Assign new With high precision processes, for which the uncertainty must be
value to test guaranteed, new values should be assigned to the test items based on
item new measurement data.

Check for Examine the patterns of recent standard deviations. If the process is
degradation gradually drifting out of control because of degradation in
instrumentation or artifacts, instruments may need to be repaired or
replaced.


2. Measurement Process Characterization

2.3. Calibration

The purpose of this section is to outline the procedures for calibrating artifacts and instruments while guaranteeing the 'goodness' of the calibration results. Calibration is a measurement process that assigns values to the property of an artifact or to the response of an instrument relative to reference standards or to a designated measurement process. The purpose of calibration is to eliminate or reduce bias in the user's measurement system relative to the reference base. The calibration procedure compares an "unknown" or test item(s) or instrument with reference standards according to a specific algorithm.

What are the issues for calibration?
1. Artifact or instrument calibration
2. Reference base
3. Reference standard(s)

What is artifact (single-point) calibration?
1. Purpose
2. Assumptions
3. Bias
4. Calibration model

What are calibration designs?
1. Purpose
2. Assumptions
3. Properties of designs
4. Restraint
5. Check standard in a design
6. Special types of bias (left-right effect & linear drift)
7. Solutions to calibration designs
8. Uncertainty of calibrated values

Catalog of calibration designs
1. Mass weights
2. Gage blocks
3. Electrical standards - saturated standard cells, zeners, resistors
4. Roundness standards
5. Angle blocks
6. Indexing tables
7. Humidity cylinders

Control of artifact calibration
1. Control of the precision of the calibrating instrument
2. Control of bias and long-term variability

What is instrument calibration over a regime?
1. Models for instrument calibration
2. Data collection
3. Assumptions
4. What can go wrong with the calibration procedure?
5. Data analysis and model validation
6. Calibration of future measurements
7. Uncertainties of calibrated values
   1. From propagation of error for a quadratic calibration
   2. From check standard measurements for a linear calibration
   3. Comparison of check standard technique and propagation of error

Control of instrument calibration
1. Control chart for linear calibration
2. Critical values of t* statistic


2. Measurement Process Characterization
2.3. Calibration

2.3.1. Issues in calibration

Calibration reduces bias
Calibration is a measurement process that assigns values to the property of an artifact or to the response of an instrument relative to reference standards or to a designated measurement process. The purpose of calibration is to eliminate or reduce bias in the user's measurement system relative to the reference base.

Artifact & instrument calibration
The calibration procedure compares an "unknown" or test item(s) or instrument with reference standards according to a specific algorithm. Two general types of calibration are considered in this Handbook:
● artifact calibration at a single point
● instrument calibration over a regime

Types of calibration not discussed
The procedures in this Handbook are appropriate for calibrations at secondary or lower levels of the traceability chain where reference standards for the unit already exist. Calibration from first principles of physics and reciprocity calibration are not discussed.

2. Measurement Process Characterization
2.3. Calibration
2.3.1. Issues in calibration

2.3.1.1. Reference base

Ultimate authority
The most critical element of any measurement process is the relationship between a single measurement and the reference base for the unit of measurement. The reference base is the ultimate source of authority for the measurement unit.

Base and derived units of measurement
The base units of measurement in the Le Systeme International d'Unites (SI) are (Taylor):
● kilogram - mass
● meter - length
● second - time
● ampere - electric current
● kelvin - thermodynamic temperature
● mole - amount of substance
● candela - luminous intensity
These units are maintained by the Bureau International des Poids et Mesures in Paris. Local reference bases for these units and SI derived units such as:
● pascal - pressure
● newton - force
● hertz - frequency
● ohm - resistance
● degrees Celsius - Celsius temperature, etc.
are maintained by national and regional standards laboratories.

Other sources
Consensus values from interlaboratory tests or instrumentation/standards as maintained in specific environments may serve as reference bases for other units of measurement.


2. Measurement Process Characterization


2.3. Calibration
2.3.1. Issues in calibration

2.3.1.2. Reference standards


Primary A reference standard for a unit of measurement is an artifact that
reference embodies the quantity of interest in a way that ties its value to the
standards reference base.
At the highest level, a primary reference standard is assigned a value by
direct comparison with the reference base. Mass is the only unit of
measurement that is defined by an artifact. The kilogram is defined as
the mass of a platinum-iridium kilogram that is maintained by the
Bureau International des Poids et Mesures in Sevres, France.
Primary reference standards for other units come from realizations of
the units embodied in artifact standards. For example, the reference base
for length is the meter which is defined as the length of the path travelled by light
in vacuum during a time interval of 1/299,792,458 of a second.

Secondary Secondary reference standards are calibrated by comparing with primary


reference standards using a high precision comparator and making appropriate
standards corrections for non-ideal conditions of measurement.
Secondary reference standards for mass are stainless steel kilograms,
which are calibrated by comparing with a primary standard on a high
precision balance and correcting for the buoyancy of air. In turn these
weights become the reference standards for assigning values to test
weights.
Secondary reference standards for length are gage blocks, which are
calibrated by comparing with primary gage block standards on a
mechanical comparator and correcting for temperature. In turn, these
gage blocks become the reference standards for assigning values to test
sets of gage blocks.


2. Measurement Process Characterization
2.3. Calibration

2.3.2. What is artifact (single-point) calibration?

Purpose
Artifact calibration is a measurement process that assigns values to the property of an artifact relative to a reference standard(s). The purpose of calibration is to eliminate or reduce bias in the user's measurement system relative to the reference base.
The calibration procedure compares an "unknown" or test item(s) with a reference standard(s) of the same nominal value (hence, the term single-point calibration) according to a specific algorithm called a calibration design.

Assumptions
The calibration procedure is based on the assumption that individual readings on test items and reference standards are subject to:
● Bias that is a function of the measuring system or instrument
● Random error that may be uncontrollable

What is bias?
The operational definition of bias is that it is the difference between values that would be assigned to an artifact by the client laboratory and the laboratory maintaining the reference standards. Values, in this sense, are understood to be the long-term averages that would be achieved in both laboratories.

Calibration model for eliminating bias requires a reference standard that is very close in value to the test item
One approach to eliminating bias is to select a reference standard that is almost identical to the test item; measure the two artifacts with a comparator type of instrument; and take the difference of the two measurements to cancel the bias. The only requirement on the instrument is that it be linear over the small range needed for the two artifacts.
The test item has value X*, as yet to be assigned, and the reference standard has an assigned value R*. Given a measurement, X, on the test item and a measurement, R, on the reference standard, the difference between the test item and the reference is estimated by

\[ D = X - R \, , \]

and the value of the test item is reported as

\[ Test = X^* = R^* + D \; . \]

Need for redundancy leads to calibration designs
A deficiency in relying on a single difference to estimate D is that there is no way of assessing the effect of random errors. The obvious solution is to:
● Repeat the calibration measurements J times
● Average the results
● Compute a standard deviation from the J results
Schedules of redundant intercomparisons involving measurements on several reference standards and test items in a connected sequence are called calibration designs and are discussed in later sections.
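A minimal Python sketch of the single-point procedure with J repeated comparisons follows; the assigned value R* and the readings are hypothetical, and the reported value follows the difference model described above.

    import numpy as np

    R_star = 10.000123   # assigned value of the reference standard (hypothetical)

    # J repeated comparator readings on the test item and the reference standard.
    X = np.array([10.00321, 10.00318, 10.00325, 10.00322])   # test item readings
    R = np.array([10.00015, 10.00012, 10.00018, 10.00014])   # reference readings

    D = X - R                      # differences; instrument bias cancels
    D_bar = D.mean()               # averaged estimate of the difference
    s_D = D.std(ddof=1)            # standard deviation from the J results

    test_value = R_star + D_bar    # reported value of the test item
    print(f"reported value = {test_value:.6f}")
    print(f"standard deviation of differences = {s_D:.6f} (df = {len(D)-1})")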


2. Measurement Process Characterization
2.3. Calibration

2.3.3. What are calibration designs?

Calibration designs are redundant schemes for intercomparing reference standards and test items
Calibration designs are redundant schemes for intercomparing reference standards and test items in such a way that the values can be assigned to the test items based on known values of reference standards. Artifacts that traditionally have been calibrated using calibration designs are:
● mass weights
● resistors
● voltage standards
● length standards
● angle blocks
● indexing tables
● liquid-in-glass thermometers, etc.

Outline of section
The topics covered in this section are:
● Designs for elimination of left-right bias and linear drift
● Solutions to calibration designs
● Uncertainties of calibrated values
A catalog of calibration designs is provided in the next section.

Assumptions for calibration designs include demands on the quality of the artifacts
The assumptions that are necessary for working with calibration designs are that:
● Random errors associated with the measurements are independent.
● All measurements come from a distribution with the same standard deviation.
● Reference standards and test items respond to the measuring environment in the same manner.
● Handling procedures are consistent from item to item.
● Reference standards and test items are stable during the time of measurement.
● Bias is canceled by taking the difference between measurements on the test item and the reference standard.

Important concept - Restraint
The restraint is the known value of the reference standard or, for designs with two or more reference standards, the restraint is the summation of the values of the reference standards.

Requirements & properties of designs
Basic requirements are:
● The differences must be nominally zero.
● The design must be solvable for individual items given the restraint.
It is possible to construct designs which do not have these properties. This will happen, for example, if reference standards are only compared among themselves and test items are only compared among themselves without any intercomparisons.

Practical considerations determine a 'good' design
We do not apply 'optimality' criteria in constructing calibration designs because the construction of a 'good' design depends on many factors, such as convenience in manipulating the test items, time, expense, and the maximum load of the instrument.
● The number of measurements should be small.
● The degrees of freedom should be greater than three.
● The standard deviations of the estimates for the test items should be small enough for their intended purpose.


Check standard in a design
Designs listed in this Handbook have provision for a check standard in each series of measurements. The check standard is usually an artifact, of the same nominal size, type, and quality as the items to be calibrated. Check standards are used for:
● Controlling the calibration process
● Quantifying the uncertainty of calibrated results

Estimates that can be computed from a design
Calibration designs are solved by a restrained least-squares technique (Zelen) which gives the following estimates:
● Values for individual reference standards
● Values for individual test items
● Value for the check standard
● Repeatability standard deviation and degrees of freedom
● Standard deviations associated with values for reference standards and test items

2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?

2.3.3.1. Elimination of special types of bias

Assumptions which may be violated
Two of the usual assumptions relating to calibration measurements are not always valid and result in biases. These assumptions are:
● Bias is canceled by taking the difference between the measurement on the test item and the measurement on the reference standard
● Reference standards and test items remain stable throughout the measurement sequence

Ideal situation
In the ideal situation, bias is eliminated by taking the difference between a measurement X on the test item and a measurement R on the reference standard. However, there are situations where the ideal is not satisfied:
● Left-right (or constant instrument) bias
● Bias caused by instrument drift


2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.1. Elimination of special types of bias

2.3.3.1.1. Left-right (constant instrument) bias

Left-right bias which is not eliminated by differencing
A situation can exist in which a bias, P, which is constant and independent of the direction of measurement, is introduced by the measurement instrument itself. This type of bias, which has been observed in measurements of standard voltage cells (Eicke & Cameron) and is not eliminated by reversing the direction of the current, is shown in the following equations.

\[ Y_1 = P + (X - R) + error_1 \]
\[ Y_2 = P + (R - X) + error_2 \]

Elimination of left-right bias requires two measurements in reverse direction
The difference between the test and the reference can be estimated without bias only by taking the difference between the two measurements shown above where P cancels in the differencing so that

\[ D = \frac{Y_1 - Y_2}{2} \; . \]

The value of the test item depends on the known value of the reference standard, R*
The test item, X, can then be estimated without bias by

\[ X^* = \frac{Y_1 - Y_2}{2} + R^* \]

and P can be estimated by

\[ \hat{P} = \frac{Y_1 + Y_2}{2} \; . \]

Calibration designs that are left-right balanced
This type of scheme is called left-right balanced and the principle is extended to create a catalog of left-right balanced designs for intercomparing reference standards among themselves. These designs are appropriate ONLY for comparing reference standards in the same environment, or enclosure, and are not appropriate for comparing, say, across standard voltage cells in two boxes.
1. Left-right balanced design for a group of 3 artifacts
2. Left-right balanced design for a group of 4 artifacts
3. Left-right balanced design for a group of 5 artifacts
4. Left-right balanced design for a group of 6 artifacts


2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.1. Elimination of special types of bias

2.3.3.1.2. Bias caused by instrument drift

Bias caused by linear drift over the time of measurement
The requirement that reference standards and test items be stable during the time of measurement cannot always be met because of changes in temperature caused by body heat, handling, etc.

Representation of linear drift
Linear drift for an even number of measurements is represented by
..., -5d, -3d, -1d, +1d, +3d, +5d, ...
and for an odd number of measurements by
..., -3d, -2d, -1d, 0d, +1d, +2d, +3d, ... .

Assumptions for drift elimination
The effect can be mitigated by a drift-elimination scheme (Cameron/Hailes) which assumes:
● Linear drift over time
● Equally spaced measurements in time

Example of drift-elimination scheme
An example is given by substitution weighing where scale deflections on a balance are observed for X, a test weight, and R, a reference weight.

Estimates of drift-free difference and size of drift
The drift-free difference between the test and the reference is estimated by

and the size of the drift is estimated by

Calibration designs for eliminating linear drift
This principle is extended to create a catalog of drift-elimination designs for multiple reference standards and test items. These designs are listed under calibration designs for gauge blocks because they have traditionally been used to counteract the effect of temperature build-up in the comparator during calibration.

2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?

2.3.3.2. Solutions to calibration designs

Solutions for designs listed in the catalog
Solutions for all designs that are cataloged in this Handbook are included with the designs. Solutions for other designs can be computed from the instructions on the following page given some familiarity with matrices.

Measurements for the 1,1,1 design
The use of the tables shown in the catalog are illustrated for three artifacts; namely, a reference standard with known value R* and a check standard and a test item with unknown values. All artifacts are of the same nominal size. The design is referred to as a 1,1,1 design for
● n = 3 difference measurements
● m = 3 artifacts

Convention for showing the measurement sequence and identifying the reference and check standards
The convention for showing the measurement sequence is shown below. Nominal values are underlined in the first line showing that this design is appropriate for comparing three items of the same nominal size such as three one-kilogram weights. The reference standard is the first artifact, the check standard is the second, and the test item is the third.

                     1    1    1
    Y(1)     =       +    -
    Y(2)     =       +         -
    Y(3)     =            +    -

    Restraint        +

    Check standard        +

Limitation of this design
This design has degrees of freedom
v = n - m + 1 = 1

Convention for showing least-squares estimates for individual items
The table shown below lists the coefficients for finding the estimates for the individual items. The estimates are computed by taking the cross-product of the appropriate column for the item of interest with the column of measurement data and dividing by the divisor shown at the top of the table.

    SOLUTION MATRIX
    DIVISOR = 3

    OBSERVATIONS    1    1    1

    Y(1)            0   -2   -1
    Y(2)            0   -1   -2
    Y(3)            0    1   -1
    R*              3    3    3

Solutions for individual items from the table above
For example, the solution for the reference standard is shown under the first column; for the check standard under the second column; and for the test item under the third column. Notice that the estimate for the reference standard is guaranteed to be R*, regardless of the measurement results, because of the restraint that is imposed on the design. The estimates are as follows:

\[ \widehat{Reference} = R^* \]
\[ \widehat{Check} = R^* + \frac{-2Y(1) - Y(2) + Y(3)}{3} \]
\[ \widehat{Test} = R^* + \frac{-Y(1) - 2Y(2) - Y(3)}{3} \]
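The cross-product rule can be carried out in a few lines; the following Python sketch applies the solution matrix above to hypothetical difference measurements and a hypothetical restraint value.

    import numpy as np

    # Solution matrix for the 1,1,1 design (divisor 3); columns correspond to
    # (reference standard, check standard, test item), rows to Y(1), Y(2), Y(3).
    coeff = np.array([[0, -2, -1],
                      [0, -1, -2],
                      [0,  1, -1]])
    divisor = 3.0

    R_star = 1.0000000                                   # hypothetical restraint value
    Y = np.array([-0.0000130, 0.0000215, 0.0000340])     # hypothetical differences

    # Cross-multiply each column with Y, divide by the divisor, and add R*
    # (items of equal nominal size, so the nominal ratio is 1).
    estimates = coeff.T @ Y / divisor + R_star
    reference, check, test = estimates
    print(f"reference = {reference:.7f}")   # equals R* because of the restraint
    print(f"check std = {check:.7f}")
    print(f"test item = {test:.7f}")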


Convention for showing standard deviations for individual items and combinations of items
The standard deviations are computed from two tables of factors as shown below. The standard deviations for combinations of items include appropriate covariance terms.

    FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
    WT   FACTOR
         K1        1    1    1
    1    0.0000    +
    1    0.8165         +
    1    0.8165              +
    2    1.4142         +    +
    1    0.8165         +

    FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
    WT   FACTOR
         K2        1    1    1
    1    0.0000    +
    1    1.4142         +
    1    1.4142              +
    2    2.4495         +    +
    1    1.4142         +

Unifying equation
The standard deviation for each item is computed using the unifying equation:

\[ s_{item} = \sqrt{K_1^2 \, s_1^2 + K_2^2 \, s_{days}^2} \; . \]

Standard deviations for 1,1,1 design from the tables of factors
For the 1,1,1 design, the standard deviations are obtained by inserting the factors from the tables above into the unifying equation; for example, for the test item,

\[ s_{test} = \sqrt{(0.8165)^2 \, s_1^2 + (1.4142)^2 \, s_{days}^2} \; . \]

Process standard deviations must be known from historical data
In order to apply these equations, we need an estimate of the standard deviation, sdays, that describes day-to-day changes in the measurement process. This standard deviation is in turn derived from the level-2 standard deviation, s2, for the check standard. This standard deviation is estimated from historical data on the check standard; it can be negligible, in which case the calculations are simplified.
The repeatability standard deviation s1, is estimated from historical data, usually from data of several designs.

Steps in computing standard deviations
The steps in computing the standard deviation for a test item are:
● Compute the repeatability standard deviation from the design or historical data.
● Compute the standard deviation of the check standard from historical data.
● Locate the factors, K1 and K2 for the check standard; for the 1,1,1 design the factors are 0.8165 and 1.4142, respectively, where the check standard entries are last in the tables.
● Apply the unifying equation to the check standard to estimate the standard deviation for days. Notice that the standard deviation of the check standard is the same as the level-2 standard deviation, s2, that is referred to on some pages. The equation for the between-days standard deviation from the unifying equation is

  \[ s_{days} = \frac{1}{K_2}\sqrt{s_2^2 - K_1^2 \, s_1^2} \; . \]

  Thus, for the example above

  \[ s_{days} = \sqrt{\frac{1}{2} s_2^2 - \frac{1}{3} s_1^2} \; . \]

● This is the number that is entered into the NIST mass calibration software as the between-time standard deviation. If you are using this software, this is the only computation that you need to make because the standard deviations for the test items are computed automatically by the software.
● If the computation under the radical sign gives a negative number, set sdays = 0. (This is possible and indicates that there is no contribution to uncertainty from day-to-day effects.)
● For completeness, the computations of the standard deviations for the test item and for the sum of the test and the check standard using the appropriate factors are shown below.
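A minimal Python sketch of this chain of computations is shown below; it is an illustration, not the NIST mass calibration software, and the values of s1 and s2 are hypothetical. The factors are those of the 1,1,1 design.

    import numpy as np

    # Hypothetical historical estimates for a 1,1,1 design.
    s1 = 0.020   # pooled repeatability (level-1) standard deviation
    s2 = 0.034   # check standard (level-2) standard deviation

    # Factors for the check standard in the 1,1,1 design (last table entries).
    K1, K2 = 0.8165, 1.4142

    # Between-days standard deviation from the unifying equation; a negative
    # value under the radical is set to zero.
    radicand = (s2**2 - (K1 * s1)**2) / K2**2
    s_days = np.sqrt(max(radicand, 0.0))

    # Standard deviation for the test item, using its factors from the tables.
    K1_test, K2_test = 0.8165, 1.4142
    s_test = np.sqrt((K1_test * s1)**2 + (K2_test * s_days)**2)

    print(f"s_days = {s_days:.6f}")
    print(f"s_test = {s_test:.6f}")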


2. Measurement Process Characterization


2.3. Calibration
2.3.3. Calibration designs
2.3.3.2. General solutions to calibration designs

2.3.3.2.1. General matrix solutions to calibration designs
Requirements Solutions for all designs that are cataloged in this Handbook are included with the designs.
Solutions for other designs can be computed from the instructions below given some
familiarity with matrices. The matrix manipulations that are required for the calculations are:
● transposition (indicated by ')

● multiplication

● inversion

Notation ● n = number of difference measurements


● m = number of artifacts
● (n - m + 1) = degrees of freedom

● X= (nxm) design matrix


● r'= (mx1) vector identifying the restraint
● = (mx1) vector identifying ith item of interest consisting of a 1 in the ith position
and zeros elsewhere

● R*= value of the reference standard


● Y= (mx1) vector of observed difference measurements
Convention The convention for showing the measurement sequence is illustrated with the three
for showing measurements that make up a 1,1,1 design for 1 reference standard, 1 check standard, and 1
the
test item. Nominal values are underlined in the first line .
measurement
sequence
               1    1    1
    Y(1) =     +    -
    Y(2) =     +         -
    Y(3) =          +    -


Matrix algebra for solving a design
The (nxm) design matrix X is constructed by replacing the pluses (+), minuses (-) and blanks with the entries 1, -1, and 0 respectively.
The (mxm) matrix of normal equations, X'X, is formed and augmented by the restraint vector to form an (m+1)x(m+1) matrix, A:

Inverse of design matrix
The A matrix is inverted and shown in the form:

where Q is an mxm matrix that, when multiplied by s2, yields the usual variance-covariance matrix.

Estimates of values of individual artifacts
The least-squares estimates for the values of the individual artifacts are contained in the (mx1) matrix, B, where

where Q is the upper left element of the Ainv matrix shown above. The structure of the individual estimates is contained in the QX' matrix; i.e. the estimate for the ith item can be computed from XQ and Y by
● Cross multiplying the ith column of XQ with Y
● And adding R*(nominal test)/(nominal restraint)

Clarify with an example
We will clarify the above discussion with an example from the mass calibration process at NIST. In this example, two NIST kilograms are compared with a customer's unknown kilogram.
The design matrix, X, is

The first two columns represent the two NIST kilograms while the third column represents the customer's kilogram (i.e., the kilogram being calibrated).
The measurements obtained, i.e., the Y matrix, are

The measurements are the differences between two measurements, as specified by the design matrix, measured in grams. That is, Y(1) is the difference in measurement between NIST kilogram one and NIST kilogram two, Y(2) is the difference in measurement between NIST kilogram one and the customer kilogram, and Y(3) is the difference in measurement between NIST kilogram two and the customer kilogram.
The value of the reference standard, R*, is 0.82329.
Then

If there are three weights with known values for weights one and two, then
r = [1 1 0]
Thus

and so

From A-1, we have

We then compute XQ

We then compute B = QX'Y + h'R*

This yields the following least-squares coefficient estimates:


and
Standard The standard deviation for the ith item is:
deviations of
estimates

where and

The process standard deviation, which is a measure of the overall precision of the (NIST)
mass calibration process,

is the residual standard deviation from the design, and sdays is the standard deviation for
days, which can only be estimated from check standard measurements.

Example We continue the example started above. Since n = 3 and m = 3, the formula reduces to:

Substituting the values shown above for X, Y, and Q results in

and
Y'(I - XQX')Y = 4.9322
Finally, taking the square root gives
s1 = 2.2209
The next step is to compute the standard deviation of item 3 (the customer's kilogram), that is
sitem3. We start by substituting the values for X and Q and computing D

Next, we substitute = [0 0 1] and = 0.021112 (this value is taken from a check


standard and not computed from the values given in this example).
We obtain the following computations

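The construction above is easy to verify numerically. The following sketch (illustrative code, not part of the Handbook) builds the augmented matrix A for the 1,1,1 example, extracts Q and h from its inverse, and evaluates B = QX'Y + h'R* and the residual standard deviation; the entries of Y are hypothetical placeholders because the measured differences appear only as images in the original page.

    import numpy as np

    # 1,1,1 design: two reference kilograms (the restraint) and one test kilogram.
    # Rows of X correspond to the difference measurements Y(1), Y(2), Y(3).
    X = np.array([[1, -1,  0],
                  [1,  0, -1],
                  [0,  1, -1]], dtype=float)

    r = np.array([1.0, 1.0, 0.0])        # restraint vector: weights one and two have known values
    R_star = 0.82329                      # value of the restraint (from the example)
    Y = np.array([-0.0013, 0.0011, 0.0024])   # hypothetical difference measurements

    n, m = X.shape

    # Augmented normal equations A = [[X'X, r'], [r, 0]]
    A = np.zeros((m + 1, m + 1))
    A[:m, :m] = X.T @ X
    A[:m, m] = r
    A[m, :m] = r
    Ainv = np.linalg.inv(A)

    Q = Ainv[:m, :m]          # upper-left block: variance-covariance structure
    h = Ainv[:m, m]           # column that multiplies the restraint value

    # Least-squares estimates for the individual artifacts: B = QX'Y + h R*
    B = Q @ X.T @ Y + h * R_star

    # Residual (repeatability) standard deviation with n - m + 1 degrees of freedom
    dof = n - m + 1
    resid_ss = Y @ (np.eye(n) - X @ Q @ X.T) @ Y
    s1 = np.sqrt(resid_ss / dof)

    print(B, s1)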
2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?

2.3.3.3. Uncertainties of calibrated values

Uncertainty analysis follows the ISO principles

This section discusses the calculation of uncertainties of calibrated values from calibration designs. The discussion follows the guidelines in the section on classifying and combining components of uncertainty. Two types of evaluations are covered:
  1. type A evaluations of time-dependent sources of random error
  2. type B evaluations of other sources of error
The latter includes, but is not limited to, uncertainties from sources that are not replicated in the calibration design such as uncertainties of values assigned to reference standards.

Uncertainties for test items

Uncertainties associated with calibrated values for test items from designs require calculations that are specific to the individual designs. The steps involved are outlined below.

Outline for the section on uncertainty analysis
  - Historical perspective
  - Assumptions
  - Example of more realistic model
  - Computation of repeatability standard deviations
  - Computation of level-2 standard deviations
  - Combination of repeatability and level-2 standard deviations
  - Example of computations for 1,1,1,1 design
  - Type B uncertainty associated with the restraint
  - Expanded uncertainty of calibrated values

2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.1. Type A evaluations for calibration designs

Change over time

Type A evaluations for calibration processes must take into account changes in the measurement process that occur over time.

Historically, uncertainties considered only instrument imprecision

Historically, computations of uncertainties for calibrated values have treated the precision of the comparator instrument as the primary source of random uncertainty in the result. However, as the precision of instrumentation has improved, effects of other sources of variability have begun to show themselves in measurement processes. This is not universally true, but for many processes, instrument imprecision (short-term variability) cannot explain all the variation in the process.

Effects of environmental changes

Effects of humidity, temperature, and other environmental conditions which cannot be closely controlled or corrected must be considered. These tend to exhibit themselves over time, say, as between-day effects. The discussion of between-day (level-2) effects relating to gauge studies carries over to the calibration setting, but the computations are not as straightforward.

Assumptions which are specific to this section

The computations in this section depend on specific assumptions:
  1. Short-term effects associated with instrument response
     - come from a single distribution
     - vary randomly from measurement to measurement within a design.
  2. Day-to-day effects
     - come from a single distribution
     - vary from artifact to artifact but remain constant for a single calibration
     - vary from calibration to calibration
These assumptions have proved useful but may need to be expanded in the future

These assumptions have proved useful for characterizing high precision measurement processes, but more complicated models may eventually be needed which take the relative magnitudes of the test items into account. For example, in mass calibration, a 100 g weight can be compared with a summation of 50 g, 30 g and 20 g weights in a single measurement. A sophisticated model might consider the size of the effect as relative to the nominal masses or volumes.

Example of the two models for a design for calibrating test items

To contrast the simple model with the more complicated model, a measurement of the difference between X, the test item, with unknown and yet to be determined value, X*, and a reference standard, R, with known value, R*, and the reverse measurement are shown below.

Model (1) takes into account only instrument imprecision, so that the two difference measurements contain only the error terms, the random errors that come from the imprecision of the measuring instrument.    (1)

Model (2) allows for both instrument imprecision and level-2 effects, where the delta terms explain small changes in the values of the artifacts that occur over time.    (2)

For both models, the value of the test item is estimated from the difference measurements and the known value, R*, of the reference standard.

Standard deviations from both models

For model (1), the standard deviation of the test item involves only the repeatability standard deviation. For model (2), the standard deviation of the test item contains, in addition, the level-2 standard deviation.

Note on relative contributions of both components to uncertainty

In both cases, s1 is the repeatability standard deviation that describes the precision of the instrument and sdays is the level-2 standard deviation that describes day-to-day changes. One thing to notice in the standard deviation for the test item is the contribution of sdays relative to the total uncertainty. If sdays is large relative to s1, or dominates, the uncertainty will not be appreciably reduced by adding measurements to the calibration design.
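The remark about relative contributions can be illustrated with a deliberately simplified toy model (an assumption of this sketch, not the Handbook's design-based formula): if the test-item value behaves like an average of n within-day difference measurements, its variance is roughly s1^2/n + sdays^2, so replication only attacks the first term.

    import math

    s1 = 1.0        # repeatability (instrument) standard deviation, arbitrary units
    sdays = 2.0     # level-2 (day-to-day) standard deviation, assumed dominant here

    # Toy model: variance of the test-item estimate ~ s1**2 / n + sdays**2
    for n in (2, 4, 8, 16):
        s_test = math.sqrt(s1**2 / n + sdays**2)
        print(f"n = {n:2d}  s_test = {s_test:.3f}")
    # The printed values barely decrease: once sdays dominates, adding
    # measurements to the design does not appreciably reduce the uncertainty.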
2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.2. Repeatability and level-2 standard deviations

Repeatability standard deviation comes from the data of a single design

The repeatability standard deviation of the instrument can be computed in two ways.
  1. It can be computed as the residual standard deviation from the design and should be available as output from any software package that reduces data from calibration designs. The matrix equations for this computation are shown in the section on solutions to calibration designs. The standard deviation has degrees of freedom

          v = n - m + 1

     for n difference measurements and m items. Typically the degrees of freedom are very small. For two difference measurements on a reference standard and test item, the degrees of freedom is v = 1.

A more reliable estimate comes from pooling over historical data

  2. A more reliable estimate of the standard deviation can be computed by pooling variances from K calibrations (and then taking its square root) using the same instrument (assuming the instrument is in statistical control). The pooled estimate weights each variance by its degrees of freedom; i.e.,

          s1(pooled) = sqrt( (v1*s1^2 + ... + vK*sK^2) / (v1 + ... + vK) )

Level-2 standard deviation is estimated from check standard measurements

The level-2 standard deviation cannot be estimated from the data of the calibration design. It cannot generally be estimated from repeated designs involving the test items. The best mechanism for capturing the day-to-day effects is a check standard, which is treated as a test item and included in each calibration design. Values of the check standard, estimated over time from the calibration design, are used to estimate the standard deviation.

Assumptions

The check standard value must be stable over time, and the measurements must be in statistical control for this procedure to be valid. For this purpose, it is necessary to keep a historical record of values for a given check standard, and these values should be kept by instrument and by design.

Computation of level-2 standard deviation

Given K historical check standard values, C1, ..., CK, the standard deviation of the check standard values is computed as

          sC = sqrt( (1/(K-1)) * SUM (Ck - Cbar)^2 )

where Cbar is the average of the K check standard values, with degrees of freedom v = K - 1.
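Both computations are short in practice. The sketch below (hypothetical data and function names) pools repeatability variances over K calibrations, weighting by degrees of freedom, and computes the level-2 standard deviation from K historical check standard values with K - 1 degrees of freedom.

    import numpy as np

    def pooled_repeatability_sd(sds, dofs):
        """Pool repeatability standard deviations from K calibrations,
        weighting each variance by its degrees of freedom."""
        sds = np.asarray(sds, dtype=float)
        dofs = np.asarray(dofs, dtype=float)
        pooled_var = np.sum(dofs * sds**2) / np.sum(dofs)
        return np.sqrt(pooled_var), int(np.sum(dofs))

    def level2_sd(check_standard_values):
        """Level-2 standard deviation from K historical check standard values,
        with K - 1 degrees of freedom."""
        c = np.asarray(check_standard_values, dtype=float)
        return c.std(ddof=1), len(c) - 1

    # Hypothetical historical records for one instrument and one design:
    s1, v1 = pooled_repeatability_sd(sds=[2.2, 1.9, 2.4], dofs=[3, 3, 3])
    s2, v2 = level2_sd([0.8231, 0.8234, 0.8228, 0.8233])
    print(s1, v1, s2, v2)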
2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.3. Combination of repeatability and level-2 standard deviations

Standard deviation of test item depends on several factors

The final question is how to combine the repeatability standard deviation and the standard deviation of the check standard to estimate the standard deviation of the test item. This computation depends on:
  - structure of the design
  - position of the check standard in the design
  - position of the reference standards in the design
  - position of the test item in the design

Derivations require matrix algebra

Tables for estimating standard deviations for all test items are reported along with the solutions for all designs in the catalog. The use of the tables for estimating the standard deviations for test items is illustrated for the 1,1,1,1 design. Matrix equations can be used for deriving estimates for designs that are not in the catalog.

The check standard for each design is either an additional test item in the design, other than the test items that are submitted for calibration, or it is a construction, such as the difference between two reference standards as estimated by the design.

2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.4. Calculation of standard deviations for 1,1,1,1 design

Design with 2 reference standards and 2 test items

An example is shown below for a 1,1,1,1 design for two reference standards, R1 and R2, and two test items, X1 and X2, and six difference measurements. The restraint, R*, is the sum of values of the two reference standards, and the check standard, which is independent of the restraint, is the difference between the values of the reference standards. The design and its solution are reproduced below.

Check standard is the difference between the 2 reference standards

OBSERVATIONS       1   1   1   1

     Y(1)          +   -
     Y(2)          +       -
     Y(3)          +           -
     Y(4)              +   -
     Y(5)              +       -
     Y(6)                  +   -

RESTRAINT          +   +

CHECK STANDARD     +   -

DEGREES OF FREEDOM = 3

        SOLUTION MATRIX
        DIVISOR = 8

OBSERVATIONS       1     1     1     1

     Y(1)          2    -2     0     0
     Y(2)          1    -1    -3    -1
     Y(3)          1    -1    -1    -3
     Y(4)         -1     1    -3    -1
     Y(5)         -1     1    -1    -3
     Y(6)          0     0     2    -2
     R*            4     4     4     4

Explanation of solution matrix

The solution matrix gives the values for the test items: the value for an individual item is the cross-product of the corresponding column with the difference measurements Y(1), ..., Y(6) and R*, divided by the divisor, 8.

Factors for computing contributions of repeatability and level-2 standard deviations to uncertainty

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
WT    FACTOR
      K1          1     1     1     1
1     0.3536      +
1     0.3536            +
1     0.6124                  +
1     0.6124                        +
0     0.7071      +     -

FACTORS FOR LEVEL-2 STANDARD DEVIATIONS
WT    FACTOR
      K2          1     1     1     1
1     0.7071      +
1     0.7071            +
1     1.2247                  +
1     1.2247                        +
0     1.4141      +     -

The first table shows factors for computing the contribution of the repeatability standard deviation to the total uncertainty. The second table shows factors for computing the contribution of the between-day standard deviation to the uncertainty. Notice that the check standard is the last entry in each table.

Unifying equation

The unifying equation combines the two contributions in quadrature:

     s^2 = K1^2 * s1^2 + K2^2 * sdays^2

where s1 is the repeatability standard deviation and sdays is the between-day (level-2) standard deviation.

Standard deviations are computed using the factors from the tables with the unifying equation

The steps in computing the standard deviation for a test item are:
  - Compute the repeatability standard deviation from historical data.
  - Compute the standard deviation of the check standard from historical data.
  - Locate the factors, K1 and K2, for the check standard.
  - Compute the between-day variance (using the unifying equation for the check standard). For this example,

          sdays^2 = ( scheck^2 - (0.7071)^2 * s1^2 ) / (1.4141)^2

  - If this variance estimate is negative, set sdays^2 = 0. (This is possible and indicates that there is no contribution to uncertainty from day-to-day effects.)
  - Locate the factors, K1 and K2, for the test items, and compute the standard deviations using the unifying equation. For this example,

          stest = sqrt( (0.6124)^2 * s1^2 + (1.2247)^2 * sdays^2 )

    and the same expression applies to the other test item, which has the same factors.
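The steps above can be written out directly for the 1,1,1,1 design. In the sketch below, the factors are taken from the tables above, while the repeatability and check standard values are hypothetical inputs; the unifying equation is applied first to the check standard to extract the between-day variance and then to a test item.

    import math

    # Factors for the 1,1,1,1 design (from the tables above)
    K1_check, K2_check = 0.7071, 1.4141   # check standard row
    K1_test,  K2_test  = 0.6124, 1.2247   # test item row

    # Hypothetical historical estimates
    s1 = 2.2        # repeatability standard deviation (pooled over designs)
    s_check = 3.9   # standard deviation of historical check standard values

    # Between-day variance from the unifying equation applied to the check standard
    sdays_sq = (s_check**2 - (K1_check * s1)**2) / K2_check**2
    sdays_sq = max(sdays_sq, 0.0)   # a negative estimate is set to zero

    # Standard deviation of a test item from the unifying equation
    s_test = math.sqrt((K1_test * s1)**2 + K2_test**2 * sdays_sq)
    print(sdays_sq, s_test)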
2. Measurement Process Characterization


2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.5. Type B uncertainty


Type B uncertainty associated with the restraint

The reference standard is assumed to have known value, R*, for the purpose of solving the calibration design. For the purpose of computing a standard uncertainty, it has a type B uncertainty that contributes to the uncertainty of the test item.

The value of R* comes from a higher-level calibration laboratory or process, and its value is usually reported along with its uncertainty, U. If the laboratory also reports the k factor for computing U, then the standard deviation of the restraint is

     s(R*) = U / k
If k is not reported, then a conservative way of proceeding is to assume k = 2.

Situation where the test is different in size from the reference

Usually, a reference standard and test item are of the same nominal size and the calibration relies on measuring the small difference between the two; for example, the intercomparison of a reference kilogram compared with a test kilogram. The calibration may also consist of an intercomparison of the reference with a summation of artifacts where the summation is of the same nominal size as the reference; for example, a reference kilogram compared with 500 g + 300 g + 200 g test weights.

Type B uncertainty for the test artifact

The type B uncertainty that accrues to the test artifact from the uncertainty of the reference standard is proportional to their nominal sizes; i.e.,

     s(test) = (nominal test)/(nominal restraint) * s(R*)
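A minimal sketch of this bookkeeping (hypothetical function name and values): the reported expanded uncertainty U is converted to a standard deviation with the coverage factor k (k = 2 if none is reported) and scaled by the ratio of nominal sizes.

    def type_b_sd_for_test(U_restraint, k=2.0, nominal_test=1.0, nominal_restraint=1.0):
        """Type B standard uncertainty passed from the restraint to the test item.

        U_restraint : expanded uncertainty reported for the restraint value R*
        k           : coverage factor reported with U (assume 2 if not reported)
        """
        s_restraint = U_restraint / k
        return (nominal_test / nominal_restraint) * s_restraint

    # Example: a test kilogram calibrated against a reference kilogram (equal nominal sizes)
    print(type_b_sd_for_test(U_restraint=0.00004, k=2, nominal_test=1.0, nominal_restraint=1.0))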

2. Measurement Process Characterization
2.3. Calibration
2.3.3. What are calibration designs?
2.3.3.3. Uncertainties of calibrated values

2.3.3.3.6. Expanded uncertainties

Standard uncertainty

The standard uncertainty for the test item combines the type A standard deviation of the test item from the design with the type B standard deviation that comes from the restraint.

Expanded uncertainty

The expanded uncertainty is computed as

     U = k u

where u is the standard uncertainty and k is either the critical value from the t table for degrees of freedom v or k is set equal to 2.

Problem of the degrees of freedom

The calculation of degrees of freedom, v, can be a problem. Sometimes it can be computed using the Welch-Satterthwaite approximation and the structure of the uncertainty of the test item. Degrees of freedom for the standard deviation of the restraint is assumed to be infinite. The coefficients in the Welch-Satterthwaite formula must all be positive for the approximation to be reliable.

Standard deviation for test item from the 1,1,1,1 design

For the 1,1,1,1 design, the standard deviation of the test items can be rewritten by substituting the between-day variance obtained from the check standard, so that the degrees of freedom depends only on the degrees of freedom in the standard deviation of the check standard. This device may not work satisfactorily for all designs.

Standard uncertainty from the 1,1,1,1 design

To complete the calculation shown in the equation at the top of the page, the nominal value of the test item (which is equal to 1) is divided by the nominal value of the restraint (which is also equal to 1), and the result is squared. Thus, the standard uncertainty combines, in quadrature, the standard deviation of the test item and the standard deviation of the restraint.

Degrees of freedom using the Welch-Satterthwaite approximation

Therefore, the degrees of freedom is approximated from the Welch-Satterthwaite formula, where n - 1 is the degrees of freedom associated with the check standard uncertainty. Notice that the standard deviation of the restraint drops out of the calculation because of an infinite degrees of freedom.
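The Welch-Satterthwaite approximation referred to above can be written generically as below (a sketch with hypothetical variance components); components whose degrees of freedom are infinite, such as the restraint, drop out of the denominator, which is why the restraint does not affect the result.

    def welch_satterthwaite(variances, dofs):
        """Effective degrees of freedom for a standard uncertainty that is the
        root sum of squares of several variance components.

        variances : variance contributions (already multiplied by any squared
                    sensitivity coefficients)
        dofs      : degrees of freedom for each component; use float('inf') for
                    components, such as the restraint, whose uncertainty is
                    treated as exactly known.
        """
        total = sum(variances)
        denom = sum(v**2 / df for v, df in zip(variances, dofs) if df != float('inf'))
        return float('inf') if denom == 0 else total**2 / denom

    # Hypothetical combination: a design component with 3 degrees of freedom,
    # a check-standard component with n - 1 = 5, and the restraint (infinite).
    v_eff = welch_satterthwaite([0.40, 0.25, 0.10], [3, 5, float('inf')])
    print(v_eff)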
2. Measurement Process Characterization
2.3. Calibration

2.3.4. Catalog of calibration designs

Important concept - Restraint

The designs are constructed for measuring differences among reference standards and test items, singly or in combinations. Values for individual standards and test items can be computed from the design only if the value (called the restraint = R*) of one or more reference standards is known. The methodology for constructing and solving calibration designs is described briefly in matrix solutions and in more detail in a NIST publication (Cameron et al.).

Designs listed in this catalog

Designs are listed by traditional subject area although many of the designs are appropriate generally for intercomparisons of artifact standards.
  - Designs for mass weights
  - Drift-eliminating designs for gage blocks
  - Left-right balanced designs for electrical standards
  - Designs for roundness standards
  - Designs for angle blocks
  - Drift-eliminating design for thermometers in a bath
  - Drift-eliminating designs for humidity cylinders

Properties of designs in this catalog

Basic requirements are:
  1. The differences must be nominally zero.
  2. The design must be solvable for individual items given the restraint.
Other desirable properties are:
  1. The number of measurements should be small.
  2. The degrees of freedom should be greater than zero.
  3. The standard deviations of the estimates for the test items should be small enough for their intended purpose.

Information: Design, Solution, Factors for computing standard deviations

Given
  - n = number of difference measurements
  - m = number of artifacts (reference standards + test items) to be calibrated
the following information is shown for each design:
  - Design matrix -- (n x m)
  - Vector that identifies standards in the restraint -- (1 x m)
  - Degrees of freedom = (n - m + 1)
  - Solution matrix for given restraint -- (n x m)
  - Table of factors for computing standard deviations

Convention for showing the measurement sequence

Nominal sizes of standards and test items are shown at the top of the design. Pluses (+) indicate items that are measured together; and minuses (-) indicate items are not measured together. The difference measurements are constructed from the design of pluses and minuses. For example, a 1,1,1 design for one reference standard and two test items of the same nominal size with three measurements is shown below:

               1     1     1
     Y(1) =    +     -
     Y(2) =    +           -
     Y(3) =          +     -

Solution matrix: example and interpretation

The cross-product of the column of difference measurements and R* with a column from the solution matrix, divided by the named divisor, gives the value for an individual item. For example,

        Solution matrix
        Divisor = 3

                1     1     1
     Y(1)       0    -2    -1
     Y(2)       0    -1    -2
     Y(3)       0    +1    -1
     R*        +3    +3    +3

implies that estimates for the restraint and the two test items are:

     Item 1 (restraint) = ( 3 R* ) / 3 = R*
     Item 2 = ( -2 Y(1) - Y(2) + Y(3) + 3 R* ) / 3
     Item 3 = ( -Y(1) - 2 Y(2) - Y(3) + 3 R* ) / 3
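This cross-product rule is mechanical enough to script. The sketch below (hypothetical code and measurement values) applies it to the 1,1,1 example above: each estimate is the dot product of [Y(1), Y(2), Y(3), R*] with a column of the solution matrix, divided by the divisor.

    import numpy as np

    # Solution matrix for the 1,1,1 example above (columns: restraint, test 1, test 2)
    solution = np.array([
        [ 0, -2, -1],   # coefficients of Y(1)
        [ 0, -1, -2],   # coefficients of Y(2)
        [ 0,  1, -1],   # coefficients of Y(3)
        [ 3,  3,  3],   # coefficients of R*
    ], dtype=float)
    divisor = 3.0

    Y = np.array([-0.0012, 0.0018, 0.0031])   # hypothetical difference measurements
    R_star = 10.000123                         # hypothetical value of the reference standard

    estimates = np.append(Y, R_star) @ solution / divisor
    print(estimates)   # [restraint, test item 1, test item 2]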
Interpretation of table of factors

The factors in this table provide information on precision. The repeatability standard deviation, s1, is multiplied by the appropriate factor to obtain the standard deviation for an individual item or combination of items. For example,

     Sum   Factor      1     1     1
      1    0.0000      +
      1    0.8166            +
      1    0.8166                  +
      2    1.4142            +     +

implies that the standard deviations for the estimates are:

     s(item 1)          = 0.0000 * s1
     s(item 2)          = 0.8166 * s1
     s(item 3)          = 0.8166 * s1
     s(item 2 + item 3) = 1.4142 * s1

2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs

2.3.4.1. Mass weights

Tie to kilogram reference standards

Near-accurate mass measurements require a sequence of designs that relate the masses of individual weights to a reference kilogram(s) standard (Jaeger & Davis). Weights generally come in sets, and an entire set may require several series to calibrate all the weights in the set.

Example of weight set

A 5,3,2,1 weight set would have the following weights:

     1000 g
     500g, 300g, 200g, 100g
     50g, 30g, 20g, 10g
     5g, 3g, 2g, 1g
     0.5g, 0.3g, 0.2g, 0.1g

Depiction of a design with three series for calibrating a 5,3,2,1 weight set with weights between 1 kg and 10 g
First series using 1,1,1,1 design

The calibrations start with a comparison of the one kilogram test weight with the reference kilograms (see the graphic above). The 1,1,1,1 design requires two kilogram reference standards with known values, R1* and R2*. The fourth kilogram in this design is actually a summation of the 500, 300, 200 g weights which becomes the restraint in the next series.

The restraint for the first series is the known average mass of the reference kilograms,

     R* = (R1* + R2*) / 2

The design assigns values to all weights including the individual reference standards. For this design, the check standard is not an artifact standard but is defined as the difference between the values assigned to the reference kilograms by the design.

2nd series using 5,3,2,1,1,1 design

The second series is a 5,3,2,1,1,1 design where the restraint over the 500g, 300g and 200g weights comes from the value assigned to the summation in the first series.

The weights assigned values by this series are:
  - 500g, 300g, 200g and 100g test weights
  - 100 g check standard (2nd 100g weight in the design)
  - Summation of the 50g, 30g, 20g weights.

Other starting points

The calibration sequence can also start with a 1,1,1 design. This design has the disadvantage that it does not have provision for a check standard.

Better choice of design

A better choice is a 1,1,1,1,1 design which allows for two reference kilograms and a kilogram check standard which occupies the 4th position among the weights. This is preferable to the 1,1,1,1 design but has the disadvantage of requiring the laboratory to maintain three kilogram standards.

Important detail

The solutions are only applicable for the restraints as shown.

Designs for decreasing weight sets
  1. 1,1,1 design
  2. 1,1,1,1 design
  3. 1,1,1,1,1 design
  4. 1,1,1,1,1,1 design
  5. 2,1,1,1 design
  6. 2,2,1,1,1 design
  7. 2,2,2,1,1 design
  8. 5,2,2,1,1,1 design
  9. 5,2,2,1,1,1,1 design
  10. 5,3,2,1,1,1 design
  11. 5,3,2,1,1,1,1 design
  12. 5,3,2,2,1,1,1 design
  13. 5,4,4,3,2,2,1,1 design
  14. 5,5,2,2,1,1,1,1 design
  15. 5,5,3,2,1,1,1 design
  16. 1,1,1,1,1,1,1,1 design
  17. 3,2,1,1,1 design

Design for pound weights
  1. 1,2,2,1,1 design

Designs for increasing weight sets
  1. 1,1,1 design
  2. 1,1,1,1 design
  3. 5,3,2,1,1 design
  4. 5,3,2,1,1,1 design
  5. 5,2,2,1,1,1 design
  6. 3,2,1,1,1 design

2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.1. Design for 1,1,1

Design 1,1,1

OBSERVATIONS       1   1   1

     Y(1)          +   -
     Y(2)          +       -
     Y(3)              +   -

RESTRAINT          +

CHECK STANDARD         +

DEGREES OF FREEDOM = 1

        SOLUTION MATRIX
        DIVISOR = 3

OBSERVATIONS       1     1     1

     Y(1)          0    -2    -1
     Y(2)          0    -1    -2
     Y(3)          0     1    -1
     R*            3     3     3

R* = value of reference weight
FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
WT    FACTOR      1     1     1
1     0.0000      +
1     0.8165            +
1     0.8165                  +
2     1.4142            +     +
1     0.8165            +

FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
WT    FACTOR      1     1     1
1     0.0000      +
1     1.4142            +
1     1.4142                  +
2     2.4495            +     +
1     1.4142            +

Explanation of notation and interpretation of tables

2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.2. Design for 1,1,1,1

Design 1,1,1,1

OBSERVATIONS       1   1   1   1


Y(1) + -
Y(2) + -
Y(3) + -
Y(4) + -
Y(5) + -
Y(6) + -

RESTRAINT + +

CHECK STANDARD + -

DEGREES OF FREEDOM = 3

SOLUTION MATRIX
DIVISOR = 8

OBSERVATIONS 1 1 1 1

Y(1) 2 -2 0 0
Y(2) 1 -1 -3 -1
Y(3) 1 -1 -1 -3
Y(4) -1 1 -3 -1
Y(5) -1 1 -1 -3



2.3.4.1.2. Design for 1,1,1,1 2.3.4.1.3. Design for 1,1,1,1,1

Y(6) 0 0 2 -2
R* 4 4 4 4

R* = sum of two reference standards 2. Measurement Process Characterization


2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


WT FACTOR 2.3.4.1.3. Design for 1,1,1,1,1
K1 1 1 1 1 CASE 1: CHECK STANDARD = DIFFERENCE BETWEEN CASE 2: CHECK STANDARD = FOURTH WEIGHT
1 0.3536 + FIRST TWO WEIGHTS
1 0.3536 +
1 0.6124 + OBSERVATIONS 1 1 1 1 1
1 0.6124 + OBSERVATIONS 1 1 1 1 1
0 0.7071 + -
Y(1) + -
Y(1) + - Y(2) + -
Y(2) + - Y(3) + -
Y(3) + - Y(4) + -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS Y(4) + - Y(5) + -
WT FACTOR Y(5) + - Y(6) + -
Y(6) + - Y(7) + -
K2 1 1 1 1 Y(7) + - Y(8) + -
1 0.7071 + Y(8) + - Y(9) + -
Y(9) + - Y(10) + -
1 0.7071 + Y(10) + -
1 1.2247 +
1 1.2247 + RESTRAINT + +
RESTRAINT + +
0 1.4141 + - CHECK STANDARD +
CHECK STANDARD + -
DEGREES OF FREEDOM = 6
Explanation of notation and interpretation of tables DEGREES OF FREEDOM = 6

SOLUTION MATRIX
SOLUTION MATRIX DIVISOR = 10
DIVISOR = 10
OBSERVATIONS 1 1 1 1 1
OBSERVATIONS 1 1 1 1 1

Y(1) 2 -2 0 0 0
Y(1) 2 -2 0 0 0 Y(2) 1 -1 -3 -1 -1
Y(2) 1 -1 -3 -1 -1 Y(3) 1 -1 -1 -3 -1
Y(3) 1 -1 -1 -3 -1 Y(4) 1 -1 -1 -1 -3
Y(4) 1 -1 -1 -1 -3 Y(5) -1 1 -3 -1 -1
Y(5) -1 1 -3 -1 -1 Y(6) -1 1 -1 -3 -1
Y(6) -1 1 -1 -3 -1 Y(7) -1 1 -1 -1 -3
Y(7) -1 1 -1 -1 -3 Y(8) 0 0 2 -2 0
Y(8) 0 0 2 -2 0 Y(9) 0 0 2 0 -2
Y(9) 0 0 2 0 -2 Y(10) 0 0 0 2 -2
Y(10) 0 0 0 2 -2 R* 5 5 5 5 5
R* 5 5 5 5 5
R* = sum of two reference standards
R* = sum of two reference standards

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


FACTORS FOR REPEATABILITY STANDARD DEVIATIONS WT FACTOR
WT FACTOR K1 1 1 1 1 1
K1 1 1 1 1 1 1 0.3162 +
1 0.3162 + 1 0.3162 +



2.3.4.1.3. Design for 1,1,1,1,1 2.3.4.1.4. Design for 1,1,1,1,1,1
1 0.3162 + 1 0.5477 +
1 0.5477 + 1 0.5477 +
1 0.5477 + 1 0.5477 +
1 0.5477 + 2 0.8944 + +
2 0.8944 + + 3 1.2247 + + +
3 1.2247 + + + 1 0.5477 +
0 0.6325 + -
2. Measurement Process Characterization
2.3. Calibration
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS WT FACTOR 2.3.4. Catalog of calibration designs
WT FACTOR K2 1 1 1 1 1 2.3.4.1. Mass weights
K2 1 1 1 1 1 1 0.7071 +
1 0.7071 + 1 0.7071 +
1 0.7071 + 1 1.2247 +
1 1.2247
1 1.2247
+
+
1 1.2247
1 1.2247
+
+
2.3.4.1.4. Design for 1,1,1,1,1,1
1 1.2247 + 2 2.0000 + +
2 2.0000 + + 3 2.7386 + + +
3 2.7386 + + + 1 1.2247 +
0 1.4142 + - Design 1,1,1,1,1,1
Explanation of notation and interpretation of tables

OBSERVATIONS 1 1 1 1 1 1

X(1) + -
X(2) + -
X(3) + -
X(4) + -
X(5) + -
X(6) + -
X(7) + -
X(8) + -
X(9) + -
X(10) + -
X(11) + -
X(12) + -
X(13) + -
X(14) + -
X(15) + -

RESTRAINT + +

CHECK STANDARD +

DEGREES OF FREEDOM = 10

SOLUTION MATRIX



2.3.4.1.4. Design for 1,1,1,1,1,1 2.3.4.1.4. Design for 1,1,1,1,1,1

DIVISOR = 8 1 1.2247 +
1 1.2247 +
OBSERVATIONS 1 1 1 1 1 1 1 1.2247 +
2 2.0000 + +
Y(1) 1 -1 0 0 0 0 3 2.7386 + + +
Y(2) 1 0 -1 0 0 0 4 3.4641 + + + +
Y(3) 1 0 0 -1 0 0 1 1.2247 +
Y(4) 1 0 0 0 -1 0
Y(5) 2 1 1 1 1 0 Explanation of notation and interpretation of tables
Y(6) 0 1 -1 0 0 0
Y(7) 0 1 0 -1 0 0
Y(8) 0 1 0 0 -1 0
Y(9) 1 2 1 1 1 0
Y(10) 0 0 1 -1 0 0
Y(11) 0 0 1 0 -1 0
Y(12) 1 1 2 1 1 0
Y(13) 0 0 0 1 -1 0
Y(14) 1 1 1 2 1 0
Y(15) 1 1 1 1 2 0
R* 6 6 6 6 6 6

R* = sum of two reference standards

FACTORS FOR COMPUTING REPEATABILITY STANDARD DEVIATIONS


WT FACTOR
1 1 1 1 1 1
1 0.2887 +
1 0.2887 +
1 0.5000 +
1 0.5000 +
1 0.5000 +
1 0.5000 +
2 0.8165 + +
3 1.1180 + + +
4 1.4142 + + + +
1 0.5000 +

FACTORS FOR COMPUTING BETWEEN-DAY STANDARD DEVIATIONS


WT FACTOR
1 1 1 1 1 1
1 0.7071 +
1 0.7071 +
1 1.2247 +



2.3.4.1.5. Design for 2,1,1,1 2.3.4.1.5. Design for 2,1,1,1

Y(4) 0 1 0 -1
Y(5) 0 1 -1 0
Y(6) 0 0 1 -1
R* 4 2 2 2
2. Measurement Process Characterization
R* = value of the reference standard
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


2.3.4.1.5. Design for 2,1,1,1 WT FACTOR
2 1 1 1
2 0.0000 +
1 0.5000 +
Design 2,1,1,1 1 0.5000 +
1 0.5000 +
2 0.7071 + +
OBSERVATIONS 2 1 1 1 3 0.8660 + + +
1 0.5000 +
Y(1) + - -
Y(2) + - - FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
Y(3) + - - WT FACTOR
Y(4) + - 2 1 1 1
Y(5) + - 2 0.0000 +
Y(6) + - 1 1.1180 +
1 1.1180 +
1 1.1180 +
RESTRAINT + 2 1.7321 + +
3 2.2913 + + +
CHECK STANDARD + 1 1.1180 +

Explanation of notation and interpretation of tables


DEGREES OF FREEDOM = 3

SOLUTION MATRIX
DIVISOR = 4

OBSERVATIONS 2 1 1 1

Y(1) 0 -1 0 -1
Y(2) 0 0 -1 -1
Y(3) 0 -1 -1 0



2.3.4.1.6. Design for 2,2,1,1,1 2.3.4.1.6. Design for 2,2,1,1,1

Y(1) 47 -3 -44 66 11
Y(2) 25 -25 0 -55 55
Y(3) 3 -47 44 -11 -66
Y(4) 25 -25 0 0 0
2. Measurement Process Characterization
Y(5) 29 4 -33 -33 22
2.3. Calibration
2.3.4. Catalog of calibration designs
Y(6) 29 4 -33 22 -33
2.3.4.1. Mass weights Y(7) 7 -18 11 -44 -44
Y(8) 4 29 -33 -33 22
Y(9) 4 29 -33 22 -33
2.3.4.1.6. Design for 2,2,1,1,1 Y(10) -18 7 11 -44 -44
R* 110 110 55 55 55

R* = sum of three reference standards


Design 2,2,1,1,1

OBSERVATIONS 2 2 1 1 1 FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


WT FACTOR
2 2 1 1 1
Y(1) + - - + 2 0.2710 +
Y(2) + - - + 2 0.2710 +
Y(3) + - + - 1 0.3347 +
Y(4) + - 1 0.4382 +
Y(5) + - - 1 0.4382 +
Y(6) + - - 2 0.6066 + +
Y(7) + - - 3 0.5367 + + +
Y(8) + - - 1 0.4382 +
Y(9) + - -
Y(10) + - -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
WT FACTOR
RESTRAINT + + + 2 2 1 1 1
2 0.8246 +
CHECK STANDARD + 2 0.8246 +
1 0.8485 +
1 1.0583 +
DEGREES OF FREEDOM = 6 1 1.0583 +
2 1.5748 + +
3 1.6971 + + +
1 1.0583 +
SOLUTION MATRIX
DIVISOR = 275 Explanation of notation and interpretation of tables
OBSERVATIONS 2 2 1 1 1



2.3.4.1.6. Design for 2,2,1,1,1 2.3.4.1.7. Design for 2,2,2,1,1

2. Measurement Process Characterization


2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.7. Design for 2,2,2,1,1

Design 2,2,2,1,1

OBSERVATIONS 2 2 2 1 1

Y(1) + -
Y(2) + -
Y(3) + -
Y(4) + - -
Y(5) + - -
Y(6) + - -
Y(7) + -

RESTRAINT + +

CHECK STANDARD +

DEGREES OF FREEDOM = 3

SOLUTION MATRIX
DIVISOR = 16

OBSERVATIONS 2 2 2 1 1

Y(1) 4 -4 0 0 0
Y(2) 2 -2 -6 -1 -1



2.3.4.1.7. Design for 2,2,2,1,1 2.3.4.1.8. Design for 5,2,2,1,1,1

Y(3) -2 2 -6 -1 -1
Y(4) 2 -2 -2 -3 -3
Y(5) -2 2 -2 -3 -3
Y(6) 0 0 4 -2 -2
Y(7) 0 0 0 8 -8
2. Measurement Process Characterization
R* 8 8 8 4 4
2.3. Calibration
2.3.4. Catalog of calibration designs
R* = sum of the two reference standards 2.3.4.1. Mass weights

2.3.4.1.8. Design for 5,2,2,1,1,1


FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
WT FACTOR
2 2 2 1 1
2 0.3536 + Design 5,2,2,1,1,1
2 0.3536 +
2 0.6124 +
1 0.5863 + OBSERVATIONS 5 2 2 1 1 1
1 0.5863 +
2 0.6124 + +
4 1.0000 + + + Y(1) + - - - - +
1 0.5863 + Y(2) + - - - + -
Y(3) + - - + - -
Y(4) + - - - -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS Y(5) + - - - -
WT FACTOR Y(6) + - + -
2 2 2 1 1 Y(7) + - - +
2 0.7071 + Y(8) + - + -
2 0.7071 +
2 1.2247 +
1 1.0607 + RESTRAINT + + + +
1 1.0607 +
2 1.5811 + + CHECK STANDARD +
4 2.2361 + + +
1 1.0607 +
DEGREES OF FREEDOM = 3
Explanation of notation and interpretation of tables

SOLUTION MATRIX
DIVISOR = 70

OBSERVATIONS 5 2 2 1 1 1

Y(1) 15 -8 -8 1 1 21



2.3.4.1.8. Design for 5,2,2,1,1,1 2.3.4.1.9. Design for 5,2,2,1,1,1,1

Y(2) 15 -8 -8 1 21 1
Y(3) 5 -12 -12 19 -1 -1
Y(4) 0 2 12 -14 -14 -14
Y(5) 0 12 2 -14 -14 -14
Y(6) -5 8 -12 9 -11 -1
2. Measurement Process Characterization
Y(7) 5 12 -8 -9 1 11
2.3. Calibration
Y(8) 0 10 -10 0 10 -10 2.3.4. Catalog of calibration designs
R* 35 14 14 7 7 7 2.3.4.1. Mass weights

R* = sum of the four reference standards


2.3.4.1.9. Design for 5,2,2,1,1,1,1
FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
WT FACTOR Design 5,2,2,1,1,1,1
5 2 2 1 1 1
5 0.3273 +
2 0.3854 + OBSERVATIONS 5 2 2 1 1 1 1
2 0.3854 +
1 0.4326 +
1 0.4645 + Y(1) + - - -
1 0.4645 + Y(2) + - - -
1 0.4645 + Y(3) + - - -
Y(4) + - - -
Y(5) + + - - -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS Y(6) + + - - -
WT FACTOR Y(7) + + - - - -
5 2 2 1 1 1 Y(8) + -
5 1.0000 + Y(9) + -
2 0.8718 + Y(10) + -
2 0.8718 +
1 0.9165 +
1 1.0198 + RESTRAINT + + + +
1 1.0198 +
1 1.0198 + CHECK STANDARD +

Explanation of notation and interpretation of tables


DEGREES OF FREEDOM = 4

SOLUTION MATRIX
DIVISOR = 60

OBSERVATIONS 5 2 2 1 1 1 1



2.3.4.1.9. Design for 5,2,2,1,1,1,1 2.3.4.1.9. Design for 5,2,2,1,1,1,1

Explanation of notation and interpretation of tables


Y(1) 12 0 0 -12 0 0 0
Y(2) 6 -4 -4 2 -12 3 3
Y(3) 6 -4 -4 2 3 -12 3
Y(4) 6 -4 -4 2 3 3 -12
Y(5) -6 28 -32 10 -6 -6 -6
Y(6) -6 -32 28 10 -6 -6 -6
Y(7) 6 8 8 -22 -6 -6 -6
Y(8) 0 0 0 0 15 -15 0
Y(9) 0 0 0 0 15 0 -15
Y(10) 0 0 0 0 0 15 -15
R* 30 12 12 6 6 6 6

R* = sum of the four reference standards

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


WT FACTOR
5 2 2 1 1 1 1
5 0.3162 +
2 0.7303 +
2 0.7303 +
1 0.4830 +
1 0.4472 +
1 0.4472 +
1 0.4472 +
2 0.5477 + +
3 0.5477 + + +
1 0.4472 +

FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS


WT FACTOR
5 2 2 1 1 1 1
5 1.0000 +
2 0.8718 +
2 0.8718 +
1 0.9165 +
1 1.0198 +
1 1.0198 +
1 1.0198 +
2 1.4697 + +
3 1.8330 + + +
1 1.0198 +



2.3.4.1.10. Design for 5,3,2,1,1,1 2.3.4.1.10. Design for 5,3,2,1,1,1

Y(1) 100 -68 -32 119 -111 4


Y(2) 100 -68 -32 4 119 -111
Y(3) 100 -68 -32 -111 4 119
Y(4) 100 -68 -32 4 4 4
Y(5) 60 -4 -56 -108 -108 -108
2. Measurement Process Characterization
Y(6) -20 124 -104 128 -102 -102
2.3. Calibration
2.3.4. Catalog of calibration designs
Y(7) -20 124 -104 -102 128 -102
2.3.4.1. Mass weights Y(8) -20 124 -104 -102 -102 128
Y(9) -20 -60 80 -125 -125 -10
Y(10) -20 -60 80 -125 -10 -125
2.3.4.1.10. Design for 5,3,2,1,1,1 Y(11) -20 -60 80 -10 -125 -125
R* 460 276 184 92 92 92

R* = sum of the three reference standards

OBSERVATIONS 5 3 2 1 1 1
FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
WT FACTOR
Y(1) + - - + - 5 3 2 1 1 1
Y(2) + - - + - 5 0.2331 +
Y(3) + - - - + 3 0.2985 +
Y(4) + - - 2 0.2638 +
Y(5) + - - - - 1 0.3551 +
Y(6) + - + - - 1 0.3551 +
Y(7) + - - + - 1 0.3551 +
Y(8) + - - - + 2 0.5043 + +
Y(9) + - - 3 0.6203 + + +
Y(10) + - - 1 0.3551 +
Y(11) + - -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
WT FACTOR
RESTRAINT + + + 5 3 2 1 1 1
5 0.8660 +
CHECK STANDARD + 3 0.8185 +
2 0.8485 +
1 1.0149 +
DEGREES OF FREEDOM = 6 1 1.0149 +
1 1.0149 +
2 1.4560 + +
SOLUTION MATRIX 3 1.8083 + + +
DIVISOR = 920 1 1.0149 +
OBSERVATIONS 5 3 2 1 1 1 Explanation of notation and interpretation of tables



2.3.4.1.10. Design for 5,3,2,1,1,1 2.3.4.1.11. Design for 5,3,2,1,1,1,1

2. Measurement Process Characterization


2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.11. Design for 5,3,2,1,1,1,1

Design 5,3,2,1,1,1,1

OBSERVATIONS 5 3 2 1 1 1 1

Y(1) + - -
Y(2) + - - -
Y(3) + - - -
Y(4) + - - - -
Y(5) + - - - -
Y(6) + - - - -
Y(7) + - - - -
Y(8) + - -
Y(9) + - -
Y(10) + - -
Y(11) + - -

RESTRAINT + + +

CHECK STANDARD +

DEGREES OF FREEDOM = 5

SOLUTION MATRIX
DIVISOR = 40

OBSERVATIONS 5 3 2 1 1 1 1



2.3.4.1.11. Design for 5,3,2,1,1,1,1 2.3.4.1.11. Design for 5,3,2,1,1,1,1

2 1.4560 + +
3 1.8083 + + +
Y(1) 20 -4 -16 12 12 12 12 4 2.1166 + + + +
Y(2) 0 -4 4 -8 -8 2 2 1 1.0149 +
Y(3) 0 -4 4 2 2 -8 -8
Y(4) 0 0 0 -5 -5 -10 10 Explanation of notation and interpretation of tables
Y(5) 0 0 0 -5 -5 10 -10
Y(6) 0 0 0 -10 10 -5 -5
Y(7) 0 0 0 10 -10 -5 -5
Y(8) 0 4 -4 -12 8 3 3
Y(9) 0 4 -4 8 -12 3 3
Y(10) 0 4 -4 3 3 -12 8
Y(11) 0 4 -4 3 3 8 -12
R* 20 12 8 4 4 4 4

R* = sum of the three reference standards

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


WT FACTOR
5 3 2 1 1 1 1
5 0.5000 +
3 0.2646 +
2 0.4690 +
1 0.6557 +
1 0.6557 +
1 0.6557 +
1 0.6557 +
2 0.8485 + +
3 1.1705 + + +
4 1.3711 + + + +
1 0.6557 +

FACTORS FOR LEVEL-2 STANDARD DEVIATIONS


WT FACTOR
5 3 2 1 1 1 1
5 0.8660 +
3 0.8185 +
2 0.8485 +
1 1.0149 +
1 1.0149 +
1 1.0149 +
1 1.0149 +



2.3.4.1.12. Design for 5,3,2,2,1,1,1 2.3.4.1.12. Design for 5,3,2,2,1,1,1

Y(1) 2 0 -2 2 0 0 0
Y(2) 0 -6 6 -4 -2 -2 -2
Y(3) 1 1 -2 0 -1 1 1
2. Measurement Process Characterization
Y(4) 1 1 -2 0 1 -1 1
2.3. Calibration
2.3.4. Catalog of calibration designs
Y(5) 1 1 -2 0 1 1 -1
2.3.4.1. Mass weights Y(6) -1 1 0 -2 -1 1 1
Y(7) -1 1 0 -2 1 -1 1
Y(8) -1 1 0 -2 1 1 -1
2.3.4.1.12. Design for 5,3,2,2,1,1,1 Y(9) 0 -2 2 2 -4 -4 -4
Y(10) 0 0 0 0 2 -2 0
Y(11) 0 0 0 0 0 2 -2
Y(12) 0 0 0 0 -2 0 2
R* 5 3 2 2 1 1 1
OBSERVATIONS 5 3 2 2 1 1 1 R* = sum of the three reference standards

Y(1) + - - FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


Y(2) + - - WT FACTOR
Y(3) + - - - 5 3 2 2 1 1 1
Y(4) + - - - 5 0.3162 +
Y(5) + - - - 3 0.6782 +
Y(6) + - - 2 0.7483 +
Y(7) + - - 2 0.6000 +
Y(8) + - - 1 0.5831 +
Y(9) + - - - 1 0.5831 +
Y(10) + - 1 0.5831 +
Y(11) + - 3 0.8124 + +
Y(12) - + 4 1.1136 + + +
1 0.5831 +
RESTRAINT + + + FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
WT FACTOR
CHECK STANDARDS + 5 3 2 2 1 1 1
5 0.8660 +
3 0.8185 +
DEGREES OF FREEDOM = 6 2 0.8485 +
2 1.0583 +
1 1.0149 +
1 1.0149 +
SOLUTION MATRIX 1 1.0149 +
DIVISOR = 10 3 1.5067 + +
4 1.8655 + + +
OBSERVATIONS 5 3 2 2 1 1 1 1 1.0149 +



2.3.4.1.12. Design for 5,3,2,2,1,1,1 2.3.4.1.13. Design for 5,4,4,3,2,2,1,1

Explanation of notation and interpretation of tables

2. Measurement Process Characterization


2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.13. Design for 5,4,4,3,2,2,1,1

OBSERVATIONS 5 4 4 3 2 2 1 1

Y(1) + + - - - - -
Y(2) + + - - - - -
Y(3) + - -
Y(4) + - -
Y(5) + - -
Y(6) + - -
Y(7) + - - -
Y(8) + - - -
Y(9) + - -
Y(10) + - -
Y(11) + - -
Y(12) + - -

RESTRAINT + +

CHECK STANDARD + -

DEGREES OF FREEDOM = 5

SOLUTION MATRIX
DIVISOR = 916

OBSERVATIONS 5 4 4 3 2 2 1 1

Y(1) 232 325 123 8 -37 135 -1 1


Y(2) 384 151 401 108 73 105 101 -101



2.3.4.1.13. Design for 5,4,4,3,2,2,1,1 2.3.4.1.13. Design for 5,4,4,3,2,2,1,1

Y(3) 432 84 308 236 168 204 -144 144 20 6.2893 + + + + + +


Y(4) 608 220 196 400 440 -120 408 -408 0 1.4226 + -
Y(5) 280 258 30 136 58 234 -246 246
Y(6) 24 -148 68 64 -296 164 -8 8
Y(7) -104 -122 -142 28 214 -558 -118 118 Explanation of notation and interpretation of tables
Y(8) -512 -354 -382 -144 -250 -598 18 -18
Y(9) 76 -87 139 -408 55 443 51 -51
Y(10) -128 26 -210 -36 -406 194 -110 110
Y(11) -76 87 -139 -508 -55 473 -51 51
Y(12) -300 -440 -392 116 36 -676 100 -100
R* 1224 696 720 516 476 120 508 408

R* = sum of the two reference standards (for going-up calibrations)

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


WT FACTOR
5 4 4 3 2 2 1 1
5 1.2095 +
4 0.8610 +
4 0.9246 +
3 0.9204 +
2 0.8456 +
2 1.4444 +
1 0.5975 +
1 0.5975 +
4 1.5818 + +
7 1.7620 + + +
11 2.5981 + + + +
15 3.3153 + + + + +
20 4.4809 + + + + + +
0 1.1950 + -

FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS


WT FACTOR
5 4 4 3 2 2 1 1
5 2.1380 +
4 1.4679 +
4 1.4952 +
3 1.2785 +
2 1.2410 +
2 1.0170 +
1 0.7113 +
1 0.7113 +
4 1.6872 + +
7 2.4387 + + +
11 3.4641 + + + +
15 4.4981 + + + + +



2.3.4.1.14. Design for 5,5,2,2,1,1,1,1 2.3.4.1.14. Design for 5,5,2,2,1,1,1,1

Y(2) -30 30 -12 -12 -10 -22 -2 10


Y(3) 30 -30 -12 -12 10 -2 -22 -10
Y(4) -30 30 -12 -12 -2 10 -10 -22
Y(5) 0 0 6 6 -12 -12 -12 -12
Y(6) -30 30 33 -27 -36 24 -36 24
2. Measurement Process Characterization
Y(7) 30 -30 33 -27 24 -36 24 -36
2.3. Calibration
2.3.4. Catalog of calibration designs
Y(8) 0 0 -27 33 -18 6 6 -18
2.3.4.1. Mass weights Y(9) 0 0 -27 33 6 -18 -18 6
Y(10) 0 0 0 0 32 8 -32 -8
Y(11) 0 0 0 0 8 32 -8 -32
2.3.4.1.14. Design for 5,5,2,2,1,1,1,1 R* 60 60 24 24 12 12 12 12

R* = sum of the two reference standards


Design 5,5,2,2,1,1,1,1
FACTORS FOR COMPUTING REPEATABILITY STANDARD DEVIATIONS
WT FACTOR
OBSERVATIONS 5 5 2 2 1 1 1 1 5 5 2 2 1 1 1 1
5 0.6124 +
5 0.6124 +
Y(1) + - - - 2 0.5431 +
Y(2) + - - - 2 0.5431 +
Y(3) + - - - 1 0.5370 +
Y(4) + - - - 1 0.5370 +
Y(5) + + - - - - 1 0.5370 +
Y(6) + - - 1 0.5370 +
Y(7) + - - 2 0.6733 + +
Y(8) + - - 4 0.8879 + + +
Y(9) + - - 6 0.8446 + + + +
Y(10) + - 11 1.0432 + + + + +
Y(11) + - 16 0.8446 + + + + + +
1 0.5370 +
RESTRAINT + + FACTORS FOR COMPUTING LEVEL-2 STANDARD DEVIATIONS
WT FACTOR
CHECK STANDARD + 5 5 2 2 1 1 1 1
5 0.7071 +
DEGREES OF FREEDOM = 4 5 0.7071 +
2 1.0392 +
2 1.0392 +
1 1.0100 +
SOLUTION MATRIX 1 1.0100 +
DIVISOR = 120 1 1.0100 +
1 1.0100 +
OBSERVATIONS 5 5 2 2 1 1 1 1 2 1.4422 + +
4 1.8221 + + +
6 2.1726 + + + +
Y(1) 30 -30 -12 -12 -22 -10 10 -2 11 2.2847 + + + + +



2.3.4.1.14. Design for 5,5,2,2,1,1,1,1 2.3.4.1.15. Design for 5,5,3,2,1,1,1

16 2.1726 + + + + + +
1 1.0100 +

Explanation of notation and interpretation of tables


2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights

2.3.4.1.15. Design for 5,5,3,2,1,1,1

OBSERVATIONS 5 5 3 2 1 1 1

Y(1) + - -
Y(2) + - -
Y(3) + - - - -
Y(4) + - - - -
Y(5) + - - -
Y(6) + - - -
Y(7) + - - -
Y(8) + - - -
Y(9) + - - -
Y(10) + - - -

RESTRAINT + +

CHECK STANDARD +

DEGREES OF FREEDOM = 4

SOLUTION MATRIX
DIVISOR = 10

OBSERVATIONS 5 5 3 2 1 1 1

Y(1) 1 -1 -2 -3 1 1 1
Y(2) -1 1 -2 -3 1 1 1
Y(3) 1 -1 2 -2 -1 -1 -1
Y(4) -1 1 2 -2 -1 -1 -1
Y(5) 1 -1 -1 1 -2 -2 3
Y(6) 1 -1 -1 1 -2 3 -2
Y(7) 1 -1 -1 1 3 -2 -2
Y(8) -1 1 -1 1 -2 -2 3
Y(9) -1 1 -1 1 -2 3 -2
Y(10) -1 1 -1 1 3 -2 -2



2.3.4.1.15. Design for 5,5,3,2,1,1,1 2.3.4.1.16. Design for 1,1,1,1,1,1,1,1 weights

R* 5 5 3 2 1 1 1

R* = sum of the two reference standards

2. Measurement Process Characterization


FACTORS FOR REPEATABILITY STANDARD DEVIATIONS 2.3. Calibration
WT FACTOR 2.3.4. Catalog of calibration designs
5 5 3 2 1 1 1 2.3.4.1. Mass weights
5 0.3162 +
5 0.3162 +
3 0.4690
2 0.5657
+
+ 2.3.4.1.16. Design for 1,1,1,1,1,1,1,1 weights
1 0.6164 +
1 0.6164 +
1 0.6164 + OBSERVATIONS 1 1 1 1 1 1 1 1
3 0.7874 + +
6 0.8246 + + +
11 0.8832 + + + +
16 0.8246 + + + + + Y(1) + -
1 0.6164 + Y(2) + -
Y(3) + -
Y(4) + -
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS Y(5) + -
WT FACTOR Y(6) + -
5 5 3 2 1 1 1
5 0.7071 +
Y(7) + -
5 0.7071 + Y(8) + -
3 1.0863 + Y(9) + -
2 1.0392 + Y(10) + -
1 1.0100 + Y(11) + -
1 1.0100 + Y(12) + -
1 1.0100 +
3 1.4765 + +
6 1.9287 + + +
11 2.0543 + + + +
RESTRAINT + +
16 1.9287 + + + + +
1 1.0100 + CHECK STANDARD +

Explanation of notation and interpretation of tables


DEGREES OF FREEDOM = 5

SOLUTION MATRIX
DIVISOR = 12

OBSERVATIONS 1 1 1 1 1 1 1 1

Y(1) 1 -1 -6 0 0 0 0 0
Y(2) 1 -1 0 -6 0 0 0 0
Y(3) 1 -1 0 0 -6 0 0 0



2.3.4.1.16. Design for 1,1,1,1,1,1,1,1 weights 2.3.4.1.16. Design for 1,1,1,1,1,1,1,1 weights

Y(4) 1 -1 0 0 0 -6 0 0 6 4.8990 + + + + + +
Y(5) 1 -1 0 0 0 0 -6 0 1 1.2247 +
Y(6) 1 -1 0 0 0 0 0 -6
Y(7) -1 1 -6 0 0 0 0 0
Y(8) -1 1 0 -6 0 0 0 0 Explanation of notation and interpretation of tables
Y(9) -1 1 0 0 -6 0 0 0
Y(10) -1 1 0 0 0 -6 0 0
Y(11) -1 1 0 0 0 0 -6 0
Y(12) -1 1 0 0 0 0 0 -6
R* 6 6 6 6 6 6 6 6

R* = sum of the two reference standards

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

WT K1 1 1 1 1 1 1 1 1
1 0.2887 +
1 0.2887 +
1 0.7071 +
1 0.7071 +
1 0.7071 +
1 0.7071 +
1 0.7071 +
1 0.7071 +
2 1.0000 + +
3 1.2247 + + +
4 1.4142 + + + +
5 1.5811 + + + + +
6 1.7321 + + + + + +
1 0.7071 +

FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS

WT K2 1 1 1 1 1 1 1 1
1 0.7071 +
1 0.7071 +
1 1.2247 +
1 1.2247 +
1 1.2247 +
1 1.2247 +
1 1.2247 +
1 1.2247 +
2 2.0000 + +
3 2.7386 + + +
4 3.4641 + + + +
5 4.1833 + + + + +



2.3.4.1.17. Design for 3,2,1,1,1 weights 2.3.4.1.17. Design for 3,2,1,1,1 weights
R* = sum of the two reference standards

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs
WT K1 3 2 1 1 1
2.3.4.1. Mass weights 3 0.2530 +
2 0.2530 +
1 0.4195 +
2.3.4.1.17. Design for 3,2,1,1,1 weights 1 0.4195 +
1 0.4195 +
2 0.5514 + +
3 0.6197 + + +
1 0.4195 +
OBSERVATIONS 3 2 1 1 1
FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
Y(1) + - - WT K2 3 2 1 1 1
Y(2) + - - 3 0.7211 +
Y(3) + - - 2 0.7211 +
Y(4) + - - - 1 1.0392 +
Y(5) + - - 1 1.0392 +
Y(6) + - - 1 1.0392 +
Y(7) + - - 2 1.5232 + +
Y(8) + - 3 1.9287 + + +
Y(9) + - 1 1.0392 +
Y(10) + -

Explanation of notation and interpretation of tables


RESTRAINT + +

CHECK STANDARD +

DEGREES OF FREEDOM = 6

SOLUTION MATRIX
DIVISOR = 25

OBSERVATIONS 3 2 1 1 1

Y(1) 3 -3 -4 1 1
Y(2) 3 -3 1 -4 1
Y(3) 3 -3 1 1 -4
Y(4) 1 -1 -3 -3 -3
Y(5) -2 2 -4 -4 1
Y(6) -2 2 -4 1 -4
Y(7) -2 2 1 -4 -4
Y(8) 0 0 5 -5 0
Y(9) 0 0 5 0 -5
Y(10) 0 0 0 5 -5
R* 15 10 5 5 5



2.3.4.1.18. Design for 10-and 20-pound weights 2.3.4.1.18. Design for 10-and 20-pound weights

Y(3) 0 -9 -3 -4 4
Y(4) 0 -3 -9 4 -4
Y(5) 0 -9 -3 4 -4
Y(6) 0 -3 -9 -4 4
Y(7) 0 6 -6 0 0
2. Measurement Process Characterization
R* 24 48 48 24 24
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.1. Mass weights R* = Value of the reference standard

2.3.4.1.18. Design for 10-and 20-pound FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

weights WT K1 1 2 2 1 1
2 0.9354 +
2 0.9354 +
1 0.8165 +
OBSERVATIONS 1 2 2 1 1 1 0.8165 +
4 1.7321 + +
5 2.3805 + + +
Y(1) + - 6 3.0000 + + + +
Y(2) + - 1 0.8165 +
Y(3) + - +
Y(4) + - + FACTORS FOR BETWEEN-DAY STANDARD DEVIATIONS
Y(5) + - +
Y(6) + - + WT K2 1 2 2 1 1
Y(7) + - 2 2.2361 +
2 2.2361 +
1 1.4142 +
RESTRAINT + 1 1.4142 +
4 4.2426 + +
5 5.2915 + + +
CHECK STANDARD + 6 6.3246 + + + +
1 1.4142 +
DEGREES OF FREEDOM = 3
Explanation of notation and interpretation of tables

SOLUTION MATRIX
DIVISOR = 24

OBSERVATIONS 1 2 2 1 1

Y(1) 0 -12 -12 -16 -8


Y(2) 0 -12 -12 -8 -16



2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs

2.3.4.2. Drift-elimination designs for gauge blocks

Tie to the defined unit of length

The unit of length in many industries is maintained and disseminated by gauge blocks. The highest accuracy calibrations of gauge blocks are done by laser interferometry which allows the transfer of the unit of length to a gauge piece. Primary standards laboratories maintain master sets of English gauge blocks and metric gauge blocks which are calibrated in this manner. Gauge blocks ranging in sizes from 0.1 to 20 inches are required to support industrial processes in the United States.

Mechanical comparison of gauge blocks

However, the majority of gauge blocks are calibrated by comparison with master gauges using a mechanical comparator specifically designed for measuring the small difference between two blocks of the same nominal length. The measurements are temperature corrected from readings taken directly on the surfaces of the blocks. Measurements on 2 to 20 inch blocks require special handling techniques to minimize thermal effects. A typical calibration involves a set of 81 gauge blocks which are compared one-by-one with master gauges of the same nominal size.

Calibration designs for gauge blocks

Calibration designs allow comparison of several gauge blocks of the same nominal size to one master gauge in a manner that promotes economy of operation and minimizes wear on the master gauge. The calibration design is repeated for each size until measurements on all the blocks in the test sets are completed.

Problem of thermal drift

Measurements on gauge blocks are subject to drift from heat build-up in the comparator. This drift must be accounted for in the calibration experiment or the lengths assigned to the blocks will be contaminated by the drift term.

Elimination of linear drift

The designs in this catalog are constructed so that the solutions are immune to linear drift if the measurements are equally spaced over time. The size of the drift is the average of the n difference measurements. Keeping track of drift from design to design is useful because a marked change from its usual range of values may indicate a problem with the measurement system.

Assumption for Doiron designs

Mechanical measurements on gauge blocks take place successively with one block being inserted into the comparator followed by a second block and so on. This scenario leads to the assumption that the individual measurements are subject to drift (Doiron). Doiron lists designs meeting this criterion which also allow for:
  - two master blocks, R1 and R2
  - one check standard = difference between R1 and R2
  - one - nine test blocks

Properties of drift-elimination designs that use 1 master block

The designs are constructed to:
  - Be immune to linear drift
  - Minimize the standard deviations for test blocks (as much as possible)
  - Spread the measurements on each block throughout the design
  - Be completed in 5-10 minutes to keep the drift at the 5 nm level

Caution

Because of the large number of gauge blocks that are being intercompared and the need to eliminate drift, the Doiron designs are not completely balanced with respect to the test blocks. Therefore, the standard deviations are not equal for all blocks. If all the blocks are being calibrated for use in one facility, it is easiest to quote the largest of the standard deviations for all blocks rather than try to maintain a separate record on each block.
Definition of master block and check standard

At the National Institute of Standards and Technology (NIST), the first two blocks in the design are NIST masters which are designated R1 and R2, respectively. The R1 block is a steel block, and the R2 block is a chrome-carbide block. If the test blocks are steel, the reference is R1; if the test blocks are chrome-carbide, the reference is R2. The check standard is always the difference between R1 and R2 as estimated from the design and is independent of R1 and R2. The designs are listed in this section of the catalog as:
  1. Doiron design for 3 gauge blocks - 6 measurements
  2. Doiron design for 3 gauge blocks - 9 measurements
  3. Doiron design for 4 gauge blocks - 8 measurements
  4. Doiron design for 4 gauge blocks - 12 measurements
  5. Doiron design for 5 gauge blocks - 10 measurements
  6. Doiron design for 6 gauge blocks - 12 measurements
  7. Doiron design for 7 gauge blocks - 14 measurements
  8. Doiron design for 8 gauge blocks - 16 measurements
  9. Doiron design for 9 gauge blocks - 18 measurements
  10. Doiron design for 10 gauge blocks - 20 measurements
  11. Doiron design for 11 gauge blocks - 22 measurements

Properties of designs that use 2 master blocks

Historical designs for gauge blocks (Cameron and Hailes) work on the assumption that the difference measurements are contaminated by linear drift. This assumption is more restrictive and covers the case of drift in successive measurements but produces fewer designs. The Cameron/Hailes designs meeting this criterion allow for:
  - two reference (master) blocks, R1 and R2
  - check standard = difference between the two master blocks
and assign equal uncertainties to values of all test blocks.

The designs are listed in this section of the catalog as:
  1. Cameron-Hailes design for 2 masters + 2 test blocks
  2. Cameron-Hailes design for 2 masters + 3 test blocks
  3. Cameron-Hailes design for 2 masters + 4 test blocks
  4. Cameron-Hailes design for 2 masters + 5 test blocks

Important concept - check standard

The check standards for the designs in this section are not artifact standards but constructions from the design. The value of one master block or the average of two master blocks is the restraint for the design, and values for the masters, R1 and R2, are estimated from a set of measurements taken according to the design. The check standard value is the difference between the estimates, R1 and R2. Measurement control is exercised by comparing the current value of the check standard with its historical average.


2.3.4.2.1. Doiron 3-6 Design

Doiron 3-6 design

OBSERVATIONS 1 1 1

Y(1) + -
Y(2) - +
Y(3) + -
Y(4) - +
Y(5) - +
Y(6) + -

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 4

SOLUTION MATRIX
DIVISOR = 6

OBSERVATIONS 1 1 1

Y(1) 0 -2 -1
Y(2) 0 1 2
Y(3) 0 1 -1
Y(4) 0 2 1
Y(5) 0 -1 1
Y(6) 0 -1 -2
R* 6 6 6

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1
1 0.0000 +
1 0.5774 +
1 0.5774 +
1 0.5774 +

Explanation of notation and interpretation of tables
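The following sketch illustrates how the tables above might be applied. Following the R* row of the solution
matrix, each block value is taken as the restraint (the assigned value of the reference block) plus the dot product
of the difference measurements Y with that block's column, divided by the divisor; the difference between the
first two estimates serves as the check standard, and the tabulated factors scale the repeatability standard
deviation. This is an illustrative sketch only, and all numerical inputs are hypothetical.

import numpy as np

divisor = 6.0
solution = np.array([      # rows Y(1)..Y(6); columns block 1, block 2, block 3
    [0, -2, -1],
    [0,  1,  2],
    [0,  1, -1],
    [0,  2,  1],
    [0, -1,  1],
    [0, -1, -2],
], dtype=float)

restraint = 10.000123      # hypothetical assigned value of the reference block
y = np.array([0.21, -0.18, 0.02, 0.20, -0.01, -0.19])   # hypothetical difference measurements

estimates = restraint + solution.T @ y / divisor
check_standard = estimates[0] - estimates[1]   # difference between the first two blocks

# Repeatability standard deviation of a test block: tabulated factor times the
# design's repeatability standard deviation s1 (both values hypothetical here).
s1 = 0.010
s_test = 0.5774 * s1
print(estimates, check_standard, s_test)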


2.3.4.2.2. Doiron 3-9 Design

Doiron 3-9 Design

OBSERVATIONS 1 1 1

Y(1) + -
Y(2) - +
Y(3) + -
Y(4) - +
Y(5) - +
Y(6) + -
Y(7) - +
Y(8) - +
Y(9) + -

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 7

SOLUTION MATRIX
DIVISOR = 9

OBSERVATIONS 1 1 1

Y(1) 0 -2 -1
Y(2) 0 -1 1
Y(3) 0 -1 -2
Y(4) 0 2 1
Y(5) 0 1 2
Y(6) 0 1 -1
Y(7) 0 2 1
Y(8) 0 -1 1
Y(9) 0 -1 -2
R(1) 9 9 9

FACTORS FOR COMPUTING REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1
1 0.0000 +
1 0.4714 +
1 0.4714 +
1 0.4714 +

Explanation of notation and interpretation of tables




2.3.4.2.3. Doiron 4-8 Design

Doiron 4-8 Design

OBSERVATIONS 1 1 1 1

Y(1) + -
Y(2) + -
Y(3) - +
Y(4) + -
Y(5) - +
Y(6) - +
Y(7) + -
Y(8) - +

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 5

SOLUTION MATRIX
DIVISOR = 8

OBSERVATIONS 1 1 1 1

Y(1) 0 -3 -2 -1
Y(2) 0 1 2 -1
Y(3) 0 1 2 3
Y(4) 0 1 -2 -1
Y(5) 0 3 2 1
Y(6) 0 -1 -2 1
Y(7) 0 -1 -2 -3
Y(8) 0 -1 2 1
R* 8 8 8 8

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1 1
1 0.0000 +
1 0.6124 +
1 0.7071 +
1 0.6124 +
1 0.6124 +

Explanation of notation and interpretation of tables



2.3.4.2.4. Doiron 4-12 Design
Doiron 4-12 Design

OBSERVATIONS 1 1 1 1


Y(1) + -
Y(2) + +
Y(3) + -
Y(4) - +
Y(5) + -
Y(6) - +
Y(7) + -
Y(8) + -
Y(9) + -
Y(10) - +
Y(11) - +
Y(12) - +

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 9

SOLUTION MATRIX
DIVISOR = 8

OBSERVATIONS 1 1 1 1

Y(1) 0 -2 -1 -1
Y(2) 0 1 1 2
Y(3) 0 0 1 -1
Y(4) 0 2 1 1
Y(5) 0 1 -1 0
Y(6) 0 -1 0 1
Y(7) 0 -1 -2 -1
Y(8) 0 1 0 -1
Y(9) 0 -1 -1 -2
Y(10) 0 -1 1 0
Y(11) 0 1 2 1
Y(12) 0 0 -1 1
R* 6 6 6 4

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1 1
1 0.0000 +
1 0.5000 +
1 0.5000 +
1 0.5000 +
1 0.5000 +

Explanation of notation and interpretation of tables


2.3.4.2.5. Doiron 5-10 Design
Doiron 5-10 Design

OBSERVATIONS 1 1 1 1 1

Y(1) + -
Y(2) - +
Y(3) + -
Y(4) - +
Y(5) - +
Y(6) + -
Y(7) - +
Y(8) + -
Y(9) - +
Y(10) + -

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 6

SOLUTION MATRIX
DIVISOR = 90

OBSERVATIONS 1 1 1 1 1

Y(1) 0 -50 -10 -10 -30


Y(2) 0 20 4 -14 30
Y(3) 0 -10 -29 -11 -15
Y(4) 0 -20 5 5 15
Y(5) 0 0 -18 18 0
Y(6) 0 -10 -11 -29 -15
Y(7) 0 10 29 11 15
Y(8) 0 -20 14 -4 -30
Y(9) 0 10 11 29 15
Y(10) 0 20 -5 -5 -15
R* 90 90 90 90 90

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1 1 1
1 0.0000 +
1 0.7454 +
1 0.5676 +
1 0.5676 +
1 0.7071 +
1 0.7454 +

Explanation of notation and interpretation of tables


2.3.4.2.6. Doiron 6-12 Design

Doiron 6-12 Design

OBSERVATIONS 1 1 1 1 1 1
Y(1) + -
Y(2) - +
Y(3) - +
Y(4) - +
Y(5) - +
Y(6) + -
Y(7) + -
Y(8) + -
Y(9) + -
Y(10) - +
Y(11) + -
Y(12) - +

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 7

SOLUTION MATRIX
DIVISOR = 360

OBSERVATIONS 1 1 1 1 1 1

Y(1) 0 -136 -96 -76 -72 -76


Y(2) 0 -4 -24 -79 72 11
Y(3) 0 -20 -120 -35 0 55
Y(4) 0 4 24 -11 -72 79
Y(5) 0 -60 0 75 0 -15
Y(6) 0 20 120 -55 0 35
Y(7) 0 -76 -96 -61 -72 -151
Y(8) 0 64 24 4 -72 4
Y(9) 0 40 -120 -20 0 -20
Y(10) 0 72 72 72 144 72
Y(11) 0 60 0 15 0 -75
Y(12) 0 76 96 151 72 61
R* 360 360 360 360 360 360

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1 1 1 1
1 0.0000 +
1 0.6146 +
1 0.7746 +
1 0.6476 +
1 0.6325 +
1 0.6476 +
1 0.6146 +

Explanation of notation and interpretation of tables


2.3.4.2.7. Doiron 7-14 Design

Doiron 7-14 Design

OBSERVATIONS 1 1 1 1 1 1 1

Y(1) + -
Y(2) - +
Y(3) + -
Y(4) + -
Y(5) + -
Y(6) - +
Y(7) + -
Y(8) + -
Y(9) + -
Y(10) - +
Y(11) - +
Y(12) - +
Y(13) - +
Y(14) - +

RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 8

PARAMETER VALUES
DIVISOR = 1015

OBSERVATIONS 1 1 1 1 1 1 1

Y(1) 0 -406 -203 -203 -203 -203 -203


Y(2) 0 0 -35 -210 35 210 0
Y(3) 0 0 175 35 -175 -35 0
Y(4) 0 203 -116 29 -116 29 -261
Y(5) 0 -203 -229 -214 -264 -424 -174
Y(6) 0 0 -175 -35 175 35 0
Y(7) 0 203 -61 -221 -26 -11 29
Y(8) 0 0 305 90 130 55 -145
Y(9) 0 0 220 15 360 -160 145
Y(10) 0 203 319 174 319 174 464
Y(11) 0 -203 26 11 61 221 -29
Y(12) 0 0 -360 160 -220 -15 -145
Y(13) 0 203 264 424 229 214 174
Y(14) 0 0 -130 -55 -305 -90 145
R* 1015 1015 1015 1015 1015 1015 1015

R* = Value of the reference standard

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

NOM FACTOR
    1 1 1 1 1 1 1
1 0.0000 +
1 0.6325 +
1 0.7841 +
1 0.6463 +
1 0.7841 +
1 0.6463 +
1 0.6761 +
1 0.6325 +

Explanation of notation and interpretation of tables


2.3.4.2.8. Doiron 8-16 Design

Doiron 8-16 Design

Y(1) 0 -1392 -620 -472 -516 -976 -824 -916


Y(2) 0 60 248 -78 96 878 -112 -526
Y(3) 0 352 124 -315 278 255 864 289
Y(4) 0 516 992 470 1396 706 748 610
Y(5) 0 -356 620 35 286 -979 -96 -349
Y(6) 0 92 0 23 -138 253 -552 667
Y(7) 0 -148 -992 335 -522 -407 -104 -81
Y(8) 0 -416 372 113 190 995 16 177
Y(9) 0 308 -248 170 -648 134 756 342
Y(10) 0 472 620 955 470 585 640 663
Y(11) 0 476 -124 -191 -94 -117 -128 -703
Y(12) 0 -104 -620 -150 404 -286 4 -134
Y(13) 0 472 620 955 470 585 640 663
Y(14) 0 444 124 -292 140 508 312 956
Y(15) 0 104 620 150 -404 286 -4 134
Y(16) 0 568 -124 -168 -232 136 -680 -36
R* 2852 2852 2852 2852 2852 2852 2852 2852
OBSERVATIONS 1 1 1 1 1 1 1 1
R* = value of reference block
Y(1) + -
Y(2) + -
Y(3) - +
FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
Y(4) - +
WT FACTOR
Y(5) + -
1 1 1 1 1 1 1 1
Y(6) - +
1 0.0000 +
Y(7) - +
1 0.6986 +
Y(8) - +
1 0.7518 +
Y(9) - +
1 0.5787 +
Y(10) - +
1 0.6996 +
Y(11) + -
1 0.8313 +
Y(12) - +
1 0.7262 +
Y(13) - +
1 0.7534 +
Y(14) - +
1 0.6986 +
Y(15) + -
Y(16) + -

RESTRAINT +
Explanation of notation and interpretation of tables

CHECK STANDARD +

DEGREES OF FREEDOM = 9

SOLUTION MATRIX
DIVISOR = 2852

OBSERVATIONS 1 1 1 1 1 1 1 1



2.3.4.2.9. Doiron 9-18 Design

Doiron 9-18 Design

Y(3) 0 1375 -3139 196 -491 -1279 -1266 -894 -540


Y(4) 0 -909 -222 -1707 1962 -432 675 633 327
Y(5) 0 619 1004 736 -329 2771 -378 -1674 -513
Y(6) 0 -1596 -417 1140 342 303 42 186 57
Y(7) 0 955 2828 496 -401 971 -1689 -411 -525
Y(8) 0 612 966 741 1047 1434 852 2595 -1200
Y(9) 0 1175 1666 1517 3479 1756 2067 2085 1038
Y(10) 0 199 -1276 1036 -239 -3226 -801 -1191 -498
Y(11) 0 654 1194 711 1038 1209 1719 1722 2922
Y(12) 0 91 494 -65 -1394 887 504 2232 684
Y(13) 0 2084 1888 3224 1517 2188 1392 1452 711
Y(14) 0 1596 417 -1140 -342 -303 -42 -186 -57
Y(15) 0 175 950 -125 -1412 437 2238 486 681
Y(16) 0 -654 -1194 -711 -1038 -1209 -1719 -1722 -2922
Y(17) 0 -420 -2280 300 90 2250 -423 483 15
Y(18) 0 84 456 -60 -18 -450 1734 -1746 -3
R* 8247 8247 8247 8247 8247 8247 8247 8247 8247
OBSERVATIONS 1 1 1 1 1 1 1 1 1
R* = Value of the reference standard
Y(1) + -
Y(2) - +
Y(3) + - FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
Y(4) - + NOM FACTOR
Y(5) + - 1 1 1 1 1 1 1 1 1
Y(6) - + 1 0.0000 +
Y(7) + - 1 0.6680 +
Y(8) + - 1 0.8125 +
Y(9) - + 1 0.6252 +
Y(10) + - 1 0.6495 +
Y(11) - + 1 0.8102 +
Y(12) - + 1 0.7225 +
Y(13) - + 1 0.7235 +
Y(14) + - 1 0.5952 +
Y(15) - + 1 0.6680 +
Y(16) + -
Y(17) - +
Y(18) + - Explanation of notation and interpretation of tables
RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 10

SOLUTION MATRIX
DIVISOR = 8247

OBSERVATIONS 1 1 1 1 1 1 1 1 1

Y(1) 0 -3680 -2305 -2084 -1175 -1885 -1350 -1266 -654


Y(2) 0 -696 -1422 -681 -1029 -984 -2586 -849 1203



2.3.4.2.10. Doiron 10-20 Design

Doiron 10-20 Design
Y(4) 0 -3600 -1536 816 5856 -9120 -1632 -1728 -3744
Y(5) 0 6060 306 -1596 -906 -1050 -978 -2262 -8376
Y(6) 0 2490 8207 -8682 -1187 1165 2769 2891 588
Y(7) 0 -2730 809 -1494 -869 -2885 903 6557 -8844
Y(8) 0 5580 7218 11412 6102 6630 6366 5514 8472
Y(9) 0 1800 -2012 -408 -148 7340 -7524 -1916 1872
Y(10) 0 3660 1506 -3276 774 3990 2382 3258 9144
Y(11) 0 -1800 -3548 408 5708 -1780 -9156 -3644 -1872
Y(12) 0 6270 -9251 -3534 -1609 455 -3357 -3023 516
Y(13) 0 960 2856 7344 2664 1320 1992 1128 -336
Y(14) 0 -330 -391 186 -2549 -7925 -2457 1037 6996
Y(15) 0 2520 8748 3432 1572 1380 1476 -5796 -48
Y(16) 0 -5970 -7579 -8766 -15281 -9425 -9573 -6007 -6876
Y(17) 0 -1260 -7154 -1716 1994 2090 7602 118 24
Y(18) 0 570 2495 9990 -6515 -1475 -1215 635 1260
Y(19) 0 6510 9533 6642 6007 7735 9651 15329 8772
Y(20) 0 -5730 85 1410 3455 8975 3435 1225 1380
OBSERVATIONS 1 1 1 1 1 1 1 1 1 1 R* 33360 33360 33360 33360 33360 33360 33360 33360 33360
Y(1) + - R* = Value of the reference standard
Y(2) + -
Y(3) - +
Y(4) + -
Y(5) + - FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
Y(6) + - NOM FACTOR
Y(7) + - 1 1 1 1 1 1 1 1 1 1
Y(8) - + 1 0.0000 +
Y(9) + - 1 0.6772 +
Y(10) + - 1 0.7403 +
Y(11) + - 1 0.7498 +
Y(12) + - 1 0.6768 +
Y(13) + - 1 0.7456 +
Y(14) - + 1 0.7493 +
Y(15) + - 1 0.6779 +
Y(16) + - 1 0.7267 +
Y(17) - + 1 0.6961 +
Y(18) + - 1 0.6772 +
Y(19) - +
Y(20) - +
Explanation of notation and interpretation of tables
RESTRAINT +

CHECK STANDARD +

DEGREES OF FREEDOM = 11

SOLUTION MATRIX
DIVISOR = 33360

OBSERVATIONS 1 1 1 1 1 1 1 1 1

Y(1) 0 -15300 -9030 -6540 -5970 -9570 -7770 -6510 -9240


Y(2) 0 1260 1594 1716 3566 3470 9078 -5678 -24
Y(3) 0 -960 -2856 -7344 -2664 -1320 -1992 -1128 336



2.3.4.2.11. Doiron 11-22 Design

Doiron 11-22 Design
Y(3) 0 5082 4446 3293 4712 160 5882 15395 3527 -9954
487
Y(4) 0 -968 -1935 10496 2246 -635 -4143 -877 -13125 -643
-1060
Y(5) 0 8360 -18373 -8476 -3240 -3287 -8075 -1197 -9443 -1833
-2848
Y(6) 0 -6908 -7923 -9807 -2668 431 -4753 -1296 -10224 9145
Y(7) 0 1716 3084 6091 404 -2452 -10544 -2023 15073 332
5803
Y(8) 0 9944 13184 15896 24476 11832 13246 14318 13650 9606
12274
Y(9) 0 2860 12757 -11853 -2712 145 3585 860 578 -293
-2177
Y(10) 0 -8778 -12065 -11920 -11832 -23589 -15007 -11819 -12555 -11659
-11228
OBSERVATIONS 1 1 1 1 1 1 1 1 1 1 1 Y(11) 0 11286 1729 -271 -4374 -3041 -3919 -14184 -180 -3871
1741
Y(1) + - Y(12) 0 -3608 -13906 -4734 62 2942 11102 2040 -2526 604
Y(2) + - -2566
Y(3) + - Y(13) 0 -6006 -10794 -7354 -1414 8582 -18954 -6884 -10862 -1162
Y(4) + - -6346
Y(5) + - Y(14) 0 -9460 1748 6785 2330 2450 2790 85 6877 4680
Y(6) + - 16185
Y(7) - + Y(15) 0 5588 10824 19965 -8580 88 6028 1485 11715 2904
Y(8) - + 10043
Y(9) + - Y(16) 0 -792 5803 3048 1376 1327 5843 1129 15113 -1911
Y(10) + - -10100
Y(11) + - Y(17) 0 -682 6196 3471 -1072 3188 15258 -10947 6737 -1434
Y(12) - + 2023
Y(13) + - Y(18) 0 10384 12217 12510 9606 11659 12821 14255 13153 24209
Y(14) - + 15064
Y(15) + - Y(19) 0 1892 10822 -1357 -466 -490 -558 -17 -12547 -936
Y(16) + - -3237
Y(17) + - Y(20) 0 5522 3479 -93 -10158 -13 5457 15332 3030 4649
Y(18) - + 3277
Y(19) + - Y(21) 0 1760 -3868 -13544 -3622 -692 -1700 -252 -1988 2554
Y(20) - + 11160
Y(21) - + Y(22) 0 -1606 -152 -590 2226 11930 2186 -2436 -598 -12550
Y(22) + - -3836
R* 55858 55858 55858 55858 55858 55858 55858 55858 55858 55858
RESTRAINT + 55858

R* = Value of the reference standard


CHECK STANDARD +
FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
NOM FACTOR
DEGREES OF FREEDOM = 12 1 1 1 1 1 1 1 1 1 1 1
1 0.0000 +
1 0.6920 +
SOLUTION MATRIX 1 0.8113 +
DIVISOR = 55858 1 0.8013 +
1 0.6620 +
OBSERVATIONS 1 1 1 1 1 1 1 1 1 1 1 0.6498 +
1 0.7797 +
Y(1) 0 -26752 -18392 -15532 -9944 -8778 -14784 -15466 -16500 -10384 1 0.7286 +
-17292 1 0.8301 +
Y(2) 0 1166 1119 3976 12644 -11757 -1761 2499 1095 -2053 1 0.6583 +
1046 1 0.6920 +




Explanation of notation and interpretation of tables


2.3.4.3. Designs for electrical quantities


Standard cells
Banks of saturated standard cells that are nominally one volt are the basis for maintaining the unit of voltage in
many laboratories.

Bias problem
It has been observed that potentiometer measurements of the difference between two saturated standard cells,
connected in series opposition, are affected by a thermal emf which remains constant even when the direction
of the circuit is reversed.

Designs for eliminating bias
A calibration design for comparing standard cells can be constructed to be left-right balanced so that:
   ● A constant bias, P, does not contaminate the estimates for the individual cells.
   ● P is estimated as the average of difference measurements (see the sketch below).
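The toy sketch below (hypothetical numbers, not the full design solution) illustrates why left-right balance
isolates the bias: reversing the circuit flips the sign of the cell difference but not of the thermal emf, so averaging
a forward and a reversed reading estimates P while half their difference estimates the cell difference.

# Schematic illustration of left-right balance (assumed model, hypothetical values).
# A reading of cell A against cell B is  y = (A - B) + P,
# and with the connection reversed       y' = (B - A) + P.
a_minus_b = 0.25    # true difference between two cells (microvolts), hypothetical
p = 0.04            # constant thermal emf, hypothetical

y_forward = a_minus_b + p
y_reversed = -a_minus_b + p

bias_estimate = (y_forward + y_reversed) / 2.0   # recovers P
diff_estimate = (y_forward - y_reversed) / 2.0   # recovers A - B, free of P
print(bias_estimate, diff_estimate)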


Designs for electrical quantities
Designs are given for the following classes of electrical artifacts. These designs are left-right balanced and may
be appropriate for artifacts other than electrical standards.
   ● Saturated standard reference cells
   ● Saturated standard test cells
   ● Zeners
   ● Resistors



Standard cells in a single box
Left-right balanced designs for comparing standard cells among themselves where the restraint is over all
reference cells are listed below. These designs are not appropriate for assigning values to test cells.

Estimates for individual standard cells and the bias term, P, are shown under the heading, 'SOLUTION
MATRIX'. These designs also have the advantage of requiring a change of connections to only one cell at a time.
1. Design for 3 standard cells
2. Design for 4 standard cells
3. Design for 5 standard cells
4. Design for 6 standard cells

Test cells
Calibration designs for assigning values to test cells in a common environment on the basis of comparisons with
reference cells with known values are shown below. The designs in this catalog are left-right balanced.
1. Design for 4 test cells and 4 reference cells
2. Design for 8 test cells and 8 reference cells

Zeners
Increasingly, zeners are replacing saturated standard cells as artifacts for maintaining and disseminating the volt.
Values are assigned to test zeners, based on a group of reference zeners, using calibration designs.
1. Design for 4 reference zeners and 2 test zeners
2. Design for 4 reference zeners and 3 test zeners

Standard resistors
Designs for comparing standard resistors that are used for maintaining and disseminating the ohm are listed in
this section.
1. Design for 3 reference resistors and 1 test resistor
2. Design for 4 reference resistors and 1 test resistor

2.3.4.3.1. Left-right balanced design for 3 standard cells

Design 1,1,1

CELLS
OBSERVATIONS 1 1 1

Y(1) + -
Y(2) + -
Y(3) + -
Y(4) - +
Y(5) - +
Y(6) - +

RESTRAINT + + +

DEGREES OF FREEDOM = 3

SOLUTION MATRIX
DIVISOR = 6

OBSERVATIONS 1 1 1 P

Y(1) 1 -1 0 1
Y(2) 1 0 -1 1
Y(3) 0 1 -1 1
Y(4) -1 1 0 1
Y(5) -1 0 1 1
Y(6) 0 -1 1 1
R* 2 2 2 0

R* = AVERAGE VALUE OF 3 REFERENCE CELLS

P = LEFT-RIGHT BIAS



FACTORS FOR COMPUTING STANDARD DEVIATIONS

V FACTOR CELLS
    1 1 1
1 0.3333 +
1 0.3333 +
1 0.3333 +

Explanation of notation and interpretation of tables
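A minimal sketch of how the solution matrix above might be applied, assuming the restraint enters so that each
cell estimate equals the average assigned value of the three reference cells plus its correction (as implied by the
R* row). All readings below are hypothetical.

import numpy as np

divisor = 6.0
solution = np.array([           # rows Y(1)..Y(6); columns cell 1, cell 2, cell 3, P
    [ 1, -1,  0, 1],
    [ 1,  0, -1, 1],
    [ 0,  1, -1, 1],
    [-1,  1,  0, 1],
    [-1,  0,  1, 1],
    [ 0, -1,  1, 1],
], dtype=float)

avg_reference = 1.0182056       # average assigned value of the 3 cells (volts), hypothetical
y = np.array([2.1, 3.0, 0.8, -1.9, -2.8, -0.9]) * 1e-6   # hypothetical difference readings (volts)

corrections = solution.T @ y / divisor
cells = avg_reference + corrections[:3]     # estimates for the three cells
left_right_bias = corrections[3]            # estimate of P
print(cells, left_right_bias)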


2.3.4.3.2. Left-right balanced design for 4 standard cells
Design 1,1,1,1

OBSERVATIONS 1 1 1 1
Y(1) + -
Y(2) + -
Y(3) + -
Y(4) + -
Y(5) + -
Y(6) - +
Y(7) - +
Y(8) - +
Y(9) - +
Y(10) - +
Y(11) - +
Y(12) + -

RESTRAINT + + + +

DEGREES OF FREEDOM = 8

SOLUTION MATRIX
DIVISOR = 8
OBSERVATIONS 1 1 1 1 P

Y(1) 1 -1 0 0 1
Y(2) 1 0 -1 0 1
Y(3) 0 1 -1 0 1
Y(4) 0 1 0 -1 1




Y(5) 0 0 1 -1 1
Y(6) -1 0 1 0 1
Y(7) 0 -1 1 0 1
Y(8) 0 -1 0 1 1
Y(9) -1 0 0 1 1
Y(10) 0 0 -1 1 1
Y(11) -1 1 0 0 1
Y(12) 1 0 0 -1 1
R* 2 2 2 2 0

R* = AVERAGE VALUE OF 4 REFERENCE CELLS

P = LEFT-RIGHT BIAS

2.3.4.3.3. Left-right balanced design for 5 standard cells
Design 1,1,1,1,1
FACTORS FOR COMPUTING STANDARD DEVIATIONS
V FACTOR CELLS
1 1 1 1
OBSERVATIONS 1 1 1 1 1
1 0.3062 +
1 0.3062 +
1 0.3062 +
Y(1) + -
1 0.3062 +
Y(2) + -
Y(3) + -
Explanation of notation and interpretation of tables Y(4) + -
Y(5) + -
Y(6) + -
Y(7) + -
Y(8) - +
Y(9) - +
Y(10) - +

RESTRAINT + + + + +

DEGREES OF FREEDOM = 5

SOLUTION MATRIX
DIVISOR = 5

OBSERVATIONS 1 1 1 1 1 P

Y(1) 1 -1 0 0 0 1




Y(2) 1 0 -1 0 0 1
Y(3) 0 1 -1 0 0 1
Y(4) 0 1 0 -1 0 1
Y(5) 0 0 1 -1 0 1
Y(6) 0 0 1 0 -1 1
Y(7) 0 0 0 1 -1 1
Y(8) -1 0 0 1 0 1
Y(9) -1 0 0 0 1 1
Y(10) 0 -1 0 0 1 1
R* 1 1 1 1 1 0
R* = AVERAGE VALUE OF 5 REFERENCE CELLS

P = LEFT-RIGHT BIAS

2.3.4.3.4. Left-right balanced design for 6 standard cells

Design 1,1,1,1,1,1
FACTORS FOR COMPUTING REPEATABILITY STANDARD DEVIATIONS
V FACTOR CELLS
CELLS
1 1 1 1 1
OBSERVATIONS 1 1 1 1 1 1
1 0.4000 +
Y(1) + -
1 0.4000 +
Y(2) + -
1 0.4000 +
Y(3) + -
1 0.4000 +
Y(4) + -
1 0.4000 +
Y(5) + -
Y(6) + -
Explanation of notation and interpretation of tables Y(7) + -
Y(8) + -
Y(9) + -
Y(10) - +
Y(11) - +
Y(12) - +
Y(13) + -
Y(14) + -
Y(15) + -

RESTRAINT + + + + + +

DEGREES OF FREEDOM = 9

SOLUTION MATRIX
DIVISOR = 6




OBSERVATIONS 1 1 1 1 1 1 P

Y(1) 1 -1 0 0 0 0 1
Y(2) 1 0 -1 0 0 0 1
Y(3) 0 1 -1 0 0 0 1 2. Measurement Process Characterization
Y(4) 0 1 0 -1 0 0 1 2.3. Calibration
2.3.4. Catalog of calibration designs
Y(5) 0 0 1 -1 0 0 1 2.3.4.3. Designs for electrical quantities
Y(6) 0 0 1 0 -1 0 1
Y(7) 0 0 0 1 -1 0 1
Y(8) 0 0 0 1 0 -1 1 2.3.4.3.5. Left-right balanced design for 4 references
Y(9) 0 0 0 0 1 -1 1
Y(10) -1 0 0 0 1 0 1
and 4 test items
Y(11) -1 0 0 0 0 1 1
Design for 4 references and 4 test items.
Y(12) 0 -1 0 0 0 1 1
Y(13) 1 0 0 -1 0 0 1
Y(14) 0 1 0 0 -1 0 1
OBSERVATIONS 1 1 1 1 1 1 1 1
Y(15) 0 0 1 0 0 -1 1
R* 1 1 1 1 1 1 0
Y(1) + -
R* = AVERAGE VALUE OF 6 REFERENCE CELLS Y(2) + -
Y(3) + -
P = LEFT-RIGHT BIAS Y(4) + -
Y(5) + -
Y(6) + -
Y(7) + -
FACTORS FOR COMPUTING STANDARD DEVIATIONS Y(8) + -
V FACTOR CELLS Y(9) - +
1 1 1 1 1 1 Y(10) - +
1 0.3727 + Y(11) - +
1 0.3727 + Y(12) - +
1 0.3727 + Y(13) - +
1 0.3727 + Y(14) - +
Y(15) - +
1 0.3727 +
Y(16) - +
1 0.3727 +
RESTRAINT + + + +
Explanation of notation and interpretation of tables
DEGREES OF FREEDOM = 8

SOLUTION MATRIX
DIVISOR = 16

OBSERVATIONS 1 1 1 1 1 1 1 1 P

Y(1) 3 -1 -1 -1 -4 0 0 0 1
Y(2) 3 -1 -1 -1 0 0 -4 0 1
Y(3) -1 -1 3 -1 0 0 -4 0 1




Y(4) -1 -1 3 -1 -4 0 0 0 1
Y(5) -1 3 -1 -1 0 -4 0 0 1
Y(6) -1 3 -1 -1 0 0 0 -4 1
Y(7) -1 -1 -1 3 0 0 0 -4 1
Y(8) -1 -1 -1 3 0 -4 0 0 1 2. Measurement Process Characterization
Y(9) -3 1 1 1 0 4 0 0 1 2.3. Calibration
Y(10) -3 1 1 1 0 0 0 4 1 2.3.4. Catalog of calibration designs
Y(11) 1 1 -3 1 0 0 0 4 1 2.3.4.3. Designs for electrical quantities
Y(12) 1 1 -3 1 0 4 0 0 1
Y(13) 1 -3 1 1 4 0 0 0 1
Y(14) 1 -3 1 1 0 0 4 0 1 2.3.4.3.6. Design for 8 references and 8 test items
Y(15) 1 1 1 -3 0 0 4 0 1
Y(16) 1 1 1 -3 4 0 0 0 1
R* 4 4 4 4 4 4 4 4 0 Design for 8 references and 8 test items.

R* = AVERAGE VALUE OF REFERENCE CELLS TEST CELLS REFERENCE CELLS


OBSERVATIONS 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
P = ESTIMATE OF LEFT-RIGHT BIAS
Y(1) + -
Y(2) - +
FACTORS FOR COMPUTING STANDARD DEVIATIONS Y(3) - +
V FACTORS CELLS Y(4) + -
1 1 1 1 1 1 1 1 Y(5) + -
1 0.4330 + Y(6) - +
1 0.4330 + Y(7) - +
1 0.4330 + Y(8) + -
1 0.4330 + Y(9) + -
1 0.5000 + Y(10) + -
1 0.5000 + Y(11) - +
1 0.5000 + Y(12) - +
1 0.5000 + Y(13) + -
Y(14) + -
Y(15) - +
Y(16) - +
Explanation of notation and interpretation of tables
RESTRAINT + + + + + + + +

DEGREES OF FREEDOM = 0

SOLUTION MATRIX FOR TEST CELLS


DIVISOR = 16
OBSERVATIONS 1 1 1 1 1 1 1 1

Y(1) 8 4 0 -4 -6 6 2 -2
Y(2) -8 4 0 -4 -6 6 2 -2
Y(3) 4 -8 -4 0 2 6 -6 -2
Y(4) 4 8 -4 0 2 6 -6 -2
Y(5) 0 -4 8 4 2 -2 -6 6
Y(6) 0 -4 -8 4 2 -2 -6 6
Y(7) -4 0 4 -8 -6 -2 2 6
Y(8) -4 0 4 8 -6 -2 2 6
Y(9) -6 -2 2 6 8 -4 0 4



2.3.4.3.6. Design for 8 references and 8 test items

Y(10) -6 6 2 -2 -4 8 4 0
Y(11) -6 6 2 -2 -4 -8 4 0
Y(12) 2 6 -6 -2 0 4 -8 -4
Y(13) 2 6 -6 -2 0 4 8 -4
Y(14) 2 -2 -6 6 4 0 -4 8
Y(15) 2 -2 -6 6 4 0 -4 -8
Y(16) -6 -2 2 6 -8 -4 0 4
R 2 2 2 2 2 2 2 2

SOLUTION MATRIX FOR REFERENCE CELLS


DIVISOR = 16
OBSERVATIONS 1 1 1 1 1 1 1 1 P

Y(1) -7 7 5 3 1 -1 -3 -5 1
Y(2) -7 7 5 3 1 -1 -3 -5 1
Y(3) 3 5 7 -7 -5 -3 -1 1 1
Y(4) 3 5 7 -7 -5 -3 -1 1 1
Y(5) 1 -1 -3 -5 -7 7 5 3 1
Y(6) 1 -1 -3 -5 -7 7 5 3 1
Y(7) -5 -3 -1 1 3 5 7 -7 1
Y(8) -5 -3 -1 1 3 5 7 -7 1
Y(9) -7 -5 -3 -1 1 3 5 7 1
Y(10) -5 -7 7 5 3 1 -1 -3 1
Y(11) -5 -7 7 5 3 1 -1 -3 1
Y(12) 1 3 5 7 -7 -5 -3 -1 1
Y(13) 1 3 5 7 -7 -5 -3 -1 1
Y(14) 3 1 -1 -3 -5 -7 7 5 1
Y(15) 3 1 -1 -3 -5 -7 7 5 1
Y(16) -7 -5 -3 -1 1 3 5 7 1
R* 2 2 2 2 2 2 2 2 0

R* = AVERAGE VALUE OF 8 REFERENCE CELLS

P = ESTIMATE OF LEFT-RIGHT BIAS

FACTORS FOR COMPUTING STANDARD DEVIATIONS FOR TEST CELLS


V FACTORS TEST CELLS
1 1 1 1 1 1 1 1
1 1.1726 +
1 1.1726 +
1 1.1726 +
1 1.1726 +
1 1.1726 +
1 1.1726 +
1 1.1726 +
1 1.1726 +

Explanation of notation and interpretation of tables



2.3.4.3.7. Design for 4 reference zeners and 2 test zeners

SOLUTION MATRIX
DIVISOR = 16

OBSERVATIONS 1 1 1 1 1 1 P
2. Measurement Process Characterization
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.3. Designs for electrical quantities Y(1) 3 -1 -1 -1 -2 0 1
Y(2) 3 -1 -1 -1 0 -2 1
Y(3) -1 3 -1 -1 -2 0 1
2.3.4.3.7. Design for 4 reference zeners and 2 Y(4) -1 3 -1 -1 0 -2 1
Y(5) -1 -1 3 -1 -2 0 1
test zeners Y(6) -1 -1 3 -1 0 -2 1
Y(7) -1 -1 -1 3 -2 0 1
Design for 4 references zeners and 2 test zeners. Y(8) -1 -1 -1 3 0 -2 1
Y(9) 1 1 1 -3 2 0 1
Y(10) 1 1 1 -3 0 2 1
Y(11) 1 1 -3 1 2 0 1
ZENERS
Y(12) 1 1 -3 1 0 2 1
OBSERVATIONS 1 1 1 1 1 1
Y(13) 1 -3 1 1 2 0 1
Y(14) 1 -3 1 1 0 2 1
Y(1) + -
Y(15) -3 1 1 1 2 0 1
Y(2) + -
Y(16) -3 1 1 1 0 2 1
Y(3) + -
R* 4 4 4 4 4 4 0
Y(4) + -
Y(5) + -
R* = AVERAGE VALUE OF 4 REFERENCE STANDARDS
Y(6) + -
Y(7) + -
P = LEFT-RIGHT EFFECT
Y(8) + -
Y(9) - +
Y(10) - +
Y(11) - +
FACTORS FOR COMPUTING STANDARD DEVIATIONS
Y(12) - +
V FACTORS ZENERS
Y(13) - +
1 1 1 1 1 1 P
Y(14) - +
1 0.4330 +
Y(15) - +
1 0.4330 +
Y(16) - +
1 0.4330 +
1 0.4330 +
1 0.3536 +
RESTRAINT + + + +
1 0.3536 +
1 0.2500 +
CHECK STANDARD + -
Explanation of notation and interpretation of tables
DEGREES OF FREEDOM = 10




2. Measurement Process Characterization


2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.3. Designs for electrical quantities

2.3.4.3.8. Design for 4 reference zeners and 3 test


zeners
Design for 4 references and 3 test zeners.

ZENERS

OBSERVATIONS 1 1 1 1 1 1 1

Y(1) - +
Y(2) - +
Y(3) + -
Y(4) + -
Y(5) + -
Y(6) + -
Y(7) - +
Y(8) - +
Y(9) - +
Y(10) - +
Y(11) - +
Y(12) - +
Y(13) + -
Y(14) + -
Y(15) + -
Y(16) + -
Y(17) + -
Y(18) - +

RESTRAINT + + + +

CHECK STANDARD + -

DEGREES OF FREEDOM = 11



2.3.4.3.8. Design for 4 reference zeners and 3 test zeners

Explanation of notation and interpretation of tables


SOLUTION MATRIX
DIVISOR = 1260

OBSERVATIONS 1 1 1 1 1 1 1 P

Y(1) -196 196 -56 56 0 0 0 70


Y(2) -160 -20 160 20 0 0 0 70
Y(3) 20 160 -20 -160 0 0 0 70
Y(4) 143 -53 -17 -73 0 0 -315 70
Y(5) 143 -53 -17 -73 0 -315 0 70
Y(6) 143 -53 -17 -73 -315 0 0 70
Y(7) 53 -143 73 17 315 0 0 70
Y(8) 53 -143 73 17 0 315 0 70
Y(9) 53 -143 73 17 0 0 315 70
Y(10) 17 73 -143 53 0 0 315 70
Y(11) 17 73 -143 53 0 315 0 70
Y(12) 17 73 -143 53 315 0 0 70
Y(13) -73 -17 -53 143 -315 0 0 70
Y(14) -73 -17 -53 143 0 -315 0 70
Y(15) -73 -17 -53 143 0 0 -315 70
Y(16) 56 -56 196 -196 0 0 0 70
Y(17) 20 160 -20 -160 0 0 0 70
Y(18) -160 -20 160 20 0 0 0 70
R* 315 315 315 315 315 315 315 0

R* = Average value of the 4 reference zeners

P = left-right effect

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

V K1 1 1 1 1 1 1 1
1 0.5000 +
1 0.5000 +
1 0.5000 +
2 0.7071 + +
3 0.8660 + + +
0 0.5578 + -



2.3.4.3.9. Design for 3 references and 1 test resistor

FACTORS FOR COMPUTING STANDARD DEVIATIONS


OHM FACTORS RESISTORS
1 1 1 1
1 0.3333 +
1 0.5270 +
2. Measurement Process Characterization
1 0.5270 +
2.3. Calibration
2.3.4. Catalog of calibration designs
1 0.7817 +
2.3.4.3. Designs for electrical quantities

Explanation of notation and interpretation of tables


2.3.4.3.9. Design for 3 references and 1 test
resistor
Design 1,1,1,1

OBSERVATIONS 1 1 1 1

Y(1) + -
Y(2) + -
Y(3) + -
Y(4) - +
Y(5) - +
Y(6) - +

RESTRAINT + + +

DEGREES OF FREEDOM = 3

SOLUTION MATRIX
DIVISOR = 6

OBSERVATIONS 1 1 1 1

Y(1) 1 -2 1 1
Y(2) 1 1 -2 1
Y(3) 0 0 0 -3
Y(4) 0 0 0 3
Y(5) -1 -1 2 -1
Y(6) -1 2 -1 -1
R 2 2 2 2

R = AVERAGE VALUE OF 3 REFERENCE RESISTORS



2.3.4.3.10. Design for 4 references and 1 test resistor

Y(4) -1 -1 -1 3 -1
Y(5) 1 1 1 -3 1
Y(6) 1 1 -3 1 1
Y(7) 1 -3 1 1 1
Y(8) -3 1 1 1 1
2. Measurement Process Characterization
R 2 2 2 2 2
2.3. Calibration
2.3.4. Catalog of calibration designs
2.3.4.3. Designs for electrical quantities R = AVERAGE VALUE OF REFERENCE RESISTORS

2.3.4.3.10. Design for 4 references and 1


FACTORS FOR COMPUTING STANDARD DEVIATIONS
test resistor OHM FACTORS
1 1 1 1 1
Design 1,1,1,1,1 1 0.6124 +
1 0.6124 +
1 0.6124 +
1 0.6124 +
OBSERVATIONS 1 1 1 1 1
1 0.3536 +

Y(1) + - Explanation of notation and interpretation of tables


Y(2) + -
Y(3) + -
Y(4) + -
Y(5) - +
Y(6) - +
Y(7) - +
Y(8) - +

RESTRAINT + + + +

DEGREES OF FREEDOM = 4

SOLUTION MATRIX
DIVISOR = 8

OBSERVATIONS 1 1 1 1 1

Y(1) 3 -1 -1 -1 -1
Y(2) -1 3 -1 -1 -1
Y(3) -1 -1 3 -1 -1



2.3.4.4. Roundness measurements

Roundness measurements
Measurements of roundness require 360° traces of the workpiece made with a turntable-type instrument or a
stylus-type instrument. A least squares fit of points on the trace to a circle defines the parameters of
noncircularity of the workpiece. A diagram of the measurement method is shown below.

The diagram shows the trace and Y, the distance from the spindle center to the trace at the angle. A least
squares circle fit to data at equally spaced angles gives estimates of P - R, the noncircularity, where R = radius
of the circle and P = distance from the center of the circle to the trace.

Low precision measurements
Some measurements of roundness do not require a high level of precision, such as measurements on cylinders,
spheres, and ring gages where roundness is not of primary importance. For this purpose, a single trace is made
of the workpiece.

Weakness of single trace method
The weakness of this method is that the deviations contain both the spindle error and the workpiece error, and
these two errors cannot be separated with the single trace. Because the spindle error is usually small and within
known limits, its effect can be ignored except when the most precise measurements are needed.

High precision measurements
High precision measurements of roundness are appropriate where an object, such as a hemisphere, is intended
to be used primarily as a roundness standard.

Measurement method
The measurement sequence involves making multiple traces of the roundness standard where the standard is
rotated between traces. Least-squares analysis of the resulting measurements enables the noncircularity of the
spindle to be separated from the profile of the standard.

Choice of measurement method
A synopsis of the measurement method and the estimation technique are given in this chapter for:
   ● Single-trace method
   ● Multiple-trace method
The reader is encouraged to obtain a copy of the publication on roundness (Reeve) for a more complete
description of the measurement method and analysis.



2.3.4.4.1. Single-trace roundness design

Low precision measurements
Some measurements of roundness do not require a high level of precision, such as measurements on cylinders,
spheres, and ring gages where roundness is not of primary importance. The diagram of the measurement
method shows the trace and Y, the distance from the spindle center to the trace at the angle. A least-squares
circle fit to data at equally spaced angles gives estimates of P - R, the noncircularity, where R = radius of the
circle and P = distance from the center of the circle to the trace.

Single trace method
For this purpose, a single trace covering exactly 360° is made of the workpiece and measurements of the
distance between the center of the spindle and the trace are made at equally spaced angles. A least-squares
circle fit to the data gives the following estimators of the parameters of the circle.

Noncircularity of workpiece
The deviation of the trace from the circle at a given angle, which defines the noncircularity of the workpiece, is
estimated from the fitted circle.

Weakness of single trace method
The weakness of this method is that the deviations contain both the spindle error and the workpiece error, and
these two errors cannot be separated with the single trace. Because the spindle error is usually small and within
known limits, its effect can be ignored except when the most precise measurements are needed.
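The closed-form estimators from the source are not reproduced here, but the same least-squares circle
parameters can be obtained with a generic linear least-squares fit, as in the following sketch with simulated data
(all values hypothetical).

import numpy as np

# Least-squares circle fit: Y(theta) ~ R + a*cos(theta) + b*sin(theta); the
# residuals estimate the deviations (noncircularity) of the workpiece.
n = 360
theta = 2 * np.pi * np.arange(n) / n        # equally spaced angles
rng = np.random.default_rng(0)
true_profile = 0.05 * np.cos(3 * theta)     # hypothetical 3-lobe out-of-roundness
y = 100 + 0.5 * np.cos(theta) - 0.3 * np.sin(theta) + true_profile
y = y + rng.normal(scale=0.01, size=n)      # measurement noise

design = np.column_stack([np.ones(n), np.cos(theta), np.sin(theta)])
params, *_ = np.linalg.lstsq(design, y, rcond=None)
radius, a, b = params

deviations = y - design @ params            # estimated deviations from the least-squares circle
print(radius, a, b, np.ptp(deviations))     # peak-to-valley of the estimated profile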



2.3.4.4.2. Multiple-trace roundness designs

High precision measurements
High precision roundness measurements are required when an object, such as a hemisphere, is intended to be
used primarily as a roundness standard. The method outlined on this page is appropriate for either a
turntable-type instrument or a spindle-type instrument.

Measurement method
The measurement sequence involves making multiple traces of the roundness standard where the standard is
rotated between traces. Least-squares analysis of the resulting measurements enables the noncircularity of the
spindle to be separated from the profile of the standard. The reader is referred to the publication on the subject
(Reeve) for details covering measurement techniques and analysis.

Method of n traces
The number of traces that are made on the workpiece is arbitrary but should not be less than four. The
workpiece is centered as well as possible under the spindle. The mark on the workpiece which denotes the zero
angular position is aligned with the zero position of the spindle as shown in the graph. A trace is made with the
workpiece in this position. The workpiece is then rotated clockwise by 360/n degrees and another trace is made.
This process is continued until n traces have been recorded.

Mathematical model for estimation
For i = 1, ..., n, the ith angular position is denoted by

Definition of terms relating to distances to the least squares circle
The deviation from the least squares circle (LSC) of the workpiece at the ith position is

The deviation of the spindle from its LSC at the ith position is

Terms relating to parameters of least squares circle
For the jth graph, let the three parameters that define the LSC be given by

defining the radius R, a, and b as shown in the graph. In an idealized measurement system these parameters
would be constant for all j. In reality, each rotation of the workpiece causes it to shift a small amount vertically
and horizontally. To account for this shift, separate parameters are needed for each trace.

Correction for obstruction to stylus
Let the observed distance (in polar graph units) from the center of the jth graph to the point on the curve that
corresponds to the position of the spindle be given. If K is the magnification factor of the instrument in
microinches/polar graph unit and the angle between the lever arm of the stylus and the tangent to the workpiece
at the point of contact (which normally can be set to zero if there is no obstruction) is known, the transformed
observations to be used in the estimation equations are:

Estimates for parameters
The estimation of the individual parameters is obtained as a least-squares solution that requires six restraints
which essentially guarantee that the sum of the vertical and horizontal deviations of the spindle from the center
of the LSC are zero. The expressions for the estimators are as follows:




Computation of standard deviation
The computation of the residual standard deviation of the fit requires, first, the computation of the predicted
values,

The residual standard deviation with v = n*n - 5n + 6 degrees of freedom is

where

Finally, the standard deviations of the profile estimators are given by:



2.3.4.5. Designs for angle blocks

Purpose
The purpose of this section is to explain why calibration of angle blocks of the same size in groups is more
efficient than calibration of angle blocks individually.

Calibration schematic for five angle blocks
A schematic of a calibration scheme for 1 reference block, 1 check standard, and three test blocks is shown
below. The reference block, R, is shown in the center of the diagram and the check standard, C, is shown at the
top of the diagram. The schematic shows the reference as block 1 in the center of the diagram, the check
standard as block 2 at the top, and the test blocks as blocks 3, 4, and 5.

Block sizes
Angle blocks normally come in sets of
   1, 3, 5, 20, and 30 seconds
   1, 3, 5, 20, 30 minutes
   1, 3, 5, 15, 30, 45 degrees
and blocks of the same nominal size from 4, 5 or 6 different sets can be calibrated simultaneously using one of
the designs shown in this catalog.
   ● Design for 4 angle blocks
   ● Design for 5 angle blocks
   ● Design for 6 angle blocks

Restraint
The solution to the calibration design depends on the known value of a reference block, which is compared with
the test blocks. The reference block is designated as block 1 for the purpose of this discussion.

Check standard
It is suggested that block 2 be reserved for a check standard that is maintained in the laboratory for quality
control purposes.

Calibration scheme
A calibration scheme developed by Charles Reeve (Reeve) at the National Institute of Standards and
Technology for calibrating customer angle blocks is explained on this page. The reader is encouraged to obtain
a copy of the publication for details on the calibration setup and quality control checks for angle block
calibrations.

Series of measurements for calibrating 4, 5, and 6 angle blocks simultaneously
For all of the designs, the measurements are made in groups of seven starting with the measurements of blocks
in the following order: 2-3-2-1-2-4-2. Schematically, the calibration design is completed by counter-clockwise
rotation of the test blocks about the reference block, one-at-a-time, with 7 readings for each series reduced to 3
difference measurements. For n angle blocks (including the reference block), this amounts to n - 1 series of 7
readings. The series for 4, 5, and 6 angle blocks are shown below.

Measurements for 4 angle blocks
   Series 1: 2-3-2-1-2-4-2
   Series 2: 4-2-4-1-4-3-4
   Series 3: 3-4-3-1-3-2-3



Measurements for 5 angle blocks (see diagram)
   Series 1: 2-3-2-1-2-4-2
   Series 2: 5-2-5-1-5-3-5
   Series 3: 4-5-4-1-4-2-4
   Series 4: 3-4-3-1-3-5-3

Measurements for 6 angle blocks
   Series 1: 2-3-2-1-2-4-2
   Series 2: 6-2-6-1-6-3-6
   Series 3: 5-6-5-1-5-2-5
   Series 4: 4-5-4-1-4-6-4
   Series 5: 3-4-3-1-3-5-3

Equations for the measurements in the first series showing error sources
The equations explaining the seven measurements for the first series in terms of the errors in the measurement
system are:

   Z11 = B + X1 + error11
   Z12 = B + X2 + d + error12
   Z13 = B + X3 + 2d + error13
   Z14 = B + X4 + 3d + error14
   Z15 = B + X5 + 4d + error15
   Z16 = B + X6 + 5d + error16
   Z17 = B + X7 + 6d + error17

with B a bias associated with the instrument, d a linear drift factor, X the value of the angle block to be
determined, and the error terms relating to random errors of measurement.

Calibration procedure depends on difference measurements
The check block, C, is measured before and after each test block, and the difference measurements (which are
not the same as the difference measurements for calibrations of mass weights, gage blocks, etc.) are constructed
to take advantage of this situation. Thus, the 7 readings are reduced to 3 difference measurements for the first
series.

For all series, there are 3(n - 1) difference measurements, with the first subscript in the equations above
referring to the series number. The difference measurements are free of drift and instrument bias; a small
numerical illustration follows the design matrix below.

Design matrix
As an example, the design matrix for n = 4 angle blocks is shown below.

   1  1  1  1

   0  1 -1  0
  -1  1  0  0
   0  1  0 -1
   0 -1  0  1
  -1  0  0  1
   0  0 -1  1
   0  0  1 -1
  -1  0  1  0
   0 -1  1  0

The design matrix is shown with the solution matrix for identification purposes only because the least-squares
solution is weighted (Reeve) to account for the fact that test blocks are measured twice as many times as the
reference block.
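The exact difference measurements used by Reeve are given in the cited publication; the sketch below uses one
natural construction consistent with the description above (each bracketed reading minus the mean of the
adjacent check-block readings) and shows, with simulated readings, that it cancels the instrument bias B and a
linear drift d exactly. All numbers are hypothetical.

import numpy as np

order = [2, 3, 2, 1, 2, 4, 2]            # blocks read in series 1; block 2 is the check standard
x = {1: 5.0002, 2: 5.0005, 3: 4.9998, 4: 5.0007}   # hypothetical block values
bias, drift = 0.3, 0.02                  # hypothetical instrument bias and per-reading drift
rng = np.random.default_rng(1)

# Model Z = B + X(block) + (position)*d + error for the seven readings.
z = [bias + x[blk] + k * drift + rng.normal(scale=1e-4) for k, blk in enumerate(order)]

# Readings at positions 1, 3, 5 (blocks 3, 1, 4) are bracketed by check readings.
differences = {order[p]: z[p] - 0.5 * (z[p - 1] + z[p + 1]) for p in (1, 3, 5)}
print(differences)                       # approximately x[block] - x[2], free of bias and drift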
random errors of measurement.



Solutions to the calibration designs
Solutions to the angle block designs are shown on the following pages. The solution matrix and factors for the
repeatability standard deviation are to be interpreted as explained in solutions to calibration designs. As an
example, the solution for the design for n = 4 angle blocks is as follows:

The solution for the reference standard is shown under the first column of the solution matrix; for the check
standard under the second column; for the first test block under the third column; and for the second test block
under the fourth column. Notice that the estimate for the reference block is guaranteed to be R*, regardless of
the measurement results, because of the restraint that is imposed on the design.

Solutions are correct only for the restraint as shown.

Calibrations can be run for top and bottom faces of blocks
The calibration series is run with the blocks all face "up" and is then repeated with the blocks all face "down",
and the results averaged. The difference between the two series can be large compared to the repeatability
standard deviation, in which case a between-series component of variability must be included in the calculation
of the standard deviation of the reported average.

Calculation of standard deviations when the blocks are measured in two orientations
For n blocks, the differences between the values for the blocks measured in the top (denoted by "t") and bottom
(denoted by "b") positions are denoted by:

The standard deviation of the average (for each block) is calculated from these differences to be:

Standard deviations when the blocks are measured in only one orientation
If the blocks are measured in only one orientation, there is no way to estimate the between-series component of
variability and the standard deviation for the value of each block is computed as

   stest = K1 s1

where K1 is shown under "Factors for computing repeatability standard deviations" for each design and s1 is
the repeatability standard deviation as estimated from the design. Because this standard deviation may seriously
underestimate the uncertainty, a better approach is to estimate the standard deviation from the data on the check
standard over time. An expanded uncertainty is computed according to the ISO guidelines.



2.3.4.5.1. Design for 4 angle blocks

DESIGN MATRIX

   1 1 1 1

Y(1) 0 1 -1 0
Y(2) -1 1 0 0
Y(3) 0 1 0 -1
Y(4) 0 -1 0 1
Y(5) -1 0 0 1
Y(6) 0 0 -1 1
Y(7) 0 0 1 -1
Y(8) -1 0 1 0
Y(9) 0 -1 1 0

REFERENCE +

CHECK STANDARD +

DEGREES OF FREEDOM = 6

SOLUTION MATRIX
DIVISOR = 24

OBSERVATIONS 1 1 1 1

Y(11) 0 2.2723000 -5.0516438 -1.2206578


Y(12) 0 9.3521166 7.3239479 7.3239479
Y(13) 0 2.2723000 -1.2206578 -5.0516438
Y(21) 0 -5.0516438 -1.2206578 2.2723000
Y(22) 0 7.3239479 7.3239479 9.3521166
Y(23) 0 -1.2206578 -5.0516438 2.2723000
Y(31) 0 -1.2206578 2.2723000 -5.0516438
Y(32) 0 7.3239479 9.3521166 7.3239479
Y(33) 0 -5.0516438 2.2723000 -1.2206578
R* 1 1. 1. 1.

R* = VALUE OF REFERENCE ANGLE BLOCK

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
SIZE K1
   1 1 1 1
1 0.0000 +
1 0.9749 +
1 0.9749 +
1 0.9749 +
1 0.9749 +

Explanation of notation and interpretation of tables


2.3.4.5.2. Design for 5 angle blocks

Y(13) 0.00000 2.48697 -0.89818 -4.80276 -0.78603


Y(21) 0.00000 -5.48893 -0.21200 -1.56370 3.26463
Y(22) 0.00000 5.38908 5.93802 4.71618 7.95672
Y(23) 0.00000 -0.89818 -4.80276 -0.78603 2.48697
Y(31) 0.00000 -0.21200 -1.56370 3.26463 -5.48893
Y(32) 0.00000 5.93802 4.71618 7.95672 5.38908
Y(33) 0.00000 -4.80276 -0.78603 2.48697 -0.89818
Y(41) 0.00000 -1.56370 3.26463 -5.48893 -0.21200
Y(42) 0.00000 4.71618 7.95672 5.38908 5.93802
Y(43) 0.00000 -0.78603 2.48697 -0.89818 -4.80276
R* 1. 1. 1. 1. 1.

R* = VALUE OF REFERENCE ANGLE BLOCK


DESIGN MATRIX

1 1 1 1 1 FACTORS FOR REPEATABILITY STANDARD DEVIATIONS


SIZE K1
0 1 -1 0 0 1 1 1 1 1
-1 1 0 0 0 1 0.0000 +
0 1 0 -1 0 1 0.7465 +
0 -1 0 0 1 1 0.7465 +
-1 0 0 0 1 1 0.7456 +
0 0 -1 0 1 1 0.7456 +
0 0 0 1 -1 1 0.7465 +
-1 0 0 1 0
0 -1 0 1 0 Explanation of notation and interpretation of tables
0 0 1 -1 0
-1 0 1 0 0
0 0 1 0 -1

REFERENCE +

CHECK STANDARD +

DEGREES OF FREEDOM = 8

SOLUTION MATRIX
DIVISOR = 24

OBSERVATIONS 1 1 1 1 1

Y(11) 0.00000 3.26463 -5.48893 -0.21200 -1.56370


Y(12) 0.00000 7.95672 5.38908 5.93802 4.71618



2.3.4.5.3. Design for 6 angle blocks

Y(11) 0.0000 3.2929 -5.2312 -0.7507 -0.6445 -0.6666


Y(12) 0.0000 6.9974 4.6324 4.6495 3.8668 3.8540
Y(13) 0.0000 3.2687 -0.7721 -5.2098 -0.6202 -0.6666
Y(21) 0.0000 -5.2312 -0.7507 -0.6445 -0.6666 3.2929
Y(22) 0.0000 4.6324 4.6495 3.8668 3.8540 6.9974
Y(23) 0.0000 -0.7721 -5.2098 -0.6202 -0.6666 3.2687
Y(31) 0.0000 -0.7507 -0.6445 -0.6666 3.2929 -5.2312
Y(32) 0.0000 4.6495 3.8668 3.8540 6.9974 4.6324
Y(33) 0.0000 -5.2098 -0.6202 -0.6666 3.2687 -0.7721
Y(41) 0.0000 -0.6445 -0.6666 3.2929 -5.2312 -0.7507
Y(42) 0.0000 3.8668 3.8540 6.9974 4.6324 4.6495
Y(43) 0.0000 -0.6202 -0.6666 3.2687 -0.7721 -5.2098
Y(51) 0.0000 -0.6666 3.2929 -5.2312 -0.7507 -0.6445
DESIGN MATRIX Y(52) 0.0000 3.8540 6.9974 4.6324 4.6495 3.8668
Y(53) 0.0000 -0.6666 3.2687 -0.7721 -5.2098 -0.6202
1 1 1 1 1 1 R* 1. 1. 1. 1. 1. 1.
0 1 -1 0 0 0 R* = VALUE OF REFERENCE ANGLE BLOCK
-1 1 0 0 0 0
0 1 0 -1 0 0
0 -1 0 0 0 1 FACTORS FOR REPEATABILITY STANDARD DEVIATIONS
-1 0 0 0 0 1 SIZE K1
0 0 -1 0 0 1 1 1 1 1 1 1
0 0 0 0 1 -1 1 0.0000 +
-1 0 0 0 1 0 1 0.7111 +
0 -1 0 0 1 0 1 0.7111 +
0 0 0 1 -1 0 1 0.7111 +
-1 0 0 1 0 0 1 0.7111 +
0 0 0 1 0 -1 1 0.7111 +
0 0 1 -1 0 0 1 0.7111 +
-1 0 1 0 0 0
0 0 1 0 -1 0 Explanation of notation and interpretation of tables

REFERENCE +

CHECK STANDARD +

DEGREES OF FREEDOM = 10

SOLUTION MATRIX
DIVISOR = 24

OBSERVATIONS 1 1 1 1 1 1



2.3.4.6. Thermometers in a bath

Measurement sequence
Calibration of liquid in glass thermometers is usually carried out in a controlled bath where the temperature in
the bath is increased steadily over time to calibrate the thermometers over their entire range. One way of
accounting for the temperature drift is to measure the temperature of the bath with a standard resistance
thermometer at the beginning, middle and end of each run of K test thermometers. The test thermometers
themselves are measured twice during the run in the following time sequence:

where R1, R2, R3 represent the measurements on the standard resistance thermometer and T1, T2, ..., TK and
T'1, T'2, ..., T'K represent the pair of measurements on the K test thermometers.

Assumptions regarding temperature
The assumptions for the analysis are that:
   ● Equal time intervals are maintained between measurements on the test items.
   ● Temperature increases by a fixed amount with each interval.
   ● A temperature change is allowed for the reading of the resistance thermometer in the middle of the run.

Indications for test thermometers
It can be shown (Cameron and Hailes) that the average reading for a test thermometer is its indication at the
temperature implied by the average of the three resistance readings. The standard deviation associated with this
indication is calculated from difference readings, where the difference for the ith thermometer is the difference
between its two readings.

Estimates of drift
The estimates of the shift due to the resistance thermometer and temperature drift are given by:

Standard deviations
The residual variance is given by

The standard deviation of the indication assigned to the ith test thermometer is

and the standard deviations for the estimates of shift and drift are

respectively.
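A minimal bookkeeping sketch (with made-up numbers) of the facts stated above: each test thermometer's
indication is the average of its two readings, referred to the temperature implied by the average of the three
resistance readings, and the per-thermometer difference is the quantity that feeds the standard deviation. The
shift and drift estimators themselves are given by the formulas in the source and are not reproduced here.

import statistics

R = [20.001, 20.502, 21.003]                      # resistance readings at start, middle, end (hypothetical)
T = [20.11, 20.32, 20.55, 20.74]                  # first readings of K = 4 test thermometers (hypothetical)
T2 = [20.92, 21.13, 21.34, 21.56]                 # second readings of the same thermometers (hypothetical)

bath_reference = statistics.mean(R)               # temperature implied by the three resistance readings
indications = [(t1 + t2) / 2.0 for t1, t2 in zip(T, T2)]
differences = [t2 - t1 for t1, t2 in zip(T, T2)]  # per-thermometer differences
print(bath_reference, indications, differences)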



2.3.4.7. Humidity standards

Humidity standards
The calibration of humidity standards usually involves the comparison of reference weights with cylinders
containing moisture. The designs shown in this catalog are drift-eliminating and may be suitable for artifacts
other than humidity cylinders.

List of designs
   ● 2 reference weights and 3 cylinders

2.3.4.7.1. Drift-elimination design for 2 reference weights and 3 cylinders

OBSERVATIONS 1 1 1 1 1

Y(1) + -
Y(2) + -
Y(3) + -
Y(4) + -
Y(5) - +
Y(6) - +
Y(7) + -
Y(8) + -
Y(9) - +
Y(10) + -

RESTRAINT + +

CHECK STANDARD + -

DEGREES OF FREEDOM = 6

SOLUTION MATRIX
DIVISOR = 10

OBSERVATIONS 1 1 1 1 1

http://www.itl.nist.gov/div898/handbook/mpc/section3/mpc347.htm [11/13/2003 5:38:17 PM] http://www.itl.nist.gov/div898/handbook/mpc/section3/mpc3471.htm (1 of 2) [11/13/2003 5:38:17 PM]


Y(1)     2  -2   0   0   0
Y(2)     0   0   0   2  -2
Y(3)     0   0   2  -2   0
Y(4)    -1   1  -3  -1  -1
Y(5)    -1   1   1   1   3
Y(6)    -1   1   1   3   1
Y(7)     0   0   2   0  -2
Y(8)    -1   1  -1  -3  -1
Y(9)     1  -1   1   1   3
Y(10)    1  -1  -3  -1  -1
R*       5   5   5   5   5

R* = average value of the two reference weights

FACTORS FOR REPEATABILITY STANDARD DEVIATIONS

WT   K1       1  1  1  1  1
1    0.5477   +
1    0.5477   +
1    0.5477   +
2    0.8944   +  +
3    1.2247   +  +  +
0    0.6325   +  -

Explanation of notation and interpretation of tables


2.3.5. Control of artifact calibration

Purpose: The purpose of statistical control in the calibration process is to guarantee the
'goodness' of calibration results within predictable limits and to validate the statement of
uncertainty of the result. Two types of control can be imposed on a calibration process that
makes use of statistical designs:
  1. Control of instrument precision or short-term variability
  2. Control of bias and long-term variability
     ❍ Example of a Shewhart control chart
     ❍ Example of an EWMA control chart

Short-term standard deviation: The short-term standard deviation from each design is the basis
for controlling instrument precision. Because the measurements for a single design are completed
in a short time span, this standard deviation estimates the basic precision of the instrument.
Designs should be chosen to have enough measurements so that the standard deviation from the
design has at least 3 degrees of freedom, where the degrees of freedom are (n - m + 1) with
  ● n = number of difference measurements
  ● m = number of artifacts.

Check standard: Measurements on a check standard provide the mechanism for controlling the bias
and long-term variability of the calibration process. The check standard is treated as one of
the test items in the calibration design, and its value as computed from each calibration run is
the basis for accepting or rejecting the calibration. All designs cataloged in this Handbook
have provision for a check standard.

The check standard should be of the same type and geometry as items that are measured in the
designs. These artifacts must be stable and available to the calibration process on a continuing
basis. There should be a check standard at each critical level of measurement. For example, for
mass calibrations there should be check standards at the 1 kg, 100 g, 10 g, 1 g, 0.1 g levels,
etc. For gage blocks, there should be check standards at all nominal lengths.


A check standard can also be a mathematical construction, such as the computed difference
between the calibrated values of two reference standards in a design.

Database of check standard values: The creation and maintenance of the database of check
standard values is an important aspect of the control process. The results from each calibration
run are recorded in the database. The best way to record this information is in one file with
one line (row in a spreadsheet) of information in fixed fields for each calibration run. A list
of typical entries follows:
  1. Date
  2. Identification for check standard
  3. Identification for the calibration design
  4. Identification for the instrument
  5. Check standard value
  6. Repeatability standard deviation from design
  7. Degrees of freedom
  8. Operator identification
  9. Flag for out-of-control signal
  10. Environmental readings (if pertinent)


2.3.5.1. Control of precision

Control parameters from historical data: A modified control chart procedure is used for
controlling instrument precision. The procedure is designed to be implemented in real time after
a baseline and control limit for the instrument of interest have been established from the
database of short-term standard deviations. A separate control chart is required for each
instrument -- except where instruments are of the same type with the same basic precision, in
which case they can be treated as one.

The baseline is the process standard deviation that is pooled from k = 1, ..., K individual
repeatability standard deviations s_k in the database, each having v_k degrees of freedom. The
pooled repeatability standard deviation is

    s_p = sqrt[ (v_1*s_1^2 + ... + v_K*s_K^2) / (v_1 + ... + v_K) ]

with degrees of freedom

    v_p = v_1 + ... + v_K .
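A minimal sketch of the pooling step in Python (an illustration only; the Handbook's own macros
use Dataplot). It assumes the degrees-of-freedom-weighted pooling implied by the example that
follows, where 117 standard deviations with 3 degrees of freedom each give a pooled value with
351 degrees of freedom.

    import numpy as np

    def pooled_repeatability_sd(s, v):
        """Pool repeatability standard deviations s[k], each with v[k] degrees of freedom."""
        s = np.asarray(s, dtype=float)
        v = np.asarray(v, dtype=float)
        df = v.sum()                          # degrees of freedom of the pooled estimate
        sp = np.sqrt(np.sum(v * s**2) / df)   # df-weighted root-mean-square pooling
        return sp, df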


Control procedure is invoked in real time for each calibration run: The control procedure
compares each new repeatability standard deviation that is recorded for the instrument with an
upper control limit, UCL. Usually, only the upper control limit is of interest because we are
primarily interested in detecting degradation in the instrument's precision. A possible
complication is that the control limit is dependent on the degrees of freedom in the new
standard deviation and is computed as follows:

    UCL = s_p * sqrt[ F(alpha; v_new, v_p) ] .

The quantity under the radical is the upper percentage point from the F table, where alpha is
chosen to be small, say, 0.05. The other two terms refer to the degrees of freedom in the new
standard deviation and the degrees of freedom in the process standard deviation.

Limitation of graphical method: The graphical method of plotting every new estimate of
repeatability on a control chart does not work well when the UCL can change with each
calibration design, depending on the degrees of freedom. The algebraic equivalent is to test
whether the new standard deviation exceeds its control limit, in which case the short-term
precision is judged to be out of control and the current calibration run is rejected. For more
guidance, see Remedies and strategies for dealing with out-of-control signals.

As long as the repeatability standard deviations are in control, there is reason for confidence
that the precision of the instrument has not degraded.

Case study - mass calibrations: It is recommended that the repeatability standard deviations be
plotted against time on a regular basis to check for gradual degradation in the instrument.
Individual failures may not trigger a suspicion that the instrument is in need of adjustment or
tuning.


2.3.5.1.1. Example of control chart for precision

Example of a control chart for precision of a balance: Mass calibrations usually start with the
comparison of kilogram standards using a high precision balance as a comparator. Many of the
measurements at the kilogram level that were made at NIST between 1975 and 1990 were made on
balance #12 using a 1,1,1,1 calibration design. The redundancy in the calibration design
produces estimates for the individual kilograms and a repeatability standard deviation with
three degrees of freedom for each calibration run. These standard deviations estimate the
precision of the balance.

Need for monitoring precision: The precision of the balance is monitored to check for:
  1. Slow degradation in the balance
  2. Anomalous behavior at specific times

Monitoring technique for standard deviations: The standard deviations over time and many
calibrations are tracked and monitored using a control chart for standard deviations. The
database and control limits are updated on a yearly or bi-yearly basis, and standard deviations
for each calibration run in the next cycle are compared with the control limits. In this case,
the standard deviations from 117 calibrations between 1975 and 1985 were pooled to obtain a
repeatability standard deviation with v = 3*117 = 351 degrees of freedom, and the control limits
were computed at the 1% significance level.

Run the software macro for creating the control chart for balance #12: Dataplot commands for
creating the control chart are as follows:

    dimension 30 columns
    skip 4
    read mass.dat t id y bal s ds
    let n = size s
    y1label MICROGRAMS
    x1label TIME IN YEARS
    xlimits 75 90
    x2label STANDARD DEVIATIONS ON BALANCE 12
    characters * blank blank blank
    lines blank solid dotted dotted
    let ss=s*s
    let sp=mean ss
    let sp=sqrt(sp)
    let scc=sp for i = 1 1 n
    let f = fppf(.99,3,351)
    let f=sqrt(f)
    let sul=f*scc
    plot s scc sul vs t
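The algebraic test described above is easy to automate. A hedged Python sketch (the function
name and the use of scipy are illustrative assumptions, not part of the Handbook) that computes
the upper control limit from the F percent point and flags an out-of-control repeatability
standard deviation:

    import numpy as np
    from scipy import stats

    def precision_out_of_control(s_new, df_new, s_pool, df_pool, alpha=0.01):
        """Return (UCL, flag); flag is True if the new standard deviation exceeds its limit."""
        ucl = s_pool * np.sqrt(stats.f.ppf(1 - alpha, df_new, df_pool))
        return ucl, s_new > ucl

    # For the balance #12 example: 3 degrees of freedom per run, pooled baseline with
    # 351 degrees of freedom, 1% significance level:
    # ucl, reject = precision_out_of_control(s_new, 3, s_pool, 351, alpha=0.01)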


Control chart for precision:

    [Figure: control chart of repeatability standard deviations on balance #12, 1975-1990;
     y-axis MICROGRAMS, x-axis TIME IN YEARS, with the pooled standard deviation and upper
     control limit drawn as horizontal lines.]

Interpretation of the control chart: The control chart shows that the precision of the balance
remained in control through 1990 with only two violations of the control limits. For those
occasions, the calibrations were discarded and repeated. Clearly, for the second violation,
something significant occurred that invalidated the calibration results.

Further interpretation of the control chart: However, it is also clear from the pattern of
standard deviations over time that the precision of the balance was gradually degrading and more
and more points were approaching the control limits. This finding led to a decision to replace
this balance for high accuracy calibrations.


2.3.5.2. Control of bias and long-term variability

Control parameters are estimated using historical data: A control chart procedure is used for
controlling bias and long-term variability. The procedure is designed to be implemented in real
time after a baseline and control limits for the check standard of interest have been
established from the database of check standard values. A separate control chart is required for
each check standard. The control procedure outlined here is based on a Shewhart control chart
with upper and lower control limits that are symmetric about the average. The EWMA control
procedure that is sensitive to small changes in the process is discussed on another page.

For a Shewhart control procedure, the average and standard deviation of historical check
standard values are the parameters of interest: The check standard values are denoted by

    C_1, C_2, ..., C_K .

The baseline is the process average, which is computed from the check standard values as

    C_bar = (1/K) * (C_1 + ... + C_K) .

The process standard deviation is

    s = sqrt[ (1/(K-1)) * sum_{k=1}^{K} (C_k - C_bar)^2 ]

with (K - 1) degrees of freedom.


The control limits depend on the t-distribution and the degrees of freedom in the process
standard deviation: If the process average and standard deviation have been computed from
historical data, the upper and lower control limits are:

    UCL = C_bar + t_{1-alpha/2, K-1} * s
    LCL = C_bar - t_{1-alpha/2, K-1} * s

with t_{1-alpha/2, K-1} denoting the upper critical value from the t-table with v = (K - 1)
degrees of freedom.

Run software macro for computing the t-factor: Dataplot can compute the value of the
t-statistic. For a conservative case with alpha = 0.05 and K = 6, the commands

    let alphau = 1 - 0.05/2
    let k = 6
    let v1 = k-1
    let t = tppf(alphau, v1)

return the following value:

    THE COMPUTED VALUE OF THE CONSTANT T = 0.2570583E+01

Simplification for large degrees of freedom: It is standard practice to use a value of 3 instead
of a critical value from the t-table, given the process standard deviation has large degrees of
freedom, say, v > 15.

The control procedure is invoked in real time, and a failure implies that the current
calibration should be rejected: The control procedure compares the check standard value, C, from
each calibration run with the upper and lower control limits. This procedure should be
implemented in real time and does not necessarily require a graphical presentation. The check
standard value can be compared algebraically with the control limits. The calibration run is
judged to be out of control if either

    C > UCL    or    C < LCL .

Actions to be taken: If the check standard value exceeds one of the control limits, the process
is judged to be out of control and the current calibration run is rejected. The best strategy in
this situation is to repeat the calibration to see if the failure was a chance occurrence. Check
standard values that remain in control, especially over a period of time, provide confidence
that no new biases have been introduced into the measurement process and that the long-term
variability of the process has not changed.

Out-of-control signals that recur require investigation: Out-of-control signals, particularly if
they recur, can be symptomatic of one of the following conditions:
  ● Change or damage to the reference standard(s)
  ● Change or damage to the check standard
  ● Change in the long-term variability of the calibration process
For more guidance, see Remedies and strategies for dealing with out-of-control signals.

Caution - be sure to plot the data: If the tests for control are carried out algebraically, it
is recommended that, at regular intervals, the check standard values be plotted against time to
check for drift or anomalies in the measurement process.
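A minimal Python sketch of the baseline and control limits for a check standard (illustrative
only; the choice between the t-factor and the factor 3 follows the large-degrees-of-freedom
simplification described above):

    import numpy as np
    from scipy import stats

    def shewhart_limits(check_values, alpha=0.05, use_3sigma=None):
        """Baseline and control limits computed from historical check standard values."""
        c = np.asarray(check_values, dtype=float)
        k = c.size
        cbar = c.mean()
        s = c.std(ddof=1)                      # (K - 1) degrees of freedom
        if use_3sigma is None:
            use_3sigma = (k - 1) > 15          # large-df simplification described above
        factor = 3.0 if use_3sigma else stats.t.ppf(1 - alpha / 2, k - 1)
        return cbar, cbar - factor * s, cbar + factor * s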


2.3.5.2.1. Example of Shewhart control chart for mass calibrations

Example of a control chart for mass calibrations at the kilogram level: Mass calibrations
usually start with the comparison of four kilogram standards using a high precision balance as a
comparator. Many of the measurements at the kilogram level that were made at NIST between 1975
and 1990 were made on balance #12 using a 1,1,1,1 calibration design. The restraint for this
design is the known average of two kilogram reference standards. The redundancy in the
calibration design produces individual estimates for the two test kilograms and the two
reference standards.

Check standard: There is no slot in the 1,1,1,1 design for an artifact check standard when the
first two kilograms are reference standards, the third kilogram is a test weight, and the fourth
is a summation of smaller weights that act as the restraint in the next series. Therefore, the
check standard is a computed difference between the values of the two reference standards as
estimated from the design. The convention with mass calibrations is to report the correction to
nominal, in this case the correction to 1000 g, as shown in the control charts below.

Need for monitoring: The kilogram check standard is monitored to check for:
  1. Long-term degradation in the calibration process
  2. Anomalous behavior at specific times

Monitoring technique for check standard values: Check standard values over time and many
calibrations are tracked and monitored using a Shewhart control chart. The database and control
limits are updated when needed, and check standard values for each calibration run in the next
cycle are compared with the control limits. In this case, the values from 117 calibrations
between 1975 and 1985 were averaged to obtain a baseline and process standard deviation with
v = 116 degrees of freedom. Control limits are computed with a factor of k = 3 to identify truly
anomalous data points.

Run the software macro for creating the Shewhart control chart: Dataplot commands for creating
the control chart are as follows:

    dimension 500 30
    skip 4
    read mass.dat t id y bal s ds
    let n = size y
    title mass check standard 41
    y1label micrograms
    x1label time in years
    xlimits 75 90
    let ybar=mean y subset t < 85
    let sd=standard deviation y subset t < 85
    let cc=ybar for i = 1 1 n
    let ul=cc+3*sd
    let ll=cc-3*sd
    characters * blank blank blank * blank blank blank
    lines blank solid dotted dotted blank solid dotted dotted
    plot y cc ul ll vs t
    .end of calculations

Control chart of measurements of kilogram check standard showing a change in the process after
1985:

    [Figure: Shewhart control chart of the kilogram check standard, 1975-1990; y-axis
     micrograms, x-axis time in years, with center line and control limits computed from the
     1975-1985 data.]

Interpretation of the control chart: The control chart shows only two violations of the control
limits. For those occasions, the calibrations were discarded and repeated. The configuration of
points is unacceptable if many points are close to a control limit and there is an unequal
distribution of data points on the two sides of the control chart -- indicating a change in
either:
  ● the process average, which may be related to a change in the reference standards, or
  ● the variability, which may be caused by a change in the instrument precision or may be the
    result of other factors on the measurement process.

Small changes only become obvious over time: Unfortunately, it takes time for the patterns in
the data to emerge because individual violations of the control limits do not necessarily point
to a permanent shift in the process. The Shewhart control chart is not powerful for detecting
small changes, say of the order of at most one standard deviation, which appears to be
approximately the case in this application. This level of change might seem insignificant, but
the calculation of uncertainties for the calibration process depends on the control limits.


Re-establishing the limits based on recent data and EWMA option: If the limits for the control
chart are re-calculated based on the data after 1985, the extent of the change is obvious.
Because the exponentially weighted moving average (EWMA) control chart is capable of detecting
small changes, it may be a better choice for a high precision process that is producing many
control values.

Run continuation of software macro for updating the Shewhart control chart: Dataplot commands
for updating the control chart are as follows:

    let ybar2=mean y subset t > 85
    let sd2=standard deviation y subset t > 85
    let n = size y
    let cc2=ybar2 for i = 1 1 n
    let ul2=cc2+3*sd2
    let ll2=cc2-3*sd2
    plot y cc ul ll vs t subset t < 85 and
    plot y cc2 ul2 ll2 vs t subset t > 85

Revised control chart based on check standard measurements after 1985:

    [Figure: Shewhart control chart with separate center lines and control limits computed
     from the pre-1985 and post-1985 check standard data.]


2.3.5.2.2. Example of EWMA control chart for mass calibrations

Small changes only become obvious over time: Unfortunately, it takes time for the patterns in
the data to emerge because individual violations of the control limits do not necessarily point
to a permanent shift in the process. The Shewhart control chart is not powerful for detecting
small changes, say of the order of at most one standard deviation, which appears to be the case
for the calibration data shown on the previous page. The EWMA (exponentially weighted moving
average) control chart is better suited for this purpose.

Explanation of EWMA statistic at the kilogram level: The exponentially weighted moving average
(EWMA) is a statistic for monitoring the process that averages the data in a way that gives less
and less weight to data as they are further removed in time from the current measurement. The
EWMA statistic at time t is computed recursively from individual data points, which are ordered
in time, as

    EWMA_{t+1} = lambda * Y_t + (1 - lambda) * EWMA_t ,

where the first EWMA statistic, EWMA_1, is the average of historical data.

Control mechanism for EWMA: The EWMA control chart can be made sensitive to small changes or a
gradual drift in the process by the choice of the weighting factor, lambda. A weighting factor
between 0.2 and 0.3 has been suggested for this purpose (Hunter), and 0.15 is another popular
choice.

Limits for the control chart: The target or center line for the control chart is the average of
historical data. The upper (UCL) and lower (LCL) limits are

    UCL = EWMA_1 + k * s * sqrt[ lambda / (2 - lambda) ]
    LCL = EWMA_1 - k * s * sqrt[ lambda / (2 - lambda) ] ,

where s is the standard deviation of the historical data; the function under the radical is a
good approximation to the component of the standard deviation of the EWMA statistic that is a
function of time; and k is the multiplicative factor, defined in the same manner as for the
Shewhart control chart, which is usually taken to be 3.
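A short Python sketch of the EWMA computation and limits described above (an illustration only;
the Handbook's example below uses Dataplot with lambda = 0.2, k = 3, and a process standard
deviation of s = 0.03065 mg):

    import numpy as np

    def ewma_chart(y, historical_mean, historical_sd, lam=0.2, k=3.0):
        """EWMA statistics and control limits for check standard values y in time order."""
        ewma = np.empty(len(y) + 1)
        ewma[0] = historical_mean             # first EWMA value = average of historical data
        for t, yt in enumerate(y):
            ewma[t + 1] = lam * yt + (1 - lam) * ewma[t]
        half_width = k * historical_sd * np.sqrt(lam / (2 - lam))
        return ewma[1:], historical_mean - half_width, historical_mean + half_width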


Example of EWMA chart for check standard data for kilogram calibrations showing multiple
violations of the control limits for the EWMA statistics: The target (average) and process
standard deviation are computed from the check standard data taken prior to 1985. The
computation of the EWMA statistic begins with the data taken at the start of 1985. In the
control chart below, the control data after 1985 are shown in green, and the EWMA statistics are
shown as black dots superimposed on the raw data. The control limits are calculated according to
the equation above, where the process standard deviation is s = 0.03065 mg and k = 3. The EWMA
statistics, and not the raw data, are of interest in looking for out-of-control signals. Because
the EWMA statistic is a weighted average, it has a smaller standard deviation than a single
control measurement, and, therefore, the EWMA control limits are narrower than the limits for a
Shewhart control chart.

    [Figure: EWMA control chart of the kilogram check standard; EWMA statistics superimposed
     on the post-1985 check standard values, with EWMA control limits.]

Run the software macro for creating the EWMA control chart: Dataplot commands for creating the
control chart are as follows:

    dimension 500 30
    skip 4
    read mass.dat x id y bal s ds
    let n = number y
    let cutoff = 85.0
    let tag = 2 for i = 1 1 n
    let tag = 1 subset x < cutoff
    xlimits 75 90
    let m = mean y subset tag 1
    let s = sd y subset tag 1
    let lambda = .2
    let fudge = sqrt(lambda/(2-lambda))
    let mean = m for i = 1 1 n
    let upper = mean + 3*fudge*s
    let lower = mean - 3*fudge*s
    let nm1 = n-1
    let start = 106
    let pred2 = mean
    loop for i = start 1 nm1
        let ip1 = i+1
        let yi = y(i)
        let predi = pred2(i)
        let predip1 = lambda*yi + (1-lambda)*predi
        let pred2(ip1) = predip1
    end loop
    char * blank * circle blank blank
    char size 2 2 2 1 2 2
    char fill on all
    lines blank dotted blank solid solid solid
    plot y mean versus x and
    plot y pred2 lower upper versus x subset x > cutoff

Interpretation of the control chart: The EWMA control chart shows many violations of the control
limits starting at approximately the mid-point of 1986. This pattern emerges because the process
average has actually shifted about one standard deviation, and the EWMA control chart is
sensitive to small changes.


2.3.6. Instrument calibration over a regime

Topics: This section discusses the creation of a calibration curve for calibrating instruments
(gauges) whose responses cover a large range. Topics are:
  ● Models for instrument calibration
  ● Data collection
  ● Assumptions
  ● Conditions that can invalidate the calibration procedure
  ● Data analysis and model validation
  ● Calibration of future measurements
  ● Uncertainties of calibrated values

Purpose of instrument calibration: Instrument calibration is intended to eliminate or reduce
bias in an instrument's readings over a range for all continuous values. For this purpose,
reference standards with known values for selected points covering the range of interest are
measured with the instrument in question. Then a functional relationship is established between
the values of the standards and the corresponding measurements. There are two basic situations.

Instruments which require correction for bias:
  ● The instrument reads in the same units as the reference standards. The purpose of the
    calibration is to identify and eliminate any bias in the instrument relative to the defined
    unit of measurement. For example, optical imaging systems that measure the width of lines
    on semiconductors read in micrometers, the unit of interest. Nonetheless, these instruments
    must be calibrated to values of reference standards if line width measurements across the
    industry are to agree with each other.

Instruments whose measurements act as surrogates for other measurements:
  ● The instrument reads in different units than the reference standards. The purpose of the
    calibration is to convert the instrument readings to the units of interest. An example is
    densitometer measurements that act as surrogates for measurements of radiation dosage. For
    this purpose, reference standards are irradiated at several dosage levels and then measured
    by radiometry. The same reference standards are measured by densitometer. The calibrated
    results of future densitometer readings on medical devices are the basis for deciding if
    the devices have been sterilized at the proper radiation level.

Basic steps for correcting the instrument for bias: The calibration method is the same for both
situations and requires the following basic steps:
  ● Selection of reference standards with known values to cover the range of interest.
  ● Measurements on the reference standards with the instrument to be calibrated.
  ● Functional relationship between the measured and known values of the reference standards
    (usually a least-squares fit to the data), called a calibration curve.
  ● Correction of all measurements by the inverse of the calibration curve.

Schematic example of a calibration curve and resulting value: A schematic explanation is
provided by the figure below for load cell calibration. The load cell measurements (shown as *)
are plotted on the y-axis against the corresponding values of known load shown on the x-axis.

A quadratic fit to the load cell data produces the calibration curve that is shown as the solid
line. For a future measurement with the load cell, Y' = 1.344 on the y-axis, a dotted line is
drawn through Y' parallel to the x-axis. At the point where it intersects the calibration curve,
another dotted line is drawn parallel to the y-axis. Its point of intersection with the x-axis
at X' = 13.417 is the calibrated value.


    [Figure: schematic calibration curve for the load cell, showing the quadratic fit to the
     calibration data and the correction of a future measurement Y' = 1.344 to the calibrated
     value X' = 13.417.]


2.3.6.1. Models for instrument calibration

Notation: The following notation is used in this chapter in discussing models for calibration
curves.
  ● Y denotes a measurement on a reference standard.
  ● X denotes the known value of a reference standard.
  ● epsilon denotes measurement error.
  ● a, b and c denote coefficients to be determined.

Possible forms for calibration curves: There are several models for calibration curves that can
be considered for instrument calibration. They fall into the following classes:
  ● Linear:     Y = a + b*X + epsilon
  ● Quadratic:  Y = a + b*X + c*X^2 + epsilon
  ● Power:      Y = a*X^b, with error that is proportional to the response (see the discussion
                of the power model below)
  ● Non-linear: Y given by a function that is non-linear in the coefficients


Special case of linear model - no calibration required: An instrument requires no calibration if

    a = 0 and b = 1 ,

i.e., if measurements on the reference standards agree with their known values given an
allowance for measurement error, the instrument is already calibrated. Guidance on collecting
data, estimating and testing the coefficients is given on other pages.

Advantages of the linear model: The linear model (ISO 11095) is widely applied to instrument
calibration because it has several advantages over more complicated models.
  ● Computation of coefficients and standard deviations is easy.
  ● Correction for bias is easy.
  ● There is often a theoretical basis for the model.
  ● The analysis of uncertainty is tractable.

Warning on excluding the intercept term from the model: It is often tempting to exclude the
intercept, a, from the model because a zero stimulus on the x-axis should lead to a zero
response on the y-axis. However, the correct procedure is to fit the full model and test for the
significance of the intercept term.

Quadratic model and higher order polynomials: Responses of instruments or measurement systems
which cannot be linearized, and for which no theoretical model exists, can sometimes be
described by a quadratic model (or higher-order polynomial). An example is a load cell where
force exerted on the cell is a non-linear function of load.

Disadvantages of quadratic models: Disadvantages of quadratic and higher-order polynomials are:
  ● They may require more reference standards to capture the region of curvature.
  ● There is rarely a theoretical justification; however, the adequacy of the model can be
    tested statistically.
  ● The correction for bias is more complicated than for the linear model.
  ● The uncertainty analysis is difficult.

Warning: A plot of the data, although always recommended, is not sufficient for identifying the
correct model for the calibration curve. Instrument responses may not appear non-linear over a
large interval. If the response and the known values are in the same units, differences from the
known values should be plotted versus the known values.

Power model treated as a linear model: The power model is appropriate when the measurement error
is proportional to the response rather than being additive. It is frequently used for
calibrating instruments that measure dosage levels of irradiated materials.

The power model is a special case of a non-linear model that can be linearized by a natural
logarithm transformation, so that the model to be fit to the data is of the familiar linear form

    W = a' + b*Z + e ,

where W, Z and e are the transforms of the variables, Y, X and the measurement error,
respectively, and a' is the natural logarithm of a. (A sketch of a fit on this log scale is
shown after the list below.)

Non-linear models and their limitations: Instruments whose responses are not linear in the
coefficients can sometimes be described by non-linear models. In some cases, there are
theoretical foundations for the models; in other cases, the models are developed by trial and
error. Two classes of non-linear functions that have been shown to have practical value as
calibration functions are:
  1. Exponential
  2. Rational
Non-linear models are an important class of calibration models, but they have several
significant limitations.
  ● The model itself may be difficult to ascertain and verify.
  ● There can be severe computational difficulties in estimating the coefficients.
  ● Correction for bias cannot be applied algebraically and can only be approximated by
    interpolation.
  ● Uncertainty analysis is very difficult.
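Returning to the power model above, a minimal sketch of the fit on the log scale (Python used
for illustration; it assumes the log-scale errors are approximately additive with constant
variance, which is what the transformation is intended to achieve):

    import numpy as np

    def fit_power_model(x, y):
        """Fit Y = a * X**b with multiplicative error by least squares on the log scale."""
        z = np.log(np.asarray(x, dtype=float))
        w = np.log(np.asarray(y, dtype=float))
        b, a_prime = np.polyfit(z, w, 1)       # W = a' + b*Z; slope returned first
        return np.exp(a_prime), b              # back-transform a' to a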


Example of an exponential function: Instruments for measuring the ultrasonic response of
reference standards with various levels of defects (holes) that are submerged in a fluid are
described by an exponential function.

Example of a rational function: Scanning electron microscope measurements of line widths on
semiconductors are described by a rational function (Kirby).


2.3.6.2. Data collection

Data collection: The process of collecting data for creating the calibration curve is critical
to the success of the calibration program. General rules for designing calibration experiments
apply, and guidelines that are adequate for the calibration models in this chapter are given
below.

Selection of reference standards: A minimum of five reference standards is required for a linear
calibration curve, and ten reference standards should be adequate for more complicated
calibration models.

The optimal strategy in selecting the reference standards is to space the reference standards at
points corresponding to equal increments on the y-axis, covering the range of the instrument.
Frequently, this strategy is not realistic because the person producing the reference materials
is often not the same as the person who is creating the calibration curve. Spacing the reference
standards at equal intervals on the x-axis is a good alternative.

Exception to the rule above - bracketing: If the instrument is not to be calibrated over its
entire range, but only over a very short range for a specific application, then it may not be
necessary to develop a complete calibration curve, and a bracketing technique (ISO 11095) will
provide satisfactory results. The bracketing technique assumes that the instrument is linear
over the interval of interest, and, in this case, only two reference standards are required --
one at each end of the interval.

Number of repetitions on each reference standard: A minimum of two measurements on each
reference standard is required, and four is recommended. The repetitions should be separated in
time by days or weeks. These repetitions provide the data for determining whether a candidate
model is adequate for calibrating the instrument.


2.3.6.3. Assumptions for instrument calibration

Assumption regarding reference values: The basic assumption regarding the reference values of
artifacts that are measured in the calibration experiment is that they are known without error.
In reality, this condition is rarely met because these values themselves usually come from a
measurement process. Systematic errors in the reference values will always bias the results, and
random errors in the reference values can bias the results.

Rule of thumb: It has been shown by Bruce Hoadley, in an internal NIST publication, that the
best way to mitigate the effect of random fluctuations in the reference values is to plan for a
large spread of values on the x-axis relative to the precision of the instrument.

Assumptions regarding measurement errors: The basic assumptions regarding measurement errors
associated with the instrument are that they are:
  ● free from outliers
  ● independent
  ● of equal precision
  ● from a normal distribution.


2.3.6.4. What can go wrong with the calibration procedure

Calibration procedure may fail to eliminate bias: There are several circumstances where the
calibration curve will not reduce or eliminate bias as intended. Some are discussed on this
page. A critical exploratory analysis of the calibration data should expose such problems.

Lack of precision: Poor instrument precision or unsuspected day-to-day effects may result in
standard deviations that are large enough to jeopardize the calibration. There is nothing
intrinsic to the calibration procedure that will improve precision, and the best strategy,
before committing to a particular instrument, is to estimate the instrument's precision in the
environment of interest to decide if it is good enough for the precision required.

Outliers in the calibration data: Outliers in the calibration data can seriously distort the
calibration curve, particularly if they lie near one of the endpoints of the calibration
interval.
  ● Isolated outliers (single points) should be deleted from the calibration data.
  ● An entire day's results which are inconsistent with the other data should be examined and
    rectified before proceeding with the analysis.

Systematic differences among operators: It is possible for different operators to produce
measurements with biases that differ in sign and magnitude. This is not usually a problem for
automated instrumentation, but for instruments that depend on line of sight, results may differ
significantly by operator. To diagnose this problem, measurements by different operators on the
same artifacts are plotted and compared. Small differences among operators can be accepted as
part of the imprecision of the measurement process, but large systematic differences among
operators require resolution. Possible solutions are to retrain the operators or maintain
separate calibration curves by operator.

Lack of system control: The calibration procedure, once established, relies on the instrument
continuing to respond in the same way over time. If the system drifts or takes unpredictable
excursions, the calibrated values may not be properly corrected for bias, and depending on the
direction of change, the calibration may further degrade the accuracy of the measurements. To
assure that future measurements are properly corrected for bias, the calibration procedure
should be coupled with a statistical control procedure for the instrument.

Example of differences among repetitions in the calibration data: An important point, but one
that is rarely considered, is that there can be differences in responses from repetition to
repetition that will invalidate the analysis. A plot of the aggregate of the calibration data
may not identify changes in the instrument response from day to day. What is needed is a plot of
the fine structure of the data that exposes any day-to-day differences in the calibration data.

Warning - calibration can fail because of day-to-day changes: A straight-line fit to the
aggregate data will produce a 'calibration curve'. However, if straight lines fit separately to
each day's measurements show very disparate responses, the instrument, at best, will require
calibration on a daily basis and, at worst, may be sufficiently lacking in control as to be
unusable.


2.3.6.4.1. Example of day-to-day changes in calibration

Calibration data over 4 days: Line width measurements on 10 NIST reference standards were made
with an optical imaging system on each of four days. The four data points for each reference
value appear to overlap in the plot because of the wide spread in reference values relative to
the precision. The plot suggests that a linear calibration line is appropriate for calibrating
the imaging system.

This plot shows measurements made on 10 reference materials repeated on four days, with the 4
points for each day overlapping:

    [Figure: line width measurements plotted against REFERENCE VALUES (µm) for the four days.]

This plot shows the differences between each measurement and the corresponding reference value.
Because days are not identified, the plot gives no indication of problems in the control of the
imaging system from day to day:

    [Figure: differences from reference values plotted against REFERENCE VALUES (µm), days not
     identified.]

This plot, with linear calibration lines fit to each day's measurements individually, shows how
the response of the imaging system changes dramatically from day to day. Notice that the slope
of the calibration line goes from positive on day 1 to negative on day 3.


    [Figure: differences from reference values plotted against REFERENCE VALUES (µm), with a
     separate linear calibration line fit to each day's measurements.]

Interpretation of calibration findings: Given the lack of control for this measurement process,
any calibration procedure built on the average of the calibration data will fail to properly
correct the system on some days and invalidate resulting measurements. There is no good solution
to this problem except daily calibration.


2.3.6.5. Data analysis and model validation

First step - plot the calibration data: If the model for the calibration curve is not known from
theoretical considerations or experience, it is necessary to identify and validate a model for
the calibration curve. To begin this process, the calibration data are plotted as a function of
known values of the reference standards; this plot should suggest a candidate model for
describing the data. A linear model should always be a consideration. If the responses and their
known values are in the same units, a plot of differences between responses and known values is
more informative than a plot of the data for exposing structure in the data.

Warning - regarding statistical software: Once an initial model has been chosen, the
coefficients in the model are estimated from the data using a statistical software package. It
is impossible to over-emphasize the importance of using reliable and documented software for
this analysis.

Output required from a software package: With the exception of non-linear models, the software
package will use the method of least squares for estimating the coefficients. The software
package should also be capable of performing a 'weighted' fit for situations where errors of
measurement are non-constant over the calibration interval. The choice of weights is usually the
responsibility of the user. The software package should, at the minimum, provide the following
information:
  ● Coefficients of the calibration curve
  ● Standard deviations of the coefficients
  ● Residual standard deviation of the fit
  ● F-ratio for goodness of fit (if there are repetitions on the y-axis at each reference
    value)

Typical analysis of a quadratic fit: The following output is from the statistical software
package, Dataplot, where load cell measurements are modeled as a quadratic function of known
loads. There are 3 repetitions at each load level for a total of 33 measurements.


Run software macro: The commands

    read loadcell.dat x y
    quadratic fit y x

return the following output:

F-ratio for judging the adequacy of the model:

    LACK OF FIT F-RATIO = 0.3482 = THE 6.3445% POINT OF THE
    F DISTRIBUTION WITH 8 AND 22 DEGREES OF FREEDOM

Coefficients and their standard deviations and associated t values:

    COEFFICIENT ESTIMATES           ST. DEV.       T VALUE
    1   a   -0.183980E-04          (0.2450E-04)      -0.75
    2   b    0.100102              (0.4838E-05)    0.21E+05
    3   c    0.703186E-05          (0.2013E-06)         35.

    RESIDUAL STANDARD DEVIATION = 0.0000376353
    RESIDUAL DEGREES OF FREEDOM = 30

Note: The T VALUE for a coefficient in the table above is the estimate of the coefficient
divided by its standard deviation.

The F-ratio is used to test the goodness of the fit to the data: The F-ratio provides
information on the model as a good descriptor of the data. The F-ratio is compared with a
critical value from the F-table. An F-ratio smaller than the critical value indicates that all
significant structure has been captured by the model.

F-ratio < 1 always indicates a good fit: For the load cell analysis, a plot of the data suggests
a linear fit. However, the linear fit gives a very large F-ratio. For the quadratic fit, the
F-ratio = 0.3482 with v1 = 8 and v2 = 22 degrees of freedom, which is well below the critical
value from the F-table, indicating that the quadratic function is sufficient for describing the
data. A fact to keep in mind is that an F-ratio < 1 does not need to be checked against a
critical value; it always indicates a good fit to the data.

Note: Dataplot reports a probability associated with the F-ratio (6.3445%), where a probability
> 95% indicates an F-ratio that is significant at the 5% level. Other software may report in
other ways; therefore, it is necessary to check the interpretation for each package.

The t-values are used to test the significance of individual coefficients: The t-values can be
compared with critical values from a t-table. However, for a test at the 5% significance level,
a t-value < 2 is a good indicator of non-significance. The t-value for the intercept term, a, is
< 2, indicating that the intercept term is not significantly different from zero. The t-values
for the linear and quadratic terms are significant, indicating that these coefficients are
needed in the model. If the intercept is dropped from the model, the analysis is repeated to
obtain new estimates for the coefficients, b and c.

Residual standard deviation: The residual standard deviation estimates the standard deviation of
a single measurement with the load cell.

Further considerations and tests of assumptions: The residuals (differences between the
measurements and their fitted values) from the fit should also be examined for outliers and
structure that might invalidate the calibration curve. They are also a good indicator of whether
basic assumptions of normality and equal precision for all measurements are valid.

If the initial model proves inappropriate for the data, a strategy for improving the model is
followed.
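For readers who want to reproduce this kind of output outside Dataplot, a minimal Python sketch
of the quadratic least-squares fit (coefficients, their standard deviations, and the residual
standard deviation; the lack-of-fit F-ratio is not computed in this sketch). The inputs could be
the load cell data listed in the next section:

    import numpy as np

    def quadratic_calibration_fit(x, y):
        """Least-squares fit of the calibration model Y = a + b*X + c*X^2."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        design = np.column_stack([np.ones_like(x), x, x**2])
        coef, _, _, _ = np.linalg.lstsq(design, y, rcond=None)    # coef = [a, b, c]
        residuals = y - design @ coef
        dof = len(y) - 3
        resid_sd = np.sqrt(residuals @ residuals / dof)
        # standard deviations of the coefficients from s^2 * (X'X)^{-1}
        cov = resid_sd**2 * np.linalg.inv(design.T @ design)
        return coef, np.sqrt(np.diag(cov)), resid_sd, dof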


2.3.6.5.1. Data on load cell #32066

Three repetitions on a load cell at eleven known loads:

      X          Y
      2.      0.20024
      2.      0.20016
      2.      0.20024
      4.      0.40056
      4.      0.40045
      4.      0.40054
      6.      0.60087
      6.      0.60075
      6.      0.60086
      8.      0.80130
      8.      0.80122
      8.      0.80127
     10.      1.00173
     10.      1.00164
     10.      1.00173
     12.      1.20227
     12.      1.20218
     12.      1.20227
     14.      1.40282
     14.      1.40278
     14.      1.40279
     16.      1.60344
     16.      1.60339
     16.      1.60341
     18.      1.80412
     18.      1.80409
     18.      1.80411
     20.      2.00485
     20.      2.00481
     20.      2.00483
     21.      2.10526
     21.      2.10524
     21.      2.10524


2.3.6.6. Calibration of future measurements

Purpose: The purpose of creating the calibration curve is to correct future measurements made
with the same instrument to the correct units of measurement. The calibration curve can be
applied many, many times before it is discarded or reworked as long as the instrument remains in
statistical control. Chemical measurements are an exception where frequently the calibration
curve is used only for a single batch of measurements, and a new calibration curve is created
for the next batch.

Notation: The notation for this section is as follows:
  ● Y' denotes a future measurement.
  ● X' denotes the associated calibrated value.
  ● The estimates of the coefficients a, b, c are denoted a-hat, b-hat, c-hat.
  ● Their standard deviations are denoted s_a, s_b, s_c.

Procedure: To apply a correction to a future measurement, Y', to obtain the calibrated value X'
requires the inverse of the calibration curve.

Linear calibration line: The inverse of the calibration line for the linear model

    Y = a + b*X + epsilon

gives the calibrated value

    X' = (Y' - a-hat) / b-hat .

Tests for the intercept and slope of calibration curve -- if both conditions hold, no
calibration is needed: Before correcting for the calibration line by the equation above, the
intercept and slope should be tested for a = 0 and b = 1. If both

    |a-hat| / s_a < t_{1-alpha/2, v}    and    |b-hat - 1| / s_b < t_{1-alpha/2, v} ,

there is no need for calibration. If, on the other hand, only the test for a = 0 fails, the
error is constant; if only the test for b = 1 fails, the errors are related to the size of the
reference standards.

Table look-up for t-factor: The factor, t_{1-alpha/2, v}, is found in the t-table where v is the
degrees of freedom for the residual standard deviation from the calibration curve, and alpha is
chosen to be small, say, 0.05.

Quadratic calibration curve requires a root: The inverse of the calibration curve for the
quadratic model

    Y = a + b*X + c*X^2 + epsilon

requires a root

    X' = [ -b-hat + sqrt( b-hat^2 - 4*c-hat*(a-hat - Y') ) ] / (2*c-hat) .

The correct root (+ or -) can usually be identified from practical considerations.
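A minimal Python sketch of the corrections described above (illustrative only; the function
names are assumptions). The commented example uses the load cell coefficients reported earlier
and reproduces, to within rounding, the calibrated value in the schematic example:

    import numpy as np

    def calibrate_linear(y_new, a, b):
        """Calibrated value from the inverse of the linear calibration line."""
        return (y_new - a) / b

    def calibrate_quadratic(y_new, a, b, c):
        """Calibrated value from the positive root of the quadratic calibration curve."""
        return (-b + np.sqrt(b**2 - 4.0 * c * (a - y_new))) / (2.0 * c)

    # calibrate_quadratic(1.344, -0.183980e-4, 0.100102, 0.703186e-5) gives a load of
    # about 13.4, matching the X' value illustrated in the schematic example.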


Power curve: The inverse of the calibration curve for the power model

    Y = a * X^b

gives the calibrated value

    X' = exp[ (ln(Y') - a') / b ] ,

where b and a', the natural logarithm of a, are estimated from the power model transformed to a
linear function.

Non-linear and other calibration curves: For more complicated models, the inverse for the
calibration curve is obtained by interpolation from a graph of the function or from predicted
values of the function.


2.3.6.7. Uncertainties of calibrated values

Purpose: The purpose is to quantify the uncertainty of a 'future' result that has been corrected
by the calibration curve. In principle, the uncertainty quantifies any possible difference
between the calibrated value and its reference base (which normally depends on reference
standards).

Explanation in terms of reference artifacts: Measurements of interest are future measurements on
unknown artifacts, but one way to look at the problem is to ask: If a measurement is made on one
of the reference standards and the calibration curve is applied to obtain the calibrated value,
how well will this value agree with the 'known' value of the reference standard?

Difficulties: The answer is not easy because of the intersection of two uncertainties associated
with
  1. the calibration curve itself, because of limited data, and
  2. the 'future' measurement.
If the calibration experiment were to be repeated, a slightly different calibration curve would
result even for a system in statistical control. An exposition of the intersection of the two
uncertainties is given for the calibration of proving rings (Hockersmith and Ku).

ISO approach to uncertainty can be based on check standards or propagation of error: General
procedures for computing an uncertainty based on ISO principles of uncertainty analysis are
given in the chapter on modeling.

Type A uncertainties for calibrated values from calibration curves can be derived from
  ● check standard values
  ● propagation of error.

An example of type A uncertainties of calibrated values from a linear calibration curve is
analyzed from measurements on linewidth check standards. Comparison of the uncertainties from
check standards and propagation of error for the linewidth calibration data are also
illustrated.
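Continuing the inversion sketches above, the corresponding correction for the power curve
(illustrative only; a_prime and b are the log-scale estimates described earlier):

    import numpy as np

    def calibrate_power(y_new, a_prime, b):
        """Calibrated value from the inverse of the power-model calibration curve,
        where a_prime = ln(a) and b are estimated on the log scale."""
        return np.exp((np.log(y_new) - a_prime) / b)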


An example of the derivation of propagation of error type A uncertainties for calibrated values
from a quadratic calibration curve for load cells is discussed on the next page.


2.3.6.7.1. Uncertainty for quadratic calibration using propagation of error

Propagation of error for uncertainty of calibrated values of load cells: The purpose of this
page is to show the propagation of error for calibrated values of a load cell based on a
quadratic calibration curve where the model for instrument response is

    Y = a + b*X + c*X^2 + epsilon .

The calibration data are instrument responses at known loads (psi), and estimates of the
quadratic coefficients, a, b, c, and their associated standard deviations are shown with the
analysis.

A graph of the calibration curve showing a measurement Y' corrected to X', the proper load
(psi), is shown below.


    [Figure: quadratic calibration curve for the load cell, showing the correction of a future
     measurement Y' to the calibrated load X' (psi).]

Uncertainty of the calibrated value X' can be evaluated using software capable of algebraic
representation: The uncertainty to be evaluated is the uncertainty of the calibrated value, X',
computed for any future measurement, Y', made with the calibrated instrument, where

    X' = [ -b + sqrt( b^2 - 4*c*(a - Y') ) ] / (2*c) .

Propagation of error using Mathematica: The analysis of uncertainty is demonstrated with the
software package, Mathematica (Wolfram). The format for inputting the solution to the quadratic
calibration curve in Mathematica is as follows:

    In[10]:=
    f = (-b + (b^2 - 4 c (a - Y))^(1/2))/(2 c)

Mathematica representation: The Mathematica representation is

    Out[10]=
    (-b + Sqrt[b^2 - 4 c (a - Y)]) / (2 c)

Partial derivatives: The partial derivatives are computed using the D function. For example, the
partial derivative of f with respect to Y is given by:

    In[11]:=
    dfdY=D[f, {Y,1}]

The Mathematica representation is:

    Out[11]=
    1 / Sqrt[b^2 - 4 c (a - Y)]

Partial derivatives with respect to a, b, c: The other partial derivatives are computed
similarly.

    In[12]:=
    dfda=D[f, {a,1}]

    Out[12]=
    -(1 / Sqrt[b^2 - 4 c (a - Y)])

    In[13]:=
    dfdb=D[f,{b,1}]


2.3.6.7.1. Uncertainty for quadratic calibration using propagation of error 2.3.6.7.1. Uncertainty for quadratic calibration using propagation of error

(-1 + b/Sqrt[b^2 - 4 c (a - Y)])/(2 c)

In[14]:=
dfdc=D[f, {c,1}]

Out[14]=
-(-b + Sqrt[b^2 - 4 c (a - Y)])/(2 c^2) - (a - Y)/(c Sqrt[b^2 - 4 c (a - Y)])

The variance of the calibrated value from propagation of error
The variance of X' is defined from propagation of error as follows:

In[15]:=
u2 =(dfdY)^2 (sy)^2 + (dfda)^2 (sa)^2 + (dfdb)^2 (sb)^2
    + (dfdc)^2 (sc)^2

The values of the coefficients and their respective standard deviations from the quadratic fit to the calibration curve are substituted in the equation. The standard deviation of the measurement, Y, may not be the same as the standard deviation from the fit to the calibration data if the measurements to be corrected are taken with a different system; here we assume that the instrument to be calibrated has a standard deviation that is essentially the same as the instrument used for collecting the calibration data and the residual standard deviation from the quadratic fit is the appropriate estimate.

In[16]:=
% /. a -> -0.183980 10^-4
% /. sa -> 0.2450 10^-4
% /. b -> 0.100102
% /. sb -> 0.4838 10^-5
% /. c -> 0.703186 10^-5
% /. sc -> 0.2013 10^-6
% /. sy -> 0.0000376353

Simplification of output
Intermediate outputs from Mathematica, which are not shown, are simplified. (Note that the % sign means an operation on the last output.) Then the standard deviation is computed as the square root of the variance.

In[17]:=
u2 = Simplify[%]
u=u2^.5

Out[24]=
Power[0.11834 (-1 + 0.100102/Sqrt[0.0100204 + 0.0000281274 Y])^2 +
  2.01667 10^-9/(0.0100204 + 0.0000281274 Y) +
  4.05217 10^-14 Power[1.01221 10^9 -
    1.01118 10^10 Sqrt[0.0100204 + 0.0000281274 Y] +
    142210. (0.000018398 + Y)/Sqrt[0.0100204 + 0.0000281274 Y], 2], 0.5]

Input for displaying standard deviations of calibrated values as a function of Y'
The standard deviation expressed above is not easily interpreted but it is easily graphed. A graph showing standard deviations of calibrated values, X', as a function of instrument response, Y', is displayed in Mathematica given the following input:

In[31]:= Plot[u,{Y,0,2.}]


Graph showing the standard deviations of calibrated values X' for given instrument responses Y' ignoring covariance terms in the propagation of error

Problem with propagation of error
The propagation of error shown above is not correct because it ignores the covariances among the coefficients, a, b, c. Unfortunately, some statistical software packages do not display these covariance terms with the other output from the analysis.

Covariance terms for loadcell data
The variance-covariance terms for the loadcell data set are shown below.

a    6.0049021e-10
b   -1.0759599e-10    2.3408589e-11
c    4.0191106e-12   -9.5051441e-13    4.0538705e-14

The diagonal elements are the variances of the coefficients, a, b, c, respectively, and the off-diagonal elements are the covariance terms.

Recomputation of the standard deviation of X'
To account for the covariance terms, the variance of X' is redefined by adding the covariance terms. Appropriate substitutions are made; the standard deviations are recomputed and graphed as a function of instrument response.

In[25]:=
u2 = u2 + 2 dfda dfdb sab2 + 2 dfda dfdc sac2 + 2 dfdb dfdc sbc2
% /. sab2 -> -1.0759599 10^-10
% /. sac2 -> 4.0191106 10^-12
% /. sbc2 -> -9.5051441 10^-13
u2 = Simplify[%]
u = u2^.5
Plot[u,{Y,0,2.}]

Graph
The graph below shows the correct estimates for the standard deviation of X' and gives a means for assessing the loss of accuracy that can be incurred by ignoring covariance terms. In this case, the uncertainty is reduced by including covariance terms, some of which are negative.

Graph showing the standard deviations of calibrated values, X', for given instrument responses, Y', with covariance terms included in the propagation of error
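For reference, the complete propagation-of-error variance assembled by the Mathematica steps above (the four variance terms of In[15] plus the three covariance corrections of In[25]) can be written as

\[
\operatorname{Var}(X') \approx
\left(\frac{\partial f}{\partial Y}\right)^{2} s_y^{2} +
\left(\frac{\partial f}{\partial a}\right)^{2} s_a^{2} +
\left(\frac{\partial f}{\partial b}\right)^{2} s_b^{2} +
\left(\frac{\partial f}{\partial c}\right)^{2} s_c^{2} +
2\frac{\partial f}{\partial a}\frac{\partial f}{\partial b}\,s_{ab} +
2\frac{\partial f}{\partial a}\frac{\partial f}{\partial c}\,s_{ac} +
2\frac{\partial f}{\partial b}\frac{\partial f}{\partial c}\,s_{bc},
\]

where the partial derivatives are evaluated at the estimated coefficients and \(s_{ab}, s_{ac}, s_{bc}\) are the covariance terms shown above.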


2. Measurement Process Characterization
2.3. Calibration
2.3.6. Instrument calibration over a regime
2.3.6.7. Uncertainties of calibrated values

2.3.6.7.2. Uncertainty for linear calibration using check standards

Check standards provide a mechanism for calculating uncertainties
The easiest method for calculating type A uncertainties for calibrated values from a calibration curve requires periodic measurements on check standards. The check standards, in this case, are artifacts at the lower, mid-point, and upper ends of the calibration curve. The measurements on the check standards are made in a way that randomly samples the output of the calibration procedure.

Calculation of check standard values
The check standard values are the raw measurements on the artifacts corrected by the calibration curve. The standard deviation of these values should estimate the uncertainty associated with calibrated values. The success of this method of estimating the uncertainties depends on adequate sampling of the measurement process.

Measurements corrected by a linear calibration curve
As an example, consider measurements of linewidths on photomask standards, made with an optical imaging system and corrected by a linear calibration curve. The three control measurements were made on reference standards with values at the lower, mid-point, and upper end of the calibration interval.

Run software macro for computing the standard deviation
Dataplot commands for computing the standard deviation from the control data are:

read linewid2.dat day position x y
let b0 = 0.2817
let b1 = 0.9767
let w = ((y - b0)/b1) - x
let sdcal = standard deviation w

Standard deviation of calibrated values
Dataplot returns the following standard deviation:

THE COMPUTED VALUE OF THE CONSTANT SDCAL = 0.62036246E-01

Comparison with propagation of error
The standard deviation, 0.062 µm, can be compared with a propagation of error analysis.

Other sources of uncertainty
In addition to the type A uncertainty, there may be other contributors to the uncertainty such as the uncertainties of the values of the reference materials from which the calibration curve was derived.
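The same calculation can be sketched in Python (this is not the Handbook's Dataplot macro; the x and y arrays below are illustrative placeholders in the day/position/x/y layout of linewid2.dat):

import numpy as np

# Placeholder control data: known values x and raw measurements y on the check standards
x = np.array([0.76, 3.29, 8.89, 0.76, 3.29, 8.89])
y = np.array([1.12, 3.49, 9.11, 0.99, 3.53, 8.89])

b0, b1 = 0.2817, 0.9767        # intercept and slope of the linear calibration curve
w = (y - b0) / b1 - x          # check standard values (corrected measurement minus known value)
sdcal = w.std(ddof=1)          # type A uncertainty of calibrated values
print(sdcal)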


2. Measurement Process Characterization
2.3. Calibration
2.3.6. Instrument calibration over a regime
2.3.6.7. Uncertainties of calibrated values

2.3.6.7.3. Comparison of check standard analysis and propagation of error

Propagation of error for the linear calibration
The analysis of uncertainty for calibrated values from a linear calibration line can be addressed using propagation of error. On the previous page, the uncertainty was estimated from check standard values.

Estimates from calibration data
The calibration data consist of 40 measurements with an optical imaging system on 10 line width artifacts. A linear fit to the data using the software package Omnitab (Omnitab 80) gives a calibration curve with the following estimates for the intercept, a, and the slope, b:

a   .23723513
b   .98839599
-------------------------------------------------------
RESIDUAL STANDARD DEVIATION = .038654864
BASED ON DEGREES OF FREEDOM 40 - 2 = 38

with the following variances and covariances:

a    2.2929900e-04
b   -2.9703502e-05    4.5966426e-06

Propagation of error using Mathematica
The propagation of error is accomplished with the following instructions using the software package Mathematica (Wolfram):

f=(y -a)/b
dfdy=D[f, {y,1}]
dfda=D[f, {a,1}]
dfdb=D[f,{b,1}]
u2 =dfdy^2 sy^2 + dfda^2 sa2 + dfdb^2 sb2 + 2 dfda dfdb sab2
% /. a-> .23723513
% /. b-> .98839599
% /. sa2 -> 2.2929900 10^-04
% /. sb2 -> 4.5966426 10^-06
% /. sab2 -> -2.9703502 10^-05
% /. sy -> .038654864
u2 = Simplify[%]
u = u2^.5
Plot[u, {y, 0, 12}]

Standard deviation of calibrated value X'
The output from Mathematica gives the standard deviation of a calibrated value, X', as a function of instrument response:

(0.00177907 - 0.0000638092 y + 4.81634 10^-6 y^2)^0.5

Graph showing standard deviation of calibrated value X' plotted as a function of instrument response Y' for a linear calibration
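The same computation can be mirrored in Python (not part of the Handbook; it simply reproduces the Mathematica propagation-of-error steps above):

import numpy as np

# Estimates from the linear calibration fit (Omnitab output above)
a, b = 0.23723513, 0.98839599
sa2, sb2 = 2.2929900e-04, 4.5966426e-06      # variances of a and b
sab2 = -2.9703502e-05                        # covariance of a and b
sy = 0.038654864                             # residual standard deviation

def u_calibrated(y):
    """Propagation-of-error standard deviation of X' = (y - a)/b."""
    dfdy, dfda, dfdb = 1.0 / b, -1.0 / b, -(y - a) / b**2
    var = dfdy**2 * sy**2 + dfda**2 * sa2 + dfdb**2 * sb2 + 2 * dfda * dfdb * sab2
    return np.sqrt(var)

print(u_calibrated(0.0))   # ~0.042, i.e. sqrt of the constant term 0.00177907 above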


Comparison of check standard analysis and propagation of error
Comparison of the analysis of check standard data, which gives a standard deviation of 0.062 µm, and propagation of error, which gives a maximum standard deviation of 0.042 µm, suggests that the propagation of error may underestimate the type A uncertainty. The check standard measurements are undoubtedly sampling some sources of variability that do not appear in the formal propagation of error formula.

2. Measurement Process Characterization
2.3. Calibration

2.3.7. Instrument control for linear calibration

Purpose
The purpose of the control program is to guarantee that the calibration of an instrument does not degrade over time.

Approach
This is accomplished by exercising quality control on the instrument's output in much the same way that quality control is exercised on components in a process, using a modification of the Shewhart control chart.

Check standards needed for the control program
For linear calibration, it is sufficient to control the end-points and the middle of the calibration interval to ensure that the instrument does not drift out of calibration. Therefore, check standards are required at three points; namely,
● at the lower-end of the regime
● at the mid-range of the regime
● at the upper-end of the regime

Data collection
One measurement is needed on each check standard for each checking period. It is advisable to start by making control measurements at the start of each day, or as often as experience dictates. The time between checks can be lengthened if the instrument continues to stay in control.

Definition of control value
To conform to the notation in the section on instrument corrections, X* denotes the known value of a standard, and X denotes the measurement value on the standard. A control value is defined as the difference between the measurement on the standard, corrected by the calibration curve, and the known value, X*.

If the calibration is perfect, control values will be randomly distributed about zero and fall within appropriate upper and lower limits on a control chart.


Calculation of control limits
The upper and lower control limits (Croarkin and Varner) are, respectively,

UCL = + s t*/b
LCL = - s t*/b

where s is the residual standard deviation of the fit from the calibration experiment, b is the slope of the linear calibration curve, and t* is the critical value defined below.

Values of t*
The critical value, t*, can be found in the t* table for p = 3; v is the degrees of freedom for the residual standard deviation; and alpha is equal to 0.05.

Run software macro for t*
Dataplot will compute the critical value of the t* statistic. For the case where alpha = 0.05, m = 3 and v = 38, say, the commands

let alpha = 0.05
let m = 3
let v = 38
let zeta = .5*(1 - exp(ln(1-alpha)/m))
let TSTAR = tppf(zeta, v)

return the following value:

THE COMPUTED VALUE OF THE CONSTANT TSTAR = 0.2497574E+01
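The same critical value can be obtained with a short Python/SciPy sketch (not the Handbook's Dataplot macro; note that Dataplot's tppf uses a lower-tail probability, so the upper-tail form below is used to get the positive value):

from scipy.stats import t

alpha, m, v = 0.05, 3, 38
zeta = 0.5 * (1 - (1 - alpha) ** (1 / m))   # same zeta as in the Dataplot macro
tstar = t.ppf(1 - zeta, v)                  # upper-tail critical value
print(tstar)                                # approximately 2.4976, matching TSTAR above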

Sensitivity to departure from linearity
If the control values fall between the lower and upper control limits defined above, the instrument is in statistical control. Statistical control in this context implies not only that measurements are repeatable within certain limits but also that instrument response remains linear. The test is sensitive to departures from linearity.

Control chart for a system corrected by a linear calibration curve
Measurements of line widths on photomask standards, made with an optical imaging system and corrected by a linear calibration curve, are shown as an example. The three control measurements were made on reference standards with values at the lower, mid-point, and upper end of the calibration interval.


2. Measurement Process Characterization
2.3. Calibration
2.3.7. Instrument control for linear calibration

2.3.7.1. Control chart for a linear calibration line

Purpose
Line widths of three photomask reference standards (at the low, middle and high end of the calibration line) were measured on six days with an optical imaging system that had been calibrated from similar measurements on 10 reference artifacts. The control values and limits for the control chart, which depend on the intercept and slope of the linear calibration line, monitor the calibration and linearity of the optical imaging system.

Initial calibration experiment
The initial calibration experiment consisted of 40 measurements (not shown here) on 10 artifacts and produced a linear calibration line with:
● Intercept = 0.2817
● Slope = 0.9767
● Residual standard deviation = 0.06826 micrometers
● Degrees of freedom = 38

Line width measurements made with an optical imaging system
The control measurements, Y, and known values, X, for the three artifacts at the upper, mid-range, and lower end (U, M, L) of the calibration line are shown in the following table:

DAY POSITION    X      Y
 1     L      0.76   1.12
 1     M      3.29   3.49
 1     U      8.89   9.11
 2     L      0.76   0.99
 2     M      3.29   3.53
 2     U      8.89   8.89
 3     L      0.76   1.05
 3     M      3.29   3.46
 3     U      8.89   9.02
 4     L      0.76   0.76
 4     M      3.29   3.75
 4     U      8.89   9.30
 5     L      0.76   0.96
 5     M      3.29   3.53
 5     U      8.89   9.05
 6     L      0.76   1.03
 6     M      3.29   3.52
 6     U      8.89   9.02

Run software macro for control chart
Dataplot commands for computing the control limits and producing the control chart are:

read linewid.dat day position x y
let b0 = 0.2817
let b1 = 0.9767
let s = 0.06826
let df = 38
let alpha = 0.05
let m = 3
let zeta = .5*(1 - exp(ln(1-alpha)/m))
let TSTAR = tppf(zeta, df)
let W = ((y - b0)/b1) - x
let n = size w
let center = 0 for i = 1 1 n
let LCL = CENTER + s*TSTAR/b1
let UCL = CENTER - s*TSTAR/b1
characters * blank blank blank
lines blank dashed solid solid
y1label control values
xlabel TIME IN DAYS
plot W CENTER UCL LCL vs day

Interpretation of control chart
The control measurements show no evidence of drift and are within the control limits except on the fourth day when all three control values are outside the limits. The cause of the problem on that day cannot be diagnosed from the data at hand, but all measurements made on that day, including workload items, should be rejected and remeasured.
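The control values and limits can also be reproduced with a short Python sketch (not the Handbook's Dataplot macro); it uses the data table and calibration constants above and flags day 4 as out of control:

import numpy as np

day = np.repeat(np.arange(1, 7), 3)                     # day index for each row of the table
x = np.tile([0.76, 3.29, 8.89], 6)                      # known values (L, M, U)
y = np.array([1.12, 3.49, 9.11, 0.99, 3.53, 8.89, 1.05, 3.46, 9.02,
              0.76, 3.75, 9.30, 0.96, 3.53, 9.05, 1.03, 3.52, 9.02])

b0, b1, s = 0.2817, 0.9767, 0.06826    # calibration intercept, slope, residual sd
tstar = 2.497574                       # critical value from the t* macro (v = 38)

w = (y - b0) / b1 - x                  # control values
limit = s * tstar / b1                 # half-width of the control limits about zero
for d in np.unique(day):
    out = np.abs(w[day == d]) > limit
    print(d, np.round(w[day == d], 3), "out of control" if out.any() else "in control")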


2. Measurement Process Characterization

2.4. Gauge R & R studies


The purpose of this section is to outline the steps that can be taken to
characterize the performance of gauges and instruments used in a
production setting in terms of errors that affect the measurements.
What are the issues for a gauge R & R study?
What are the design considerations for the study?
1. Artifacts
2. Operators
3. Gauges, parameter levels, configurations

How do we collect data for the study?


How do we quantify variability of measurements?
1. Repeatability
2. Reproducibility
3. Stability
How do we identify and analyze bias?
1. Resolution
2. Linearity
3. Hysteresis
4. Drift
5. Differences among gauges
6. Differences among geometries, configurations
Remedies and strategies
How do we quantify uncertainties of measurements made with the
gauges?


2. Measurement Process Characterization


2.4. Gauge R & R studies

2.4.1. What are the important issues?


Basic issues The basic issue for the study is the behavior of gauges in a particular
environment with respect to:
● Repeatability

● Reproducibility
● Stability
● Bias

Strategy The strategy is to conduct and analyze a study that examines the
behavior of similar gauges to see if:
● They exhibit different levels of precision;

● Instruments in the same environment produce equivalent results;

● Operators in the same environment produce equivalent results;

● Responses of individual gauges are affected by configuration or


geometry changes or changes in setup procedures.

Other goals Other goals are to:


● Test the resolution of instruments

● Test the gauges for linearity


● Estimate differences among gauges (bias)
● Estimate differences caused by geometries, configurations
● Estimate operator biases
● Incorporate the findings in an uncertainty budget


2. Measurement Process Characterization
2.4. Gauge R & R studies

2.4.2. Design considerations

Design considerations
Design considerations for a gauge study are choices of:
● Artifacts (check standards)
● Operators
● Gauges
● Parameter levels
● Configurations, etc.

Selection of artifacts or check standards
The artifacts for the study are check standards or test items of a type that are typically measured with the gauges under study. It may be necessary to include check standards for different parameter levels if the gauge is a multi-response instrument. The discussion of check standards should be reviewed to determine the suitability of available artifacts.

Number of artifacts
The number of artifacts for the study should be Q (Q > 2). Check standards for a gauge study are needed only for the limited time period (two or three months) of the study.

Selection of operators
Only those operators who are trained and experienced with the gauges should be enlisted in the study, with the following constraints:
● If there is a small number of operators who are familiar with the gauges, they should all be included in the study.
● If the study is intended to be representative of a large pool of operators, then a random sample of L (L > 2) operators should be chosen from the pool.
● If there is only one operator for the gauge type, that operator should make measurements on K (K > 2) days.

Selection of gauges
If there is only a small number of gauges in the facility, then all gauges should be included in the study. If the study is intended to represent a larger pool of gauges, then a random sample of I (I > 3) gauges should be chosen for the study.

Limit the initial study
If the gauges operate at several parameter levels (for example, frequencies), an initial study should be carried out at 1 or 2 levels before a larger study is undertaken. If there are differences in the way that the gauge can be operated, an initial study should be carried out for one or two configurations before a larger study is undertaken.


2. Measurement Process Characterization
2.4. Gauge R & R studies

2.4.3. Data collection for time-related sources of variability

Time-related analysis
The purpose of this page is to present several options for collecting data for estimating time-dependent effects in a measurement process.

Time intervals
The following levels of time-dependent errors are considered in this section based on the characteristics of many measurement systems and should be adapted to a specific measurement situation as needed.
1. Level-1 Measurements taken over a short time to capture the precision of the gauge
2. Level-2 Measurements taken over days (or other appropriate time increment)
3. Level-3 Measurements taken over runs separated by months

Time intervals
● Simple design for 2 levels of random error
● Nested design for 2 levels of random error
● Nested design for 3 levels of random error
In all cases, data collection and analysis are straightforward, and there is no reason to estimate interaction terms when dealing with time-dependent errors. Two levels should be sufficient for characterizing most measurement systems. Three levels are recommended for measurement systems where sources of error are not well understood and have not previously been studied.

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.3. Data collection for time-related sources of variability

2.4.3.1. Simple design

Constraints on time and resources
In planning a gauge study, particularly for the first time, it is advisable to start with a simple design and progress to more complicated and/or labor intensive designs after acquiring some experience with data collection and analysis. The design recommended here is appropriate as a preliminary study of variability in the measurement process that occurs over time. It requires about two days of measurements separated by about a month with two repetitions per day.

Relationship to 2-level and 3-level nested designs
The disadvantage of this design is that there is minimal data for estimating variability over time. A 2-level nested design and a 3-level nested design, both of which require measurements over time, are discussed on other pages.

Plan of action
Choose at least Q = 10 work pieces or check standards, which are essentially identical with respect to their expected responses to the measurement method. Measure each of the check standards twice with the same gauge, being careful to randomize the order of the check standards.

After about a month, repeat the measurement sequence, randomizing anew the order in which the check standards are measured.

Notation
Measurements on the check standards are designated Y(k,j), with the first index identifying the month of measurement and the second index identifying the repetition number.


Analysis of data
The level-1 standard deviation, which describes the basic precision of the gauge, is computed from the repetitions within each month, pooled over check standards and months, with v1 = 2Q degrees of freedom.

The level-2 standard deviation, which describes the variability of the measurement process over time, is computed from the monthly averages, with v2 = Q degrees of freedom.

Relationship to uncertainty for a test item
The standard deviation that defines the uncertainty for a single measurement on a test item, often referred to as the reproducibility standard deviation (ASTM), combines the time-dependent component and the gauge precision; the time-dependent component is obtained by subtracting the contribution of gauge precision from the level-2 variance. There may be other sources of uncertainty in the measurement process that must be accounted for in a formal analysis of uncertainty.

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.3. Data collection for time-related sources of variability

2.4.3.2. 2-level nested design

Check standard measurements for estimating time-dependent sources of variability
Measurements on a check standard are recommended for studying the effect of sources of variability that manifest themselves over time. Data collection and analysis are straightforward, and there is no reason to estimate interaction terms when dealing with time-dependent errors. The measurements can be made at one of two levels. Two levels should be sufficient for characterizing most measurement systems. Three levels are recommended for measurement systems for which sources of error are not well understood and have not previously been studied.

Time intervals in a nested design
The following levels are based on the characteristics of many measurement systems and should be adapted to a specific measurement situation as needed.
● Level-1 Measurements taken over a short term to estimate gauge precision
● Level-2 Measurements taken over days (or other appropriate time increment)

Definition of number of measurements at each level
The following symbols are defined for this chapter:
● Level-1 J (J > 1) repetitions
● Level-2 K (K > 2) days

Schedule for making measurements
A schedule for making check standard measurements over time (once a day, twice a week, or whatever is appropriate for sampling all conditions of measurement) should be set up and adhered to. The check standard measurements should be structured in the same way as values reported on the test items. For example, if the reported values are averages of two repetitions made within 5 minutes of each other, the check standard values should be averages of the two measurements made in the same manner.

Exception
One exception to this rule is that there should be at least J = 2 repetitions per day. Without this redundancy, there is no way to check on the short-term precision of the measurement system.


Depiction of schedule for making check standard measurements with 4 repetitions per day over K days on the surface of a silicon wafer
(Figure: K days - 4 repetitions; 2-level design for check standard measurements.)

Operator considerations
The measurements should be taken with ONE operator. Operator is not usually a consideration with automated systems. However, systems that require decisions regarding line edge or other feature delineations may be operator dependent.

Case Study: Resistivity check standard
Results should be recorded along with pertinent environmental readings and identifications for significant factors. The best way to record this information is in one file with one line or row (on a spreadsheet) of information in fixed fields for each check standard measurement.

Data analysis of gauge precision
The check standard measurements are represented by Y(k,j), for the jth repetition on the kth day. The mean for the kth day and the (level-1) standard deviation for gauge precision, with v = J - 1 degrees of freedom, are computed from the J repetitions for that day.

Pooling increases the reliability of the estimate of the standard deviation
The pooled level-1 standard deviation, with v = K(J - 1) degrees of freedom, is obtained by pooling the daily level-1 standard deviations over the K days.

Data analysis of process (level-2) standard deviation
The level-2 standard deviation of the check standard represents the process variability. It is computed with v = K - 1 degrees of freedom from the K daily means.

Relationship to uncertainty for a test item
The standard deviation that defines the uncertainty for a single measurement on a test item, often referred to as the reproducibility standard deviation (ASTM), combines the time-dependent component and the gauge precision. There may be other sources of uncertainty in the measurement process that must be accounted for in a formal analysis of uncertainty.
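Written out in symbols (a reconstruction consistent with the stated degrees of freedom and with the pooling formulas used later in Section 2.4.4; Y_{kj} is the jth repetition on the kth day):

\[
\bar{Y}_{k\cdot}=\frac{1}{J}\sum_{j=1}^{J}Y_{kj},\qquad
s_{1k}=\sqrt{\frac{1}{J-1}\sum_{j=1}^{J}\bigl(Y_{kj}-\bar{Y}_{k\cdot}\bigr)^{2}}
\]
\[
s_{1}=\sqrt{\frac{1}{K}\sum_{k=1}^{K}s_{1k}^{2}}\;\;(\nu=K(J-1)),\qquad
s_{2}=\sqrt{\frac{1}{K-1}\sum_{k=1}^{K}\bigl(\bar{Y}_{k\cdot}-\bar{Y}_{\cdot\cdot}\bigr)^{2}}\;\;(\nu=K-1)
\]
\[
s_{R}=\sqrt{s_{\mathrm{days}}^{2}+s_{1}^{2}},\qquad
s_{\mathrm{days}}^{2}=s_{2}^{2}-\frac{s_{1}^{2}}{J}
\]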


2. Measurement Process Characterization


2.4. Gauge R & R studies
2.4.3. Data collection for time-related sources of variability

2.4.3.3. 3-level nested design


Advantages A nested design is recommended for studying the effect of sources of
of nested variability that manifest themselves over time. Data collection and
designs analysis are straightforward, and there is no reason to estimate
interaction terms when dealing with time-dependent errors. Nested
designs can be run at several levels. Three levels are recommended for
measurement systems where sources of error are not well understood
and have not previously been studied.

Time intervals in a nested design
The following levels are based on the characteristics of many measurement systems and should be adapted to a specific measurement situation as needed. A typical design is shown below.
● Level-1 Measurements taken over a short time to capture the precision of the gauge
● Level-2 Measurements taken over days (or other appropriate time increment)
● Level-3 Measurements taken over runs separated by months

Definition of number of measurements at each level
The following symbols are defined for this chapter:
● Level-1 J (J > 1) repetitions
● Level-2 K (K > 2) days
● Level-3 L (L > 2) runs

For the design shown above, J = 4, K = 3, and L = 2. The design can be repeated for:
● Q (Q > 2) check standards
● I (I > 3) gauges if the intent is to characterize several similar gauges

2-level nested design
The design can be truncated at two levels to estimate repeatability and day-to-day variability if there is no reason to estimate longer-term effects. The analysis remains the same through the first two levels.


Advantages
This design has advantages in ease of use and computation. The number of repetitions at each level need not be large because information is being gathered on several check standards.

Operator The measurements should be made with ONE operator. Operator is


considerations not usually a consideration with automated systems. However,
systems that require decisions regarding line edge or other feature
delineations may be operator dependent. If there is reason to believe
that results might differ significantly by operator, 'operators' can be
substituted for 'runs' in the design. Choose L (L > 2) operators at
random from the pool of operators who are capable of making
measurements at the same level of precision. (Conduct a small
experiment with operators making repeatability measurements, if
necessary, to verify comparability of precision among operators.)
Then complete the data collection and analysis as outlined. In this
case, the level-3 standard deviation estimates operator effect.

Caution Be sure that the design is truly nested; i.e., that each operator reports
results for the same set of circumstances, particularly with regard to
day of measurement so that each operator measures every day, or
every other day, and so forth.

Randomize on Randomize with respect to gauges for each check standard; i.e.,
gauges choose the first check standard and randomize the gauges; choose the
second check standard and randomize gauges; and so forth.

Record results Record the average and standard deviation from each group of J
in a file repetitions by:
● check standard

● gauge

Case Study: Results should be recorded along with pertinent environmental


Resistivity readings and identifications for significant factors. The best way to
Gauges record this information is in one file with one line or row (on a
spreadsheet) of information in fixed fields for each check standard
measurement. A list of typical entries follows.
1. Month
2. Day
3. Year
4. Operator identification
5. Check standard identification
6. Gauge identification
7. Average of J repetitions
8. Short-term standard deviation from J repetitions
9. Degrees of freedom
10. Environmental readings (if pertinent)


2. Measurement Process Characterization
2.4. Gauge R & R studies

2.4.4. Analysis of variability

Analysis of variability from a nested design
The purpose of this section is to show the effect of various levels of time-dependent effects on the variability of the measurement process, with standard deviations for each level of a 3-level nested design.
● Level 1 - repeatability/short-term precision
● Level 2 - reproducibility/day-to-day
● Level 3 - stability/run-to-run
The graph below depicts possible scenarios for a 2-level design (short-term repetitions and days) to illustrate the concepts.

Depiction of 2 measurement processes with the same short-term variability over 6 days, where process 1 has large between-day variability and process 2 has negligible between-day variability
(Graph: Process 1, large between-day variability; Process 2, small between-day variability. Distributions of short-term measurements over 6 days; distances from the centerlines illustrate between-day variability.)

Hint on using tabular method of analysis
An easy way to begin is with a 2-level table with J columns and K rows for the repeatability/reproducibility measurements and proceed as follows:
1. Compute an average for each row and put it in the J+1 column.
2. Compute the level-1 (repeatability) standard deviation for each row and put it in the J+2 column.
3. Compute the grand average and the level-2 standard deviation from data in the J+1 column.
4. Repeat the table for each of the L runs.
5. Compute the level-3 standard deviation from the L grand averages.

Level-1: LK repeatability standard deviations can be computed from the data
The measurements from the nested design are denoted by Y(l,k,j), for run l, day k, and repetition j. Equations corresponding to the tabular analysis are shown below. Level-1 repeatability standard deviations, s1lk, are pooled over the K days and L runs. Individual standard deviations with (J - 1) degrees of freedom each are computed from the J repetitions for a given day and run.

Level-2: L reproducibility standard deviations can be computed from the data
The level-2 standard deviation, s2l, is pooled over the L runs. Individual standard deviations with (K - 1) degrees of freedom each are computed from the K daily averages of a given run.
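In symbols (a reconstruction consistent with the degrees of freedom quoted above and with the Dataplot computations in the following subsections; Y_{lkj} is the jth repetition on the kth day of the lth run):

\[
s_{1lk}=\sqrt{\frac{1}{J-1}\sum_{j=1}^{J}\bigl(Y_{lkj}-\bar{Y}_{lk\cdot}\bigr)^{2}},\qquad
s_{1}=\sqrt{\frac{1}{LK}\sum_{l=1}^{L}\sum_{k=1}^{K}s_{1lk}^{2}}\;\;(\nu=LK(J-1))
\]
\[
s_{2l}=\sqrt{\frac{1}{K-1}\sum_{k=1}^{K}\bigl(\bar{Y}_{lk\cdot}-\bar{Y}_{l\cdot\cdot}\bigr)^{2}},\qquad
s_{2}=\sqrt{\frac{1}{L}\sum_{l=1}^{L}s_{2l}^{2}}\;\;(\nu=L(K-1))
\]
\[
s_{3}=\sqrt{\frac{1}{L-1}\sum_{l=1}^{L}\bigl(\bar{Y}_{l\cdot\cdot}-\bar{Y}_{\cdot\cdot\cdot}\bigr)^{2}}\;\;(\nu=L-1),\qquad
s_{R}=\sqrt{s_{\mathrm{runs}}^{2}+s_{\mathrm{days}}^{2}+s_{1}^{2}}
\]

with \(s_{\mathrm{days}}^{2}=s_{2}^{2}-s_{1}^{2}/J\) and \(s_{\mathrm{runs}}^{2}=s_{3}^{2}-s_{2}^{2}/K\).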


Level-3: A single global standard deviation can be computed from the L-run averages
A level-3 standard deviation with (L - 1) degrees of freedom is computed from the L-run averages.

Case study: Resistivity probes

Relationship to uncertainty for a test item
The standard deviation that defines the uncertainty for a single measurement on a test item is given by combining the run-to-run, day-to-day, and repeatability components (see the formulas above), where the pooled values, s1 and s2, are the usual pooled standard deviations. There may be other sources of uncertainty in the measurement process that must be accounted for in a formal analysis of uncertainty.

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.4. Analysis of variability

2.4.4.1. Analysis of repeatability
The repeatability quantifies the basic precision for the gauge. A level-1 repeatability standard deviation is computed for each group of J repetitions, and a graphical analysis is recommended for deciding if repeatability is dependent on the check standard, the operator, or the gauge. Two graphs are recommended. These should show:
● Plot of repeatability standard deviations versus check standard with day coded
● Plot of repeatability standard deviations versus check standard with gauge coded
Typically, we expect the standard deviation to be gauge dependent -- in which case there should be a separate standard deviation for each gauge. If the gauges are all at the same level of precision, the values can be combined over all gauges.

Repeatability standard deviations can be pooled over operators, runs, and check standards
A repeatability standard deviation from J repetitions is not a reliable estimate of the precision of the gauge. Fortunately, these standard deviations can be pooled over days, runs, and check standards, if appropriate, to produce a more reliable precision measure. The table below shows a mechanism for pooling. The pooled repeatability standard deviation, s1, has LK(J - 1) degrees of freedom for measurements taken over:
● J repetitions
● K days
● L runs

Basic pooling rules
The table below gives the mechanism for pooling repeatability standard deviations over days and runs. The pooled value is an average of weighted variances and is shown as the last entry in the right-hand column of the table. The pooling can also cover check standards, if appropriate.


View of entire dataset from the nested design
To illustrate the calculations, a subset of data collected in a nested design for one check standard (#140) and one probe (#2362) is shown below. The measurements are resistivity (ohm.cm) readings with six repetitions per day. The individual level-1 standard deviations from the six repetitions and degrees of freedom are recorded in the last two columns of the database.

Run Wafer Probe Month Day Op Temp  Average  Stddev df
 1   140  2362    3    15  1 23.08 96.0771  0.1024  5
 1   140  2362    3    17  1 23.00 95.9976  0.0943  5
 1   140  2362    3    18  1 23.01 96.0148  0.0622  5
 1   140  2362    3    22  1 23.27 96.0397  0.0702  5
 1   140  2362    3    23  2 23.24 96.0407  0.0627  5
 1   140  2362    3    24  2 23.13 96.0445  0.0622  5
 2   140  2362    4    12  1 22.88 96.0793  0.0996  5
 2   140  2362    4    18  2 22.76 96.1115  0.0533  5
 2   140  2362    4    19  2 22.79 96.0803  0.0364  5
 2   140  2362    4    19  1 22.71 96.0411  0.0768  5
 2   140  2362    4    20  2 22.84 96.0988  0.1042  5
 2   140  2362    4    21  1 22.94 96.0482  0.0868  5

Pooled repeatability standard deviations over days, runs

Source of Variability   Degrees of Freedom   Standard Deviations   Sum of Squares (SS)
Probe 2362
run 1 - day 1                   5                  0.1024               0.05243
run 1 - day 2                   5                  0.0943               0.04446
run 1 - day 3                   5                  0.0622               0.01934
run 1 - day 4                   5                  0.0702               0.02464
run 1 - day 5                   5                  0.0627               0.01966
run 1 - day 6                   5                  0.0622               0.01934
run 2 - day 1                   5                  0.0996               0.04960
run 2 - day 2                   5                  0.0533               0.01420
run 2 - day 3                   5                  0.0364               0.00662
run 2 - day 4                   5                  0.0768               0.02949
run 2 - day 5                   5                  0.1042               0.05429
run 2 - day 6                   5                  0.0868               0.03767
                               60                                       0.37176
               (total degrees of freedom for s1)          (total sum of squares for s1)

The pooled value of s1 is given by s1 = (SS/v)**0.5 = 0.07871.

Run software macro for pooling standard deviations
The Dataplot commands (corresponding to the calculations in the table above)

dimension 500 30
read mpc411.dat run wafer probe month day op temp avg s1i vi
let ssi=vi*s1i*s1i
let ss=sum ssi
let v = sum vi
let s1 = (ss/v)**0.5
print s1 v

return the following pooled values for the repeatability standard deviation and degrees of freedom.

PARAMETERS AND CONSTANTS--

S1 -- 0.7871435E-01
V  -- 0.6000000E+02
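An equivalent Python sketch of the pooling performed by the Dataplot macro above, using the per-day standard deviations listed in the table (not part of the Handbook):

import numpy as np

# Per-day repeatability standard deviations (probe #2362, check standard #140; 5 df each)
s1i = np.array([0.1024, 0.0943, 0.0622, 0.0702, 0.0627, 0.0622,
                0.0996, 0.0533, 0.0364, 0.0768, 0.1042, 0.0868])
vi = np.full(s1i.size, 5)          # degrees of freedom for each standard deviation

ss = np.sum(vi * s1i**2)           # total sum of squares (~0.37176)
v = vi.sum()                       # total degrees of freedom (60)
s1 = np.sqrt(ss / v)               # pooled repeatability standard deviation
print(round(s1, 5), v)             # ~0.07871 with 60 degrees of freedom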


2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.4. Analysis of variability

2.4.4.2. Analysis of reproducibility

Case study: Resistivity gauges
Day-to-day variability can be assessed by a graph of check standard values (averaged over J repetitions) versus day with a separate graph for each check standard. Graphs for all check standards should be plotted on the same page to obtain an overall view of the measurement situation.

Pooling results in more reliable estimates
The level-2 standard deviations with (K - 1) degrees of freedom are computed from the check standard values for days and pooled over runs as shown in the table below. The pooled level-2 standard deviation has degrees of freedom L(K - 1) for measurements made over:
● K days
● L runs

Mechanism for pooling
The table below gives the mechanism for pooling level-2 standard deviations over runs. The pooled value is an average of weighted variances and is the last entry in the right-hand column of the table. The pooling can be extended in the same manner to cover check standards, if appropriate.

Level-2 standard deviations for a single gauge pooled over runs

Source of variability   Standard deviations   Degrees of freedom   Sum of squares (SS)
Days
Run 1                        0.027280                 5                 0.003721
Run 2                        0.027560                 5                 0.003798
                                                 -------            -------------
                                                      10                 0.007519
Pooled value                 0.02742

Run software macro for computing level-2 standard deviations and pooling over runs
A subset of data (shown on the previous page) collected in a nested design on one check standard (#140) with probe (#2362) on six days is analyzed for between-day effects. Dataplot commands to compute the level-2 standard deviations and pool over runs 1 and 2 are:

read mpc441.dat run wafer probe mo day op temp y s df
let n1 = count y subset run 1
let df1 = n1 - 1
let n2 = count y subset run 2
let df2 = n2 - 1
let v2 = df1 + df2
let s2run1 = standard deviation y subset run 1
let s2run2 = standard deviation y subset run 2
let s2 = df1*(s2run1)**2 + df2*(s2run2)**2
let s2 = (s2/v2)**.5
print s2run1 df1
print s2run2 df2
print s2 v2

Dataplot output
Dataplot returns the following level-2 standard deviations and degrees of freedom:

PARAMETERS AND CONSTANTS--
S2RUN1 -- 0.2728125E-01
DF1    -- 0.5000000E+01

PARAMETERS AND CONSTANTS--
S2RUN2 -- 0.2756367E-01
DF2    -- 0.5000000E+01

PARAMETERS AND CONSTANTS--
S2     -- 0.2742282E-01
v2     -- 0.1000000E+02


Relationship to day effect
The level-2 standard deviation is related to the standard deviation for between-day precision and gauge precision by

s2^2 = sdays^2 + s1^2/J

The size of the day effect can be calculated by subtraction using the formula above once the other two standard deviations have been estimated reliably.

Computation of component for days
The Dataplot commands:

let J = 6
let varday = s2**2 - (s1**2)/J

return the following value for the variance for days:

THE COMPUTED VALUE OF THE CONSTANT VARDAY = -0.2880149E-03

The negative number for the variance is interpreted as meaning that the variance component for days is zero. However, with only 10 degrees of freedom for the level-2 standard deviation, this estimate is not necessarily reliable. The standard deviation for days over the entire database shows a significant component for days.

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.4. Analysis of variability

2.4.4.3. Analysis of stability

Case study: Resistivity probes
Run-to-run variability can be assessed graphically by a plot of check standard values (averaged over J repetitions) versus time with a separate graph for each check standard. Data on all check standards should be plotted on one page to obtain an overall view of the measurement situation.

Advantage of pooling
A level-3 standard deviation with (L - 1) degrees of freedom is computed from the run averages. Because there will rarely be more than 2 runs per check standard, resulting in 1 degree of freedom per check standard, it is prudent to have three or more check standards in the design in order to take advantage of pooling. The mechanism for pooling over check standards is shown in the table below. The pooled standard deviation has Q(L - 1) degrees of freedom and is shown as the last entry in the right-hand column of the table.

Example of Level-3 standard deviations for a single gauge pooled over check
pooling standards
Source of Standard Degrees of freedom Sum of squares
variability deviation (DF) (SS)
Level-3

Chk std 138 0.0223 1 0.0004973

Chk std 139 0.0027 1 0.0000073

Chk std 140 0.0289 1 0.0008352

Chk std 141 0.0133 1 0.0001769

Chk std 142 0.0205 1 0.0004203


-------------- -----------
Sum 5 0.0019370


Pooled value 0.0197

Run A subset of data collected in a nested design on one check standard (#140) with
software probe (#2362) for six days and two runs is analyzed for between-run effects.
macro for Dataplot commands to compute the level-3 standard deviation from the
computing averages of 2 runs are:
level-3
standard
deviation dimension 30 columns
read mpc441.dat run wafer probe mo ...
day op temp y s df
let y1 = average y subset run 1
let y2 = average y subset run 2
let ybar = (y1 + y2)/2
let ss = (y1-ybar)**2 + (y2-ybar)**2
let v3 = 1
let s3 = (ss/v3)**.5
print s3 v3
Dataplot Dataplot returns the level-3 standard deviation and degrees of freedom:
output

PARAMETERS AND CONSTANTS--

S3 -- 0.2885137E-01
V3 -- 0.1000000E+01
Relationship to long-term changes, days and gauge precision
The size of the between-run effect can be calculated by subtraction using the standard deviations for days and gauge precision as

sruns^2 = s3^2 - sdays^2/K - s1^2/(J K)


2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.4. Analysis of variability
2.4.4.4.

2.4.4.4.4. Example of calculations

Example of repeatability calculations
Short-term standard deviations based on
● J = 6 repetitions with 5 degrees of freedom
● K = 6 days
● L = 2 runs
were recorded with a probing instrument on Q = 5 wafers. The standard deviations were pooled over K = 6 days and L = 2 runs to give 60 degrees of freedom for each wafer. The pooling of repeatability standard deviations over the 5 wafers is demonstrated in the table below.

Pooled repeatability standard deviation for a single gauge

Source of variability   Sum of Squares (SS)   Degrees of freedom (DF)   Std Devs
Repeatability
Wafer #138                    0.48115                   60
Wafer #139                    0.69209                   60
Wafer #140                    0.48483                   60
Wafer #141                    1.21752                   60
Wafer #142                    0.30076                   60

SUM                           3.17635                  300               0.10290
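A minimal Python sketch of the pooling in the table above (sum the sums of squares, sum the degrees of freedom, and take the square root of their ratio; not part of the Handbook):

import numpy as np

# Sums of squares (SS) and degrees of freedom for each wafer, from the table above
ss = np.array([0.48115, 0.69209, 0.48483, 1.21752, 0.30076])
df = np.full(ss.size, 60)

s1_pooled = np.sqrt(ss.sum() / df.sum())   # sqrt(3.17635 / 300)
print(round(s1_pooled, 5))                 # ~0.10290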


2. Measurement Process Characterization
2.4. Gauge R & R studies

2.4.5. Analysis of bias

Definition of bias
The terms 'bias' and 'systematic error' have the same meaning in this handbook. Bias is defined (VIM) as the difference between the measurement result and its unknown 'true value'. It can often be estimated and/or eliminated by calibration to a reference standard.

Potential problem
Calibration relates output to 'true value' in an ideal environment. However, it may not assure that the gauge reacts properly in its working environment. Temperature, humidity, operator, wear, and other factors can introduce bias into the measurements. There is no single method for dealing with this problem, but the gauge study is intended to uncover biases in the measurement process.

Sources of bias
Sources of bias that are discussed in this Handbook include:
● Lack of gauge resolution
● Lack of linearity
● Drift
● Hysteresis
● Differences among gauges
● Differences among geometries
● Differences among operators
● Remedial actions and strategies

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.5. Analysis of bias

2.4.5.1. Resolution

Resolution
Resolution (MSA) is the ability of the measurement system to detect and faithfully indicate small changes in the characteristic of the measurement result.

Definition from (MSA) manual
The resolution of the instrument is δ if there is an equal probability that the indicated value of any artifact, which differs from a reference standard by less than δ, will be the same as the indicated value of the reference.

Good versus poor
A small δ implies good resolution -- the measurement system can discriminate between artifacts that are close together in value.
A large δ implies poor resolution -- the measurement system can only discriminate between artifacts that are far apart in value.

Warning
The number of digits displayed does not indicate the resolution of the instrument.

Manufacturer's statement of resolution
Resolution as stated in the manufacturer's specifications is usually a function of the least-significant digit (LSD) of the instrument and other factors such as timing mechanisms. This value should be checked in the laboratory under actual conditions of measurement.

Experimental determination of resolution
To make a determination in the laboratory, select several artifacts with known values over a range from close in value to far apart. Start with the two artifacts that are farthest apart and make measurements on each artifact. Then, measure the two artifacts with the second largest difference, and so forth, until two artifacts are found which repeatedly give the same result. The difference between the values of these two artifacts estimates the resolution.


Consequence of No useful information can be gained from a study on a gauge with


poor resolution poor resolution relative to measurement needs.

2. Measurement Process Characterization


2.4. Gauge R & R studies
2.4.5. Analysis of bias

2.4.5.2. Linearity of the gauge


Definition of Linearity is given a narrow interpretation in this Handbook to indicate
linearity for that gauge response increases in equal increments to equal increments
gauge studies of stimulus, or, if the gauge is biased, that the bias remains constant
throughout the course of the measurement process.

Data A determination of linearity requires Q (Q > 4) reference standards


collection that cover the range of interest in fairly equal increments and J (J > 1)
and measurements on each reference standard. One measurement is made
repetitions on each of the reference standards, and the process is repeated J times.

Plot of the A test of linearity starts with a plot of the measured values versus
data corresponding values of the reference standards to obtain an indication
of whether or not the points fall on a straight line with slope equal to 1
-- indicating linearity.

Least-squares A least-squares fit of the data to the model


estimates of
bias and
slope Y = a + bX + measurement error
where Y is the measurement result and X is the value of the reference
standard, produces an estimate of the intercept, a, and the slope, b.

Output from The intercept and bias are estimated using a statistical software
software package that should provide the following information:
package ● Estimates of the intercept and slope,
● Standard deviations of the intercept and slope
● Residual standard deviation of the fit
● F-test for goodness of fit
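As an illustration of the least-squares fit and the output described above, here is a hedged Python sketch; the X and Y arrays are hypothetical placeholders, and the slope and intercept checks anticipate the tests described in the next paragraph:

import numpy as np
from scipy import stats

# Hypothetical linearity data: X = reference standard values, Y = gauge readings
# (Q = 5 reference standards measured in J = 3 passes).
X = np.tile([1.0, 2.0, 4.0, 6.0, 8.0], 3)
Y = np.array([1.02, 2.01, 4.03, 6.05, 8.07,
              0.99, 2.03, 4.04, 6.04, 8.09,
              1.01, 2.02, 4.02, 6.06, 8.08])

res = stats.linregress(X, Y)                          # least-squares fit Y = a + bX
t_slope = (res.slope - 1.0) / res.stderr              # test H0: slope = 1 (linearity)
t_intercept = res.intercept / res.intercept_stderr    # test H0: intercept = 0 (no bias)
print(res.intercept, res.slope, t_slope, t_intercept)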


Test for linearity
Tests for the slope and bias are described in the section on instrument calibration. If the slope is different from one, the gauge is non-linear and requires calibration or repair. If the intercept is different from zero, the gauge has a bias.

Causes of non-linearity
The reference manual on Measurement Systems Analysis (MSA) lists possible causes of gauge non-linearity that should be investigated if the gauge shows symptoms of non-linearity.
1. Gauge not properly calibrated at the lower and upper ends of the operating range
2. Error in the value of X at the maximum or minimum range
3. Worn gauge
4. Internal design problems (electronics)

Note - on artifact calibration
The requirement of linearity for artifact calibration is not so stringent. Where the gauge is used as a comparator for measuring small differences among test items and reference standards of the same nominal size, as with calibration designs, the only requirement is that the gauge be linear over the small on-scale range needed to measure both the reference standard and the test item.

Situation where the calibration of the gauge is neglected
Sometimes it is not economically feasible to correct for the calibration of the gauge (Turgel and Vecchia). In this case, the bias that is incurred by neglecting the calibration is estimated as a component of uncertainty.

2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.5. Analysis of bias

2.4.5.3. Drift

Definition
Drift can be defined (VIM) as a slow change in the response of a gauge.

Instruments used as comparators for calibration
Short-term drift can be a problem for comparator measurements. The cause is frequently heat build-up in the instrument during the time of measurement. It would be difficult, and probably unproductive, to try to pinpoint the extent of such drift with a gauge study. The simplest solution is to use drift-free designs for collecting calibration data. These designs mitigate the effect of linear drift on the results.

Long-term drift should not be a problem for comparator measurements because such drift would be constant during a calibration design and would cancel in the difference measurements.

Instruments corrected by linear calibration
For instruments whose readings are corrected by a linear calibration line, drift can be detected using a control chart technique and measurements on three or more check standards.

Drift in direct reading instruments and uncertainty analysis
For other instruments, measurements can be made on a daily basis on two or more check standards over a preset time period, say, one month. These measurements are plotted on a time scale to determine the extent and nature of any drift. Drift rarely continues unabated at the same rate and in the same direction for a long time period.

Thus, the expectation from such an experiment is to document the maximum change that is likely to occur during a set time period and plan adjustments to the instrument accordingly. A further impact of the findings is that uncorrected drift is treated as a type A component in the uncertainty analysis.


2. Measurement Process Characterization
2.4. Gauge R & R studies
2.4.5. Analysis of bias

2.4.5.4. Differences among gauges

Purpose
A gauge study should address whether gauges agree with one another and whether the agreement (or disagreement) is consistent over artifacts and time.

Data collection
For each gauge in the study, the analysis requires measurements on
● Q (Q > 2) check standards
● K (K > 2) days
The measurements should be made by a single operator.

Data reduction
The steps in the analysis are (a short Python sketch of steps 1-3 appears at the end of this page):
1. Measurements are averaged over days by artifact/gauge configuration.
2. For each artifact, an average is computed over gauges.
3. Differences from this average are then computed for each gauge.
4. If the design is run as a 3-level design, the statistics are computed separately for each run.

Data from a gauge study
The data in the table below come from resistivity (ohm.cm) measurements on Q = 5 artifacts on K = 6 days. Two runs were made which were separated by about a month's time. The artifacts are silicon wafers and the gauges are four-point probes specifically designed for measuring resistivity of silicon wafers. Differences from the wafer means are shown in the table.

Biases for 5 probes from a gauge study with 5 artifacts on 6 days

Table of biases for probes and silicon wafers (ohm.cm)
                               Wafers
Probe      138        139        140        141        142
---------------------------------------------------------
   1     0.02476   -0.00356    0.04002    0.03938    0.00620
 181     0.01076    0.03944    0.01871   -0.01072    0.03761
 182     0.01926    0.00574   -0.02008    0.02458   -0.00439
2062    -0.01754   -0.03226   -0.01258   -0.02802   -0.00110
2362    -0.03725   -0.00936   -0.02608   -0.02522   -0.03830

Plot of differences among probes
A graphical analysis can be more effective for detecting differences among gauges than a table of differences. The differences are plotted versus artifact identification with each gauge identified by a separate plotting symbol. For ease of interpretation, the symbols for any one gauge can be connected by dotted lines.

Interpretation
Because the plots show differences from the average by artifact, the center line is the zero-line, and the differences are estimates of bias. Gauges that are consistently above or below the other gauges are biased high or low, respectively, relative to the average. The best estimate of bias for a particular gauge is its average bias over the Q artifacts. For this data set, notice that probe #2362 is consistently biased low relative to the other probes.

Strategies for dealing with differences among gauges
Given that the gauges are a random sample of like-kind gauges, the best estimate in any situation is an average over all gauges. In the usual production or metrology setting, however, it may only be feasible to make the measurements on a particular piece with one gauge. Then, there are two methods of dealing with the differences among gauges.
1. Correct each measurement made with a particular gauge for the bias of that gauge and report the standard deviation of the correction as a type A uncertainty.
2. Report each measurement as it occurs and assess a type A uncertainty for the differences among the gauges.



2.4.5.5. Geometry/configuration differences

How to deal with configuration differences   The mechanism for identifying and/or dealing with differences among geometries or configurations in an instrument is basically the same as dealing with differences among the gauges themselves.

Example of differences among wiring configurations   An example is given of a study of configuration differences for a single gauge. The gauge, a 4-point probe for measuring resistivity of silicon wafers, can be wired in several ways. Because it was not possible to test all wiring configurations during the gauge study, measurements were made in only two configurations as a way of identifying possible problems.

Data on wiring configurations and a plot of differences between the 2 wiring configurations   Measurements were made on six wafers over six days (except for 5 measurements on wafer 39) with probe #2062 wired in two configurations. This sequence of measurements was repeated after about a month, resulting in two runs. Differences between measurements in the two configurations on the same day are shown in the following table.

Differences between wiring configurations

 Wafer   Day   Probe     Run 1     Run 2
   17.    1    2062.   -0.0108    0.0088
   17.    2    2062.   -0.0111    0.0062
   17.    3    2062.   -0.0062    0.0074
   17.    4    2062.    0.0020    0.0047
   17.    5    2062.    0.0018    0.0049
   17.    6    2062.    0.0002    0.0000
   39.    1    2062.   -0.0089    0.0075
   39.    3    2062.   -0.0040   -0.0016
   39.    4    2062.   -0.0022    0.0052
   39.    5    2062.   -0.0012    0.0085
   39.    6    2062.   -0.0034   -0.0018
   63.    1    2062.   -0.0016    0.0092
   63.    2    2062.   -0.0111    0.0040
   63.    3    2062.   -0.0059    0.0067
   63.    4    2062.   -0.0078    0.0016
   63.    5    2062.   -0.0007    0.0020
   63.    6    2062.    0.0006    0.0017
  103.    1    2062.   -0.0050    0.0076
  103.    2    2062.   -0.0140    0.0002
  103.    3    2062.   -0.0048    0.0025
  103.    4    2062.    0.0018    0.0045
  103.    5    2062.    0.0016   -0.0025
  103.    6    2062.    0.0044    0.0035
  125.    1    2062.   -0.0056    0.0099
  125.    2    2062.   -0.0155    0.0123
  125.    3    2062.   -0.0010    0.0042
  125.    4    2062.   -0.0014    0.0098
  125.    5    2062.    0.0003    0.0032
  125.    6    2062.   -0.0017    0.0115

Test of difference between configurations   Because there are only two configurations, a t-test is used to decide if there is a difference. If

    t = \sqrt{N} \, |\bar{d}| \, / \, s_d

where \bar{d} and s_d are the average and standard deviation of the N differences, exceeds the critical value from the t table, the difference between the two configurations is statistically significant. The average and standard deviation computed from the 29 differences in each run are shown in the table below along with the t-values, which confirm that the differences are significant for both runs.

Average differences between wiring configurations

 Run   Probe     Average    Std dev    N      t
   1    2062    -0.00383    0.00514   29   -4.0
   2    2062    +0.00489    0.00400   29   +6.6

Unexpected result   The data reveal a wiring bias for both runs that changes direction between runs. This is a somewhat disturbing finding, and further study of the gauges is needed. Because neither wiring configuration is preferred or known to give the 'correct' result, the differences are treated as a component of the measurement uncertainty.
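A minimal sketch (not from the Handbook) of this test for the run-1 differences listed above; it reproduces the average, standard deviation, and t-value reported in the summary table.

    # Sketch: one-sample t statistic for the run-1 configuration differences.
    import math
    import statistics

    run1 = [-0.0108, -0.0111, -0.0062,  0.0020,  0.0018,  0.0002,
            -0.0089, -0.0040, -0.0022, -0.0012, -0.0034,
            -0.0016, -0.0111, -0.0059, -0.0078, -0.0007,  0.0006,
            -0.0050, -0.0140, -0.0048,  0.0018,  0.0016,  0.0044,
            -0.0056, -0.0155, -0.0010, -0.0014,  0.0003, -0.0017]

    n = len(run1)                      # 29 differences
    dbar = statistics.mean(run1)       # average difference, about -0.00383
    s_d = statistics.stdev(run1)       # standard deviation, about 0.00514
    t = math.sqrt(n) * dbar / s_d      # about -4.0, so the wiring bias is significant
    print(f"N = {n}, average = {dbar:.5f}, std dev = {s_d:.5f}, t = {t:.1f}")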




2.4.5.6. Remedial actions and strategies


Variability The variability of the gauge in its normal operating mode needs to be
examined in light of measurement requirements.
If the standard deviation is too large, relative to requirements, the
uncertainty can be reduced by making repeated measurements and
taking advantage of the standard deviation of the average (which is
reduced by a factor of $1/\sqrt{n}$ when n measurements are averaged).

Causes of excess variability   If multiple measurements are not economically feasible in the workload, then the performance of the gauge must be improved. Causes of variability which should be examined are:
● Wear

● Environmental effects such as humidity

● Temperature excursions

● Operator technique

Resolution There is no remedy for a gauge with insufficient resolution. The gauge
will need to be replaced with a better gauge.

Lack of linearity   Lack of linearity can be dealt with by correcting the output of the gauge to account for bias that is dependent on the level of the stimulus.
Lack of linearity can be tolerated (left uncorrected) if it does not
increase the uncertainty of the measurement result beyond its
requirement.

Drift It would be very difficult to correct a gauge for drift unless there is
sufficient history to document the direction and size of the drift. Drift
can be tolerated if it does not increase the uncertainty of the
measurement result beyond its requirement.



Differences among gauges or configurations   Significant differences among gauges/configurations can be treated in one of two ways:
1. By correcting each measurement for the bias of the specific gauge/configuration.
2. By accepting the difference as part of the uncertainty of the measurement process.

Differences among operators   Differences among operators can be viewed in the same way as differences among gauges. However, an operator who is incapable of making measurements to the required precision because of an untreatable condition, such as a vision problem, should be re-assigned to other tasks.

2.4.6. Quantifying uncertainties from a gauge study

Gauge studies can be used as the basis for uncertainty assessment   One reason for conducting a gauge study is to quantify uncertainties in the measurement process that would be difficult to quantify under conditions of actual measurement.

This is a reasonable approach to take if the results are truly representative of the measurement process in its working environment. Consideration should be given to all sources of error, particularly those sources of error which do not exhibit themselves in the short-term run.

Potential problem with this approach   The potential problem with this approach is that the calculation of uncertainty depends totally on the gauge study. If the measurement process changes its characteristics over time, the standard deviation from the gauge study will not be the correct standard deviation for the uncertainty analysis. One way to try to avoid such a problem is to carry out a gauge study both before and after the measurements that are being characterized for uncertainty. The 'before' and 'after' results should indicate whether or not the measurement process changed in the interim.

Uncertainty analysis requires information about the specific measurement   The computation of uncertainty depends on the particular measurement that is of interest. The gauge study gathers the data and estimates standard deviations for sources that contribute to the uncertainty of the measurement result. However, specific formulas are needed to relate these standard deviations to the standard deviation of a measurement result.



General guidance   The following sections outline the general approach to uncertainty analysis and give methods for combining the standard deviations into a final uncertainty:
1. Approach
2. Methods for type A evaluations
3. Methods for type B evaluations
4. Propagation of error
5. Error budgets and sensitivity coefficients
6. Standard and expanded uncertainties
7. Treatment of uncorrected biases

Type A evaluations of random error   Data collection methods and analyses of random sources of uncertainty are given for the following:
1. Repeatability of the gauge
2. Reproducibility of the measurement process
3. Stability (very long-term) of the measurement process

Biases - Rule of thumb   The approach for biases is to estimate the maximum bias from a gauge study and compute a standard uncertainty from the maximum bias assuming a suitable distribution. The formulas shown below assume a uniform distribution for each bias.

Determining resolution   If the resolution of the gauge is $\delta$, the standard uncertainty for resolution is computed from $\delta$ using the uniform-distribution rule of thumb above.

Determining non-linearity   If the maximum departure from linearity for the gauge has been determined from a gauge study, and it is reasonable to assume that the gauge is equally likely to be engaged at any point within the range tested, the standard uncertainty for linearity is

    u_{linearity} = \max|\mathrm{departure}| \, / \, \sqrt{3}

Hysteresis   Hysteresis, as a performance specification, is defined (NCSL RP-12) as the maximum difference between the upscale and downscale readings on the same artifact during a full range traverse in each direction. The standard uncertainty for hysteresis is computed from this maximum difference using the same rule of thumb.

Determining drift   Drift in direct reading instruments is defined for a specific time interval of interest. The standard uncertainty for drift is computed from the change $Y_0 - Y_t$, where $Y_0$ and $Y_t$ are measurements at time zero and t, respectively, again using the uniform-distribution rule of thumb.

Other biases   Other sources of bias are discussed as follows:
1. Differences among gauges
2. Differences among configurations

Case study: Type A uncertainties from a gauge study   A case study on type A uncertainty analysis from a gauge study is recommended as a guide for bringing together the principles and elements discussed in this section. The study in question characterizes the uncertainty of resistivity measurements made on silicon wafers.
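A minimal sketch of the rule of thumb stated above: a bias known only to lie within ± a (a = maximum bias) and assumed uniform has standard uncertainty a/sqrt(3). The numbers below are hypothetical, and how a is chosen for quantities such as resolution or hysteresis (full width versus half width) is an assumption to settle for each gauge, not a statement from the Handbook.

    # Sketch: standard uncertainty of a bias assumed uniform on [-a, +a].
    import math

    def uniform_standard_uncertainty(max_bias):
        """Standard deviation of a uniform distribution on [-max_bias, +max_bias]."""
        return abs(max_bias) / math.sqrt(3)

    print(uniform_standard_uncertainty(0.004))   # e.g. maximum departure from linearity
    print(uniform_standard_uncertainty(0.010))   # e.g. maximum drift observed over the interval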



2.5. Uncertainty analysis

Uncertainty measures 'goodness' of a test result   This section discusses the uncertainty of measurement results. Uncertainty is a measure of the 'goodness' of a result. Without such a measure, it is impossible to judge the fitness of the value as a basis for making decisions relating to health, safety, commerce or scientific excellence.

Contents
1. What are the issues for uncertainty analysis?
2. Approach to uncertainty analysis
   1. Steps
3. Type A evaluations
   1. Type A evaluations of random error
      1. Time-dependent components
      2. Measurement configurations
   2. Type A evaluations of material inhomogeneities
      1. Data collection and analysis
   3. Type A evaluations of bias
      1. Treatment of inconsistent bias
      2. Treatment of consistent bias
      3. Treatment of bias with sparse data
4. Type B evaluations
   1. Assumed distributions
5. Propagation of error considerations
   1. Functions of a single variable
   2. Functions of two variables
   3. Functions of several variables
6. Error budgets and sensitivity coefficients
   1. Sensitivity coefficients for measurements on the test item
   2. Sensitivity coefficients for measurements on a check standard
   3. Sensitivity coefficients for measurements with a 2-level design
   4. Sensitivity coefficients for measurements with a 3-level design
   5. Example of error budget
7. Standard and expanded uncertainties
   1. Degrees of freedom
8. Treatment of uncorrected bias
   1. Computation of revised uncertainty



2.5.1. Issues

Issues for uncertainty analysis   Evaluation of uncertainty is an ongoing process that can consume time and resources. It can also require the services of someone who is familiar with data analysis techniques, particularly statistical analysis. Therefore, it is important for laboratory personnel who are approaching uncertainty analysis for the first time to be aware of the resources required and to carefully lay out a plan for data collection and analysis.

Problem areas   Some laboratories, such as test laboratories, may not have the resources to undertake detailed uncertainty analyses even though, increasingly, quality management standards such as the ISO 9000 series are requiring that all measurement results be accompanied by statements of uncertainty.

Other situations where uncertainty analyses are problematical are:
● One-of-a-kind measurements
● Dynamic measurements that depend strongly on the application for the measurement

Directions being pursued   What can be done in these situations? There is no definitive answer at this time. Several organizations, such as the National Conference of Standards Laboratories (NCSL) and the International Standards Organization (ISO), are investigating methods for dealing with this problem, and there is a document in draft that will recommend a simplified approach to uncertainty analysis based on results of interlaboratory tests.

Relationship to interlaboratory test results   Many laboratories or industries participate in interlaboratory studies where the test method itself is evaluated for:
● repeatability within laboratories
● reproducibility across laboratories
These evaluations do not lead to uncertainty statements because the purpose of the interlaboratory test is to evaluate, and then improve, the test method as it is applied across the industry. The purpose of uncertainty analysis is to evaluate the result of a particular measurement, in a particular laboratory, at a particular time. However, the two purposes are related.

Default recommendation for test laboratories   If a test laboratory has been party to an interlaboratory test that follows the recommendations and analyses of an American Society for Testing Materials standard (ASTM E691) or an ISO standard (ISO 5725), the laboratory can, as a default, represent its standard uncertainty for a single measurement as the reproducibility standard deviation as defined in ASTM E691 and ISO 5725. This standard deviation includes components for within-laboratory repeatability common to all laboratories and between-laboratory variation.

Drawbacks of this procedure   The standard deviation computed in this manner describes a future single measurement made at a laboratory randomly drawn from the group and leads to a prediction interval (Hahn & Meeker) rather than a confidence interval. It is not an ideal solution and may produce either an unrealistically small or unacceptably large uncertainty for a particular laboratory. The procedure can reward laboratories with poor performance or those that do not follow the test procedures to the letter and punish laboratories with good performance. Further, the procedure does not take into account sources of uncertainty other than those captured in the interlaboratory test. Because the interlaboratory test is a snapshot at one point in time, characteristics of the measurement process over time cannot be accurately evaluated. Therefore, it is a strategy to be used only where there is no possibility of conducting a realistic uncertainty investigation.



2.5.2. Approach

Procedures in this chapter   The procedures in this chapter are intended for test laboratories, calibration laboratories, and scientific laboratories that report results of measurements from ongoing or well-documented processes.

Pertinent sections   The following pages outline methods for estimating the individual uncertainty components, which are consistent with materials presented in other sections of this Handbook, and rules and equations for combining them into a final expanded uncertainty. The general framework is:
1. ISO Approach
2. Outline of steps to uncertainty analysis
3. Methods for type A evaluations
4. Methods for type B evaluations
5. Propagation of error considerations
6. Uncertainty budgets and sensitivity coefficients
7. Standard and expanded uncertainties
8. Treatment of uncorrected bias

Specific situations are outlined in other places in this chapter   Methods for calculating uncertainties for specific results are explained in the following sections:
● Calibrated values of artifacts
● Calibrated values from calibration curves
  ❍ From propagation of error
  ❍ From check standard measurements
  ❍ Comparison of check standards and propagation of error
● Gauge R & R studies
● Type A components for resistivity measurements
● Type B components for resistivity measurements

ISO definition of uncertainty   Uncertainty, as defined in the ISO Guide to the Expression of Uncertainty in Measurement (GUM) and the International Vocabulary of Basic and General Terms in Metrology (VIM), is a
"parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand."

Consistent with historical view of uncertainty   This definition is consistent with the well-established concept that an uncertainty statement assigns credible limits to the accuracy of a reported value, stating to what extent that value may differ from its reference value (Eisenhart). In some cases, reference values will be traceable to a national standard, and in certain other cases, reference values will be consensus values based on measurements made according to a specific protocol by a group of laboratories.

Accounts for both random error and bias   The estimation of a possible discrepancy takes into account both random error and bias in the measurement process. The distinction to keep in mind with regard to random error and bias is that random errors cannot be corrected, and biases can, theoretically at least, be corrected or eliminated from the measurement result.

Relationship to precision and bias statements   Precision and bias are properties of a measurement method. Uncertainty is a property of a specific result for a single test item that depends on a specific measurement configuration (laboratory/instrument/operator, etc.). It depends on the repeatability of the instrument; the reproducibility of the result over time; the number of measurements in the test result; and all sources of random and systematic error that could contribute to disagreement between the result and its reference value.

Handbook follows the ISO approach   This Handbook follows the ISO approach (GUM) to stating and combining components of uncertainty. To this basic structure, it adds a statistical framework for estimating individual components, particularly those that are classified as type A uncertainties.



Basic ISO tenets   The ISO approach is based on the following rules:
● Each uncertainty component is quantified by a standard deviation.
● All biases are assumed to be corrected and any uncertainty is the uncertainty of the correction.
● Zero corrections are allowed if the bias cannot be corrected and an uncertainty is assessed.
● All uncertainty intervals are symmetric.

ISO approach to classifying sources of error   Components are grouped into two major categories, depending on the source of the data and not on the type of error, and each component is quantified by a standard deviation. The categories are:
● Type A - components evaluated by statistical methods
● Type B - components evaluated by other means (or in other laboratories)

Interpretation of this classification   One way of interpreting this classification is that it distinguishes between information that comes from sources local to the measurement process and information from other sources -- although this interpretation does not always hold. In the computation of the final uncertainty it makes no difference how the components are classified because the ISO guidelines treat type A and type B evaluations in the same manner.

Rule of quadrature   All uncertainty components (standard deviations) are combined by root-sum-squares (quadrature) to arrive at a 'standard uncertainty', u, which is the standard deviation of the reported value, taking into account all sources of error, both random and systematic, that affect the measurement result.

Expanded uncertainty for a high degree of confidence   If the purpose of the uncertainty statement is to provide coverage with a high level of confidence, an expanded uncertainty is computed as

    U = k u

where k is chosen to be the critical value from the t-table for v degrees of freedom. For large degrees of freedom, it is suggested to use k = 2 to approximate 95% coverage. Details for these calculations are found under degrees of freedom.

Type B evaluations   Type B evaluations apply to random errors and biases for which there is little or no data from the local process, and to random errors and biases from other measurement processes.
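A minimal sketch (not from the Handbook) of the rule of quadrature and the expanded uncertainty; the component values are made up purely for illustration.

    # Sketch: combine standard uncertainties by root-sum-squares, then expand.
    import math

    components = [0.0005, 0.0012, 0.0003]               # standard uncertainties (same units)
    u = math.sqrt(sum(s ** 2 for s in components))      # standard uncertainty (quadrature)

    k = 2.0                                             # coverage factor, ~95% for large degrees of freedom
    U = k * u                                           # expanded uncertainty
    print(f"u = {u:.5f}, U = {U:.5f}")

    # With limited degrees of freedom v, k can instead be taken from the t-table,
    # e.g. k = scipy.stats.t.ppf(0.975, v) if scipy is available.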



2.5.2.1. Steps

Steps in uncertainty analysis - define the result to be reported   The first step in the uncertainty evaluation is the definition of the result to be reported for the test item for which an uncertainty is required. The computation of the standard deviation depends on the number of repetitions on the test item and the range of environmental and operational conditions over which the repetitions were made, in addition to other sources of error, such as calibration uncertainties for reference standards, which influence the final result. If the value for the test item cannot be measured directly, but must be calculated from measurements on secondary quantities, the equation for combining the various quantities must be defined. The steps to be followed in an uncertainty analysis are outlined for two situations:

Outline of steps to be followed in the evaluation of uncertainty for a single quantity   A. Reported value involves measurements on one quantity.
1. Compute a type A standard deviation for random sources of error from:
   ❍ Replicated results for the test item.
   ❍ Measurements on a check standard.
   ❍ Measurements made according to a 2-level designed experiment
   ❍ Measurements made according to a 3-level designed experiment
2. Make sure that the collected data and analysis cover all sources of random error such as:
   ❍ instrument imprecision
   ❍ day-to-day variation
   ❍ long-term variation
   and bias such as:
   ❍ differences among instruments
   ❍ operator differences.
3. Compute a standard deviation for each type B component of uncertainty.
4. Combine type A and type B standard deviations into a standard uncertainty for the reported result using sensitivity factors.
5. Compute an expanded uncertainty.

Outline of steps to be followed in the evaluation of uncertainty involving several secondary quantities   B. Reported value involves more than one quantity.
1. Write down the equation showing the relationship between the quantities.
   ❍ Write out the propagation of error equation and do a preliminary evaluation, if possible, based on propagation of error.
2. If the measurement result can be replicated directly, regardless of the number of secondary quantities in the individual repetitions, treat the uncertainty evaluation as in (A.1) to (A.5) above, being sure to evaluate all sources of random error in the process.
3. If the measurement result cannot be replicated directly, treat each measurement quantity as in (A.1) and (A.2) and:
   ❍ Compute a standard deviation for each measurement quantity.
   ❍ Combine the standard deviations for the individual quantities into a standard deviation for the reported result via propagation of error.
4. Compute a standard deviation for each type B component of uncertainty.
5. Combine type A and type B standard deviations into a standard uncertainty for the reported result.
6. Compute an expanded uncertainty.
7. Compare the uncertainty derived by propagation of error with the uncertainty derived by data analysis techniques.
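For case B above, the propagation-of-error step can be sketched as follows. This is an illustration only: the function, values, and uncertainties are hypothetical, and the first-order formula for independent inputs is used, u_y^2 = (df/dx1)^2 u1^2 + (df/dx2)^2 u2^2.

    # Sketch: first-order propagation of error for a result computed from two quantities.
    import math

    def propagated_uncertainty(x1, u1, x2, u2, f, h=1e-6):
        """Numerically evaluate the first-order propagation-of-error formula."""
        df_dx1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
        df_dx2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
        return math.sqrt((df_dx1 * u1) ** 2 + (df_dx2 * u2) ** 2)

    # Hypothetical example: a result reported as the product of two measured quantities.
    area = lambda length, width: length * width
    u_area = propagated_uncertainty(10.0, 0.02, 5.0, 0.01, area)
    print(f"standard uncertainty of the product: {u_area:.4f}")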




2.5.3. Type A evaluations


Type A evaluations apply to both error and bias   Type A evaluations can apply to both random error and bias. The only requirement is that the calculation of the uncertainty component be based on a statistical analysis of data. The distinction to keep in mind with regard to random error and bias is that:
● random errors cannot be corrected
● biases can, theoretically at least, be corrected or eliminated from the result.

Caveat for biases   The ISO guidelines are based on the assumption that all biases are corrected and that the only uncertainty from this source is the uncertainty of the correction. The section on type A evaluations of bias gives guidance on how to assess, correct and calculate uncertainties related to bias.

Random error and bias require different types of analyses   How the source of error affects the reported value and the context for the uncertainty determines whether an analysis of random error or bias is appropriate.

Consider a laboratory with several instruments that can reasonably be assumed to be representative of all similar instruments. Then the differences among these instruments can be considered to be a random effect if the uncertainty statement is intended to apply to the result of any instrument, selected at random, from this batch.

If, on the other hand, the uncertainty statement is intended to apply to one specific instrument, then the bias of this instrument relative to the group is the component of interest.

The following pages outline methods for type A evaluations of:
1. Random errors
2. Bias



2.5.3.1. Type A evaluations of random components

Type A evaluations of random components   Type A sources of uncertainty fall into three main categories:
1. Uncertainties that reveal themselves over time
2. Uncertainties caused by specific conditions of measurement
3. Uncertainties caused by material inhomogeneities

Time-dependent changes are a primary source of random errors   One of the most important indicators of random error is time, with the root cause perhaps being environmental changes over time. Three levels of time-dependent effects are discussed in this section.

Many possible configurations may exist in a laboratory for making measurements   Other sources of uncertainty are related to measurement configurations within the laboratory. Measurements on test items are usually made on a single day, with a single operator, on a single instrument, etc. If the intent of the uncertainty is to characterize all measurements made in the laboratory, the uncertainty should account for any differences due to:
1. instruments
2. operators
3. geometries
4. other

Examples of causes of differences within a laboratory   Examples of causes of differences within a well-maintained laboratory are:
1. Differences among instruments for measurements of derived units, such as sheet resistance of silicon, where the instruments cannot be directly calibrated to a reference base
2. Differences among operators for optical measurements that are not automated and depend strongly on operator sightings
3. Differences among geometrical or electrical configurations of the instrumentation

Calibrated instruments do not fall in this class   Calibrated instruments do not normally fall in this class because uncertainties associated with the instrument's calibration are reported as type B evaluations, and the instruments in the laboratory should agree within the calibration uncertainties. Instruments whose responses are not directly calibrated to the defined unit are candidates for type A evaluations. This covers situations in which the measurement is defined by a test procedure or standard practice using a specific instrument type.

Evaluation depends on the context for the uncertainty   How these differences are treated depends primarily on the context for the uncertainty statement. The differences, depending on the context, will be treated either as random differences, or as bias differences.

Uncertainties due to inhomogeneities   Artifacts, electrical devices, and chemical substances, etc. can be inhomogeneous relative to the quantity that is being characterized by the measurement process. If this fact is known beforehand, it may be possible to measure the artifact very carefully at a specific site and then direct the user to also measure at this site. In this case, there is no contribution to measurement uncertainty from inhomogeneity. However, this is not always possible, and measurements may be destructive. As an example, compositions of chemical compounds may vary from bottle to bottle. If the reported value for the lot is established from measurements on a few bottles drawn at random from the lot, this variability must be taken into account in the uncertainty statement.

Methods for testing for inhomogeneity and assessing the appropriate uncertainty are discussed on another page.




2.5.3.1.1. Type A evaluations of time-dependent effects

Time-dependent changes are a primary source of random errors   One of the most important indicators of random error is time. Effects not specifically studied, such as environmental changes, exhibit themselves over time. Three levels of time-dependent errors are discussed in this section. These can be usefully characterized as:
1. Level-1 or short-term errors (repeatability, imprecision)
2. Level-2 or day-to-day errors (reproducibility)
3. Level-3 or long-term errors (stability - which may not be a concern for all processes)

Day-to-day errors can be dominant source of uncertainty   With instrumentation that is exceedingly precise in the short run, the changes over time, often caused by small environmental effects, are frequently the dominant source of uncertainty in the measurement process. The uncertainty statement is not 'true' to its purpose if it describes a situation that cannot be reproduced over time. The customer for the uncertainty is entitled to know the range of possible results for the measurement result, independent of the day or time of year when the measurement was made.

Two levels may be sufficient   Two levels of time-dependent errors are probably sufficient for describing the majority of measurement processes. Three levels may be needed for new measurement processes or processes whose characteristics are not well understood.




Measurements on test item are used to assess uncertainty only when no other data are available   Repeated measurements on the test item generally do not cover a sufficient time period to capture day-to-day changes in the measurement process. The standard deviation of these measurements is quoted as the estimate of uncertainty only if no other data are available for the assessment. For J short-term measurements, this standard deviation has v = J - 1 degrees of freedom.

A check standard is the best device for capturing all sources of random error   The best approach for capturing information on time-dependent sources of uncertainties is to intersperse the workload with measurements on a check standard taken at set intervals over the life of the process. The standard deviation of the check standard measurements estimates the overall temporal component of uncertainty directly -- thereby obviating the estimation of individual components.

Nested design for estimating type A uncertainties (Case study: Temporal uncertainty from a 3-level nested design)   A less-efficient method for estimating time-dependent sources of uncertainty is a designed experiment. Measurements can be made specifically for estimating two or three levels of errors. There are many ways to do this, but the easiest method is a nested design where J short-term measurements are replicated on K days and the entire operation is then replicated over L runs (months, etc.). The analysis of these data leads to:
● s1 = standard deviation with (J - 1) degrees of freedom for short-term errors
● s2 = standard deviation with (K - 1) degrees of freedom for day-to-day errors
● s3 = standard deviation with (L - 1) degrees of freedom for very long-term errors

Approaches given in this chapter   The computation of the uncertainty of the reported value for a test item is outlined for situations where temporal sources of uncertainty are estimated from:
1. measurements on the test item itself
2. measurements on a check standard
3. measurements from a 2-level nested design (gauge study)
4. measurements from a 3-level nested design (gauge study)
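As an illustration of the three bulleted standard deviations above, here is a minimal sketch under one possible reading of the nested design; the numbers are made up, and the full gauge-study analysis is given in the Handbook's gauge R & R sections, not here.

    # Sketch: three levels of time-dependent standard deviations.
    import statistics

    reps_one_day = [0.501, 0.503, 0.498, 0.502]            # J = 4 short-term repetitions (made up)
    daily_averages = [0.501, 0.497, 0.503, 0.499, 0.500]   # K = 5 daily averages within a run (made up)
    run_averages = [0.5003, 0.4998, 0.5010]                # L = 3 run averages (made up)

    s1 = statistics.stdev(reps_one_day)      # short-term, J - 1 degrees of freedom
    s2 = statistics.stdev(daily_averages)    # day-to-day, K - 1 degrees of freedom
    s3 = statistics.stdev(run_averages)      # very long-term, L - 1 degrees of freedom
    print(s1, s2, s3)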



2.5.3.1.2. Measurement configuration within the laboratory

Purpose of this page   The purpose of this page is to outline options for estimating uncertainties related to the specific measurement configuration under which the test item is measured, given other possible measurement configurations. Some of these may be controllable and some of them may not, such as:
● instrument
● operator
● temperature
● humidity
The effect of uncontrollable environmental conditions in the laboratory can often be estimated from check standard data taken over a period of time, and methods for calculating components of uncertainty are discussed on other pages. Uncertainties resulting from controllable factors, such as operators or instruments chosen for a specific measurement, are discussed on this page.

First, decide on context for uncertainty   The approach depends primarily on the context for the uncertainty statement. For example, if instrument effect is the question, one approach is to regard, say, the instruments in the laboratory as a random sample of instruments of the same type and to compute an uncertainty that applies to all results regardless of the particular instrument on which the measurements are made. The other approach is to compute an uncertainty that applies to results using a specific instrument.

Next, evaluate whether or not there are differences   To treat instruments as a random source of uncertainty requires that we first determine if differences due to instruments are significant. The same can be said for operators, etc.

Plan for collecting data   To evaluate the measurement process for instruments, select a random sample of I (I > 4) instruments from those available. Make measurements on Q (Q > 2) artifacts with each instrument.

Graph showing differences among instruments   For a graphical analysis, differences from the average for each artifact can be plotted versus artifact, with instruments individually identified by a special plotting symbol. The plot is examined to determine if some instruments always read high or low relative to the other instruments and if this behavior is consistent across artifacts. If there are systematic and significant differences among instruments, a type A uncertainty for instruments is computed. Notice that in the graph for resistivity probes, there are differences among the probes with probes #4 and #5, for example, consistently reading low relative to the other probes. A standard deviation that describes the differences among the probes is included as a component of the uncertainty.

Standard deviation for instruments   Given the measurements

    Y_{ij}, \quad i = 1, \ldots, I \; (\mathrm{instruments}), \quad j = 1, \ldots, Q \; (\mathrm{artifacts}),

for each of Q artifacts and I instruments, the pooled standard deviation that describes the differences among instruments is:

    s_{inst} = \sqrt{ \frac{1}{Q} \sum_{j=1}^{Q} s_j^2 }

where

    s_j^2 = \frac{1}{I-1} \sum_{i=1}^{I} ( Y_{ij} - \bar{Y}_{j} )^2   and   \bar{Y}_{j} = \frac{1}{I} \sum_{i=1}^{I} Y_{ij} .

Example of resistivity measurements on silicon wafers   A two-way table of resistivity measurements (ohm.cm) with 5 probes on 5 wafers (identified as: 138, 139, 140, 141, 142) is shown below. Standard deviations for probes with 4 degrees of freedom each are shown for each wafer. The pooled standard deviation over all wafers, with 20 degrees of freedom, is the type A standard deviation for instruments.

                               Wafers
 Probe        138        139        140        141        142
---------------------------------------------------------------
     1    95.1548    99.3118    96.1018   101.1248    94.2593
   281    95.1408    99.3548    96.0805   101.0747    94.2907
   283    95.1493    99.3211    96.0417   101.1100    94.2487
  2062    95.1125    99.2831    96.0492   101.0574    94.2520
  2362    95.0928    99.3060    96.0357   101.0602    94.2148



Std dev   0.02643    0.02612    0.02826    0.03038    0.02711
DF              4          4          4          4          4

Pooled standard deviation = 0.02770      DF = 20
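A minimal sketch (not from the Handbook) that reproduces the per-wafer standard deviations and the pooled type A standard deviation for instruments from the two-way table above.

    # Sketch: pooled standard deviation over the Q = 5 wafers.
    import math
    import statistics

    wafers = ["138", "139", "140", "141", "142"]
    readings = {                       # probe -> readings on wafers 138..142 (ohm.cm)
           "1": [95.1548, 99.3118, 96.1018, 101.1248, 94.2593],
         "281": [95.1408, 99.3548, 96.0805, 101.0747, 94.2907],
         "283": [95.1493, 99.3211, 96.0417, 101.1100, 94.2487],
        "2062": [95.1125, 99.2831, 96.0492, 101.0574, 94.2520],
        "2362": [95.0928, 99.3060, 96.0357, 101.0602, 94.2148],
    }

    per_wafer_sd = []
    for j, wafer in enumerate(wafers):
        column = [readings[p][j] for p in readings]        # 5 probes, 4 degrees of freedom
        per_wafer_sd.append(statistics.stdev(column))
        print(f"wafer {wafer}: std dev = {per_wafer_sd[-1]:.5f}")

    pooled = math.sqrt(sum(s ** 2 for s in per_wafer_sd) / len(per_wafer_sd))
    print(f"pooled standard deviation = {pooled:.5f}")     # about 0.02770, 20 degrees of freedom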

2.5.3.2. Material inhomogeneity


Purpose of this page   The purpose of this page is to outline methods for assessing uncertainties related to material inhomogeneities. Artifacts, electrical devices, and chemical substances, etc. can be inhomogeneous relative to the quantity that is being characterized by the measurement process.

Effect of inhomogeneity on the uncertainty   Inhomogeneity can be a factor in the uncertainty analysis where
1. an artifact is characterized by a single value and the artifact is inhomogeneous over its surface, etc.
2. a lot of items is assigned a single value from a few samples from the lot and the lot is inhomogeneous from sample to sample.
An unfortunate aspect of this situation is that the uncertainty from inhomogeneity may dominate the uncertainty. If the measurement process itself is very precise and in statistical control, the total uncertainty may still be unacceptable for practical purposes because of material inhomogeneities.

Targeted measurements can eliminate the effect of inhomogeneity   It may be possible to measure an artifact very carefully at a specific site and direct the user to also measure at this site. In this case there is no contribution to measurement uncertainty from inhomogeneity.

Example   Silicon wafers are doped with boron to produce desired levels of resistivity (ohm.cm). Manufacturing processes for semiconductors are not yet capable (at least at the time this was originally written) of producing 2" diameter wafers with constant resistivity over the surfaces. However, because measurements made at the center of a wafer by a certification laboratory can be reproduced in the industrial setting, the inhomogeneity is not a factor in the uncertainty analysis -- as long as only the center-point of the wafer is used for future measurements.



Random inhomogeneities   Random inhomogeneities are assessed using statistical methods for quantifying random errors. An example of inhomogeneity is a chemical compound which cannot be sufficiently homogenized with respect to isotopes of interest. Isotopic ratio determinations, which are destructive, must be determined from measurements on a few bottles drawn at random from the lot.

Best strategy   The best strategy is to draw a sample of bottles from the lot for the purpose of identifying and quantifying between-bottle variability. These measurements can be made with a method that lacks the accuracy required to certify isotopic ratios, but is precise enough to allow between-bottle comparisons. A second sample is drawn from the lot and measured with an accurate method for determining isotopic ratios, and the reported value for the lot is taken to be the average of these determinations. There are therefore two components of uncertainty assessed:
1. component that quantifies the imprecision of the average
2. component that quantifies how much an individual bottle can deviate from the average.

Systematic inhomogeneities   Systematic inhomogeneities require a somewhat different approach. Roughness can vary systematically over the surface of a 2" square metal piece lathed to have a specific roughness profile. The certification laboratory can measure the piece at several sites, but unless it is possible to characterize roughness as a mathematical function of position on the piece, inhomogeneity must be assessed as a source of uncertainty.

Best strategy   In this situation, the best strategy is to compute the reported value as the average of measurements made over the surface of the piece and assess an uncertainty for departures from the average. The component of uncertainty can be assessed by one of several methods for evaluating bias -- depending on the type of inhomogeneity.

Standard deviation for inhomogeneity   The simplest approach to the computation of uncertainty for systematic inhomogeneity is to compute the maximum deviation from the reported value and, assuming a uniform, normal or triangular distribution for the distribution of inhomogeneity, compute the appropriate standard deviation. Sometimes the approximate shape of the distribution can be inferred from the inhomogeneity measurements. The standard deviation for inhomogeneity assuming a uniform distribution is:

    u_{inhomogeneity} = \max|\mathrm{deviation}| \, / \, \sqrt{3}



2.5.3.2.1. Data collection and analysis

Purpose of this page   The purpose of this page is to outline methods for:
● collecting data
● testing for inhomogeneity
● quantifying the component of uncertainty

Balanced measurements at 2-levels   The simplest scheme for identifying and quantifying the effect of inhomogeneity of a measurement result is a balanced (equal number of measurements per cell) 2-level nested design. For example, K bottles of a chemical compound are drawn at random from a lot and J (J > 1) measurements are made per bottle. The measurements are denoted by

    Y_{kj}

where the k index runs over bottles and the j index runs over repetitions within a bottle.

Analysis of measurements   The between (bottle) variance is calculated using an analysis of variance technique that is repeated here for convenience.

Between bottle variance may be negative   If this variance is negative, there is no contribution to uncertainty, and the bottles are equivalent with regard to their chemical compositions. Even if the variance is positive, inhomogeneity still may not be statistically significant, in which case it is not required to be included as a component of the uncertainty.

If the between-bottle variance is statistically significant (i.e., judged to be greater than zero), then inhomogeneity contributes to the uncertainty of the reported value.

Certification, reported value and associated uncertainty   The purpose of assessing inhomogeneity is to be able to assign a value to the entire batch based on the average of a few bottles, and the determination of inhomogeneity is usually made by a less accurate method than the certification method. The reported value for the batch would be the average of N repetitions on Q bottles using the certification method.

The uncertainty calculation is summarized below for the case where the only contribution to uncertainty from the measurement method itself is the repeatability standard deviation, s1, associated with the certification method. For more complicated scenarios, see the pages on uncertainty budgets.

If there is no contribution from inhomogeneity, the standard deviation of the reported value reflects only the repeatability of the certification measurements. If the inhomogeneity component is positive, we need to distinguish two cases and their interpretations:
1. The standard deviation that accounts for between-bottle variability leads to an interval that covers the difference between the reported value and the average for a bottle selected at random from the batch.
2. The standard deviation that also accounts for the repeatability of a single measurement allows one to test the instrument using a single measurement. The prediction interval covers the difference between the reported value and a single measurement, made with the same precision as the certification measurements, on a bottle selected at random from the batch. This is appropriate when the instrument under test is similar to the certification instrument. If the difference is not within the interval, the user's instrument is in need of calibration.
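A minimal sketch of the between-bottle variance computation for the balanced 2-level design above, using the standard balanced one-way ANOVA estimator; the data are made-up numbers for illustration, and the notation is this sketch's, not the Handbook's.

    # Sketch: between-bottle variance component from K bottles, J repetitions each.
    import statistics

    bottles = [                      # K = 4 bottles, J = 3 repetitions each (made up)
        [10.02, 10.05, 10.03],
        [10.10, 10.08, 10.11],
        [ 9.98,  9.97, 10.00],
        [10.04, 10.06, 10.05],
    ]
    K, J = len(bottles), len(bottles[0])

    grand_mean = statistics.mean(y for b in bottles for y in b)
    bottle_means = [statistics.mean(b) for b in bottles]

    ms_between = J * sum((m - grand_mean) ** 2 for m in bottle_means) / (K - 1)
    ms_within = sum((y - m) ** 2 for b, m in zip(bottles, bottle_means) for y in b) / (K * (J - 1))

    s2_inh = (ms_between - ms_within) / J    # between-bottle variance component
    print(f"between-bottle variance component: {s2_inh:.6f}")
    # A negative value means no detectable contribution from inhomogeneity.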




Relationship to prediction intervals   When the standard deviation for inhomogeneity is included in the calculation, as in the last two cases above, the uncertainty interval becomes a prediction interval (Hahn & Meeker) and is interpreted as characterizing a future measurement on a bottle drawn at random from the lot.

2.5.3.3. Type A evaluations of bias


Sources of bias relate to the specific measurement environment   The sources of bias discussed on this page cover specific measurement configurations. Measurements on test items are usually made on a single day, with a single operator, with a single instrument, etc. Even if the intent of the uncertainty is to characterize only those measurements made in one specific configuration, the uncertainty must account for any significant differences due to:
1. instruments
2. operators
3. geometries
4. other

Calibrated instruments do not fall in this class   Calibrated instruments do not normally fall in this class because uncertainties associated with the instrument's calibration are reported as type B evaluations, and the instruments in the laboratory should agree within the calibration uncertainties. Instruments whose responses are not directly calibrated to the defined unit are candidates for type A evaluations. This covers situations where the measurement is defined by a test procedure or standard practice using a specific instrument type.

The best strategy is to correct for bias and compute the uncertainty of the correction   This problem was treated on the foregoing page as an analysis of random error for the case where the uncertainty was intended to apply to all measurements for all configurations. If measurements for only one configuration are of interest, such as measurements made with a specific instrument, or if a smaller uncertainty is required, the differences among, say, instruments are treated as biases. The best strategy in this situation is to correct all measurements made with a specific instrument to the average for the instruments in the laboratory and compute a type A uncertainty for the correction. This strategy, of course, relies on the assumption that the instruments in the laboratory represent a random sample of all instruments of a specific type.



Only limited comparisons can be made among sources of possible bias   However, suppose that it is possible to make comparisons among, say, only two instruments and neither is known to be 'unbiased'. This scenario requires a different strategy because the average will not necessarily be an unbiased result. The best strategy if there is a significant difference between the instruments, and this should be tested, is to apply a 'zero' correction and assess a type A uncertainty of the correction.

Guidelines for treatment of biases   The discussion above is intended to point out that there are many possible scenarios for biases and that they should be treated on a case-by-case basis. A plan is needed for:
● gathering data
● testing for bias (graphically and/or statistically)
● estimating biases
● assessing uncertainties associated with significant biases
caused by:
● instruments
● operators
● configurations, geometries, etc.
● inhomogeneities

Plan for testing for/assessing bias   Measurements needed for assessing biases among instruments, say, requires a random sample of I (I > 1) instruments from those available and measurements on Q (Q > 2) artifacts with each instrument. The same can be said for the other sources of possible bias. General strategies for dealing with significant biases are given in the table below.

Data collection and analysis for assessing biases related to:
● lack of resolution of instrument
● non-linearity of instrument
● drift
are addressed in the section on gauge studies.

Sources of data for evaluating this type of bias   Databases for evaluating bias may be available from:
● check standards
● gauge R and R studies
● control measurements

Strategies for assessing corrections and uncertainties associated with significant biases

Type of bias                     Examples                          Type of correction                Uncertainty
-----------------------------------------------------------------------------------------------------------------
1. Inconsistent                  Sign change (+ to -),             Zero                              Based on maximum bias
                                 varying magnitude
2. Consistent                    Instrument bias ~ same            Bias (for a single instrument)    Standard deviation of
                                 magnitude over many artifacts     = difference from average         correction
                                                                   over several instruments
3. Not correctable because of    Limited testing; e.g., only       Zero                              Standard deviation of
   sparse data - consistent      2 instruments, operators,                                           correction
   or inconsistent               configurations, etc.
4. Not correctable -             Lack of resolution,               Zero                              Based on maximum bias
   consistent                    non-linearity, drift,
                                 material inhomogeneity

Strategy for no significant bias   If there is no significant bias over time, there is no correction and no contribution to uncertainty.



2.5.3.3.1. Inconsistent bias

Strategy for inconsistent bias -- apply a zero correction   If there is significant bias but it changes direction over time, a zero correction is assumed and the standard deviation of the correction is reported as a type A uncertainty; namely,

    s_{correction} = \max|\mathrm{bias}| \, / \, \sqrt{3}

Computations based on uniform or normal distribution   The equation for estimating the standard deviation of the correction assumes that biases are uniformly distributed between {-max |bias|, + max |bias|}. This assumption is quite conservative. It gives a larger uncertainty than the assumption that the biases are normally distributed. If normality is a more reasonable assumption, substitute the number '3' for the 'square root of 3' in the equation above.

Example of change in bias over time   The results of resistivity measurements with five probes on five silicon wafers are shown below for probe #283, which is the probe of interest at this level with the artifacts being 1 ohm.cm wafers. The bias for probe #283 is negative for run 1 and positive for run 2 with the runs separated by a two-month time period. The correction is taken to be zero.

Table of biases (ohm.cm) for probe 283

  Wafer   Probe        Run 1        Run 2
------------------------------------------
     11     283    0.0000340   -0.0001841
     26     283   -0.0001000    0.0000861
     42     283    0.0000181    0.0000781
    131     283   -0.0000701    0.0001580
    208     283   -0.0000240    0.0001879
Average     283   -0.0000284    0.0000652

A conservative assumption is that the bias could fall somewhere within the limits ± a, with a = maximum bias or 0.0000652 ohm.cm. The standard deviation of the correction is included as a type A systematic component of the uncertainty.

http://www.itl.nist.gov/div898/handbook/mpc/section5/mpc5331.htm (1 of 2) [11/13/2003 5:38:37 PM] http://www.itl.nist.gov/div898/handbook/mpc/section5/mpc5331.htm (2 of 2) [11/13/2003 5:38:37 PM]
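The computation above is simple enough to check numerically. The following sketch (plain Python for illustration, not part of the Handbook's Dataplot material) reproduces it from the two run averages in the table, under the uniform-distribution convention described above.

    import math

    # Run averages of the biases for probe #283 (ohm.cm), from the table above
    avg_bias_run1 = -0.0000284
    avg_bias_run2 =  0.0000652

    # Conservative limits +/- a, with a = maximum observed average bias
    a = max(abs(avg_bias_run1), abs(avg_bias_run2))

    # Standard deviation of the zero correction, assuming biases are uniformly
    # distributed on (-a, +a); use a/3 instead under a normality assumption
    s_correction_uniform = a / math.sqrt(3)
    s_correction_normal = a / 3.0

    print(f"a = {a:.7f} ohm.cm")
    print(f"s(correction), uniform assumption = {s_correction_uniform:.7f} ohm.cm")
    print(f"s(correction), normal assumption  = {s_correction_normal:.7f} ohm.cm")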


2.5.3.3.2. Consistent bias

Consistent bias: Bias that is significant and persists consistently over time for a specific instrument, operator, or configuration should be corrected if it can be reliably estimated from repeated measurements. Results with the instrument of interest are then corrected to:

    Corrected result = Measurement - Estimate of bias

The example below shows how bias can be identified graphically from measurements on five artifacts with five instruments and estimated from the differences among the instruments.

Graph showing consistent bias for probe #5: An analysis of bias for five instruments based on measurements on five artifacts shows differences from the average for each artifact plotted versus artifact, with instruments individually identified by a special plotting symbol. The plot is examined to determine if some instruments always read high or low relative to the other instruments, and if this behavior is consistent across artifacts. Notice that on the graph for resistivity probes, probe #2362 (#5 on the graph), which is the instrument of interest for this measurement process, consistently reads low relative to the other probes. This behavior is consistent over 2 runs that are separated by a two-month time period.

Strategy -- correct for bias: Because there is significant and consistent bias for the instrument of interest, the measurements made with that instrument should be corrected for its average bias relative to the other instruments.

Computation of bias: Given the measurements Y(i, q) on Q artifacts (q = 1, ..., Q) with I instruments (i = 1, ..., I), the average bias for instrument I', say, is

    B(I') = (1/Q) * SUM over q of [ Y(I', q) - Ybar(q) ]

where Ybar(q) is the average over the I instruments for artifact q.

Computation of correction: The correction that should be made to measurements made with instrument I' is the negative of its average bias; i.e., Correction = -B(I').

Type A uncertainty of the correction: The type A uncertainty of the correction is the standard deviation of the average bias, or

    s(correction) = s(bias) / sqrt(Q)

where s(bias) is the standard deviation of the Q individual biases for instrument I'.

Example of consistent bias for probe #2362 used to measure resistivity of silicon wafers: The table below comes from the table of resistivity measurements from a type A analysis of random effects, with the average for each wafer subtracted from each measurement. The differences, as shown, represent the biases for each probe with respect to the other probes. Probe #2362 has an average bias, over the five wafers, of -0.02724 ohm.cm. If measurements made with this probe are corrected for this bias, the standard deviation of the correction is a type A uncertainty.

Table of biases for probes and silicon wafers (ohm.cm)

                              Wafers
    Probe      138        139        140        141        142
    ------------------------------------------------------------
       1     0.02476   -0.00356    0.04002    0.03938    0.00620
     181     0.01076    0.03944    0.01871   -0.01072    0.03761
     182     0.01926    0.00574   -0.02008    0.02458   -0.00439
    2062    -0.01754   -0.03226   -0.01258   -0.02802   -0.00110
    2362    -0.03725   -0.00936   -0.02608   -0.02522   -0.03830

    Average bias for probe #2362      = -0.02724
    Standard deviation of bias        =  0.01171 with 4 degrees of freedom
    Standard deviation of correction  =  0.01171/sqrt(5) = 0.00523

Note on different approaches to instrument bias: The analysis on this page considers the case where only one instrument is used to make the certification measurements, namely probe #2362, and the certified values are corrected for bias due to this probe. The analysis in the section on type A analysis of random effects considers the case where any one of the probes could be used to make the certification measurements.
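As a cross-check on the numbers quoted above, the following Python sketch (illustrative only; the wafer biases are copied from the table) recomputes the average bias for probe #2362, its standard deviation, and the standard deviation of the correction.

    import math
    import statistics

    # Biases of probe #2362 with respect to the wafer averages (ohm.cm), from the table
    biases_2362 = [-0.03725, -0.00936, -0.02608, -0.02522, -0.03830]

    avg_bias = statistics.mean(biases_2362)               # estimate of the consistent bias
    s_bias = statistics.stdev(biases_2362)                # standard deviation, 4 degrees of freedom
    s_correction = s_bias / math.sqrt(len(biases_2362))   # type A uncertainty of the correction

    print(f"average bias          = {avg_bias:.5f} ohm.cm")     # about -0.02724
    print(f"std dev of bias       = {s_bias:.5f} ohm.cm")       # about  0.01171
    print(f"std dev of correction = {s_correction:.5f} ohm.cm") # about  0.00523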


2.5.3.3.3. Bias with sparse data

Strategy for dealing with limited data: The purpose of this discussion is to outline methods for dealing with biases that may be real but which cannot be estimated reliably because of the sparsity of the data. For example, a test between two, of many possible, configurations of the measurement process cannot produce a reliable enough estimate of bias to permit a correction, but it can reveal problems with the measurement process. The strategy for a significant bias is to apply a 'zero' correction. The type A uncertainty component is the standard deviation of the correction, and the calculation depends on whether the bias is
● inconsistent
● consistent

Example of differences among wiring settings: An example is given of a study of wiring settings for a single gauge. The gauge, a 4-point probe for measuring resistivity of silicon wafers, can be wired in several ways. Because it was not possible to test all wiring configurations during the gauge study, measurements were made in only two configurations as a way of identifying possible problems.

Data on wiring configurations: Measurements were made on six wafers over six days (except for 5 measurements on wafer 39) with probe #2062 wired in two configurations. This sequence of measurements was repeated after about a month, resulting in two runs. A database of differences between measurements in the two configurations on the same day is analyzed for significance.

Run software macro for plotting differences between the 2 wiring configurations: A plot of the differences between the 2 configurations shows that the differences for run 1 are, for the most part, < zero, and the differences for run 2 are > zero. The following Dataplot commands produce the plot:

    dimension 500 30
    read mpc536.dat wafer day probe d1 d2
    let n = count probe
    let t = sequence 1 1 n
    let zero = 0 for i = 1 1 n
    lines dotted blank blank
    characters blank 1 2
    x1label = DIFFERENCES BETWEEN 2 WIRING CONFIGURATIONS
    x2label SEQUENCE BY WAFER AND DAY
    plot zero d1 d2 vs t
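For readers without Dataplot, an equivalent plot can be sketched in Python. The snippet below is illustrative only and assumes that mpc536.dat contains just the five whitespace-delimited columns read by the Dataplot macro (wafer, day, probe, d1, d2).

    import numpy as np
    import matplotlib.pyplot as plt

    # Assumes mpc536.dat holds whitespace-delimited columns: wafer, day, probe, d1, d2
    wafer, day, probe, d1, d2 = np.loadtxt("mpc536.dat", unpack=True)
    t = np.arange(1, len(d1) + 1)

    plt.plot(t, d1, "o", label="run 1 differences")
    plt.plot(t, d2, "s", label="run 2 differences")
    plt.axhline(0.0, linestyle="--", color="gray")   # reference line at zero
    plt.xlabel("Sequence by wafer and day")
    plt.title("Differences between 2 wiring configurations")
    plt.legend()
    plt.show()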


Run software macro for making t-test: The following Dataplot commands

    let dff = n-1
    let avgrun1 = average d1
    let avgrun2 = average d2
    let sdrun1 = standard deviation d1
    let sdrun2 = standard deviation d2
    let t1 = ((n-1)**.5)*avgrun1/sdrun1
    let t2 = ((n-1)**.5)*avgrun2/sdrun2
    print avgrun1 sdrun1 t1
    print avgrun2 sdrun2 t2
    let tcrit=tppf(.975,dff)

reproduce the statistical tests in the table.

    PARAMETERS AND CONSTANTS--

    AVGRUN1 -- -0.3834483E-02
    SDRUN1  --  0.5145197E-02
    T1      -- -0.4013319E+01

    PARAMETERS AND CONSTANTS--

    AVGRUN2 --  0.4886207E-02
    SDRUN2  --  0.4004259E-02
    T2      --  0.6571260E+01

Statistical test for difference between 2 configurations: A t-statistic is used as an approximate test where we are assuming the differences are approximately normal. The average difference and standard deviation of the difference are required for this test. If

    |t| = sqrt(N-1) * |average difference| / (std dev of differences)  >  t(0.975, N-1)

the difference between the two configurations is statistically significant. The average and standard deviation computed from the N = 29 differences in each run from the table above are shown along with corresponding t-values, which confirm that the differences are significant, but in opposite directions, for both runs.

Average differences between wiring configurations

    Run    Probe     Average     Std dev    N      t
     1      2062    -0.00383     0.00514    29    -4.0
     2      2062    +0.00489     0.00400    29    +6.6

Case of inconsistent bias: The data reveal a significant wiring bias for both runs that changes direction between runs. Because of this inconsistency, a 'zero' correction is applied to the results, and the type A uncertainty is taken to be

    s(correction) = max |average difference| / sqrt(3)

For this study, the type A uncertainty for wiring bias is

    s(correction) = 0.00489 / sqrt(3) = 0.00282 ohm.cm
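Here is an equivalent computation in Python. It is an illustrative sketch that starts from the rounded summary statistics in the table above rather than from the raw mpc536.dat differences, so the t-values may differ slightly in the last digit from the Dataplot output.

    import math
    from scipy import stats

    N = 29  # number of same-day differences in each run

    # Average and standard deviation of the differences, from the table above
    runs = {1: (-0.00383, 0.00514), 2: (+0.00489, 0.00400)}

    tcrit = stats.t.ppf(0.975, N - 1)        # two-sided 5 % critical value
    for run, (avg, sd) in runs.items():
        t = math.sqrt(N - 1) * avg / sd      # same statistic as the Dataplot macro
        print(f"run {run}: t = {t:+.1f}  (significant if |t| > {tcrit:.2f})")

    # The bias changes sign between runs, so a zero correction is applied and the
    # type A uncertainty is based on the maximum average difference:
    s_correction = max(abs(avg) for avg, sd in runs.values()) / math.sqrt(3)
    print(f"type A uncertainty for wiring bias = {s_correction:.5f} ohm.cm")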


Case of consistent bias: Even if the bias is consistent over time, a 'zero' correction is applied to the results, and for a single run, the estimated standard deviation of the correction is computed from the average difference for that run. For two runs (1 and 2), the estimated standard deviation of the correction is computed from the average differences for both runs.

2.5.4. Type B evaluations


Type B evaluations apply to both error and bias: Type B evaluations can apply to both random error and bias. The distinguishing feature is that the calculation of the uncertainty component is not based on a statistical analysis of data. The distinction to keep in mind with regard to random error and bias is that:
● random errors cannot be corrected
● biases can, theoretically at least, be corrected or eliminated from the result.

Sources of type B evaluations: Some examples of sources of uncertainty that lead to type B evaluations are:
● Reference standards calibrated by another laboratory
● Physical constants used in the calculation of the reported value
● Environmental effects that cannot be sampled
● Possible configuration/geometry misalignment in the instrument
● Lack of resolution of the instrument

Documented sources of uncertainty from other processes: Documented sources of uncertainty, such as calibration reports for reference standards or published reports of uncertainties for physical constants, pose no difficulties in the analysis. The uncertainty will usually be reported as an expanded uncertainty, U, which is converted to the standard uncertainty,
    u = U/k
If the k factor is not known or documented, it is probably conservative to assume that k = 2.



Sources of uncertainty that are local to the measurement process: Sources of uncertainty that are local to the measurement process but which cannot be adequately sampled to allow a statistical analysis require type B evaluations. One technique, which is widely used, is to estimate the worst-case effect, a, for the source of interest, from
● experience
● scientific judgment
● scant data
A standard deviation, assuming that the effect is two-sided, can then be computed based on a uniform, triangular, or normal distribution of possible effects. Following the Guide to the Expression of Uncertainty of Measurement (GUM), the convention is to assign infinite degrees of freedom to standard deviations derived in this manner.

2.5.4.1. Standard deviations from assumed distributions

Difficulty of obtaining reliable uncertainty estimates: The methods described on this page attempt to avoid the difficulty of allowing for sources of error for which reliable estimates of uncertainty do not exist. The methods are based on assumptions that may, or may not, be valid and require the experimenter to consider the effect of the assumptions on the final uncertainty.

The ISO guidelines do not allow worst-case estimates of bias to be added to the other components, but require that they in some way be converted to equivalent standard deviations. The approach is to consider that any error or bias, for the situation at hand, is a random draw from a known statistical distribution. Then the standard deviation is calculated from known (or assumed) characteristics of the distribution. Distributions that can be considered are:
● Uniform
● Triangular
● Normal (Gaussian)

Standard deviation for a uniform distribution: The uniform distribution leads to the most conservative estimate of uncertainty; i.e., it gives the largest standard deviation. The calculation of the standard deviation is based on the assumption that the end-points, ± a, of the distribution are known. It also embodies the assumption that all effects on the reported value, between -a and +a, are equally likely for the particular source of uncertainty. For this distribution,
    s = a / sqrt(3)

Standard deviation for a triangular distribution: The triangular distribution leads to a less conservative estimate of uncertainty; i.e., it gives a smaller standard deviation than the uniform distribution. The calculation of the standard deviation is based on the assumption that the end-points, ± a, of the distribution are known and the mode of the triangular distribution occurs at zero. For this distribution,
    s = a / sqrt(6)

Standard deviation for a normal distribution: The normal distribution leads to the least conservative estimate of uncertainty; i.e., it gives the smallest standard deviation. The calculation of the standard deviation is based on the assumption that the end-points, ± a, encompass 99.7 percent of the distribution. For this distribution,
    s = a / 3

Degrees of freedom: In the context of using the Welch-Satterthwaite formula with the above distributions, the degrees of freedom is assumed to be infinite.
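As a quick illustration of these conventions, the sketch below (plain Python; the half-width value is arbitrary) converts a worst-case half-width a into a standard uncertainty under each of the three distributional assumptions.

    import math

    def type_b_std_dev(a, distribution="uniform"):
        """Standard deviation for a two-sided worst-case effect of half-width a."""
        if distribution == "uniform":      # all values in (-a, +a) equally likely
            return a / math.sqrt(3)
        if distribution == "triangular":   # mode at zero, end-points at +/- a
            return a / math.sqrt(6)
        if distribution == "normal":       # +/- a covers 99.7 % of the distribution
            return a / 3.0
        raise ValueError("unknown distribution")

    a = 0.005  # hypothetical worst-case effect in the units of the measurement
    for d in ("uniform", "triangular", "normal"):
        print(f"{d:10s}: u = {type_b_std_dev(a, d):.6f}")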



2.5.5. Propagation of error considerations

Top-down approach consists of estimating the uncertainty from direct repetitions of the measurement result: The approach to uncertainty analysis that has been followed up to this point in the discussion has been what is called a top-down approach. Uncertainty components are estimated from direct repetitions of the measurement result. To contrast this with a propagation of error approach, consider the simple example where we estimate the area of a rectangle from replicate measurements of length and width. The area
    area = length x width
can be computed from each replicate. The standard deviation of the reported area is estimated directly from the replicates of area.

Advantages of top-down approach: This approach has the following advantages:
● proper treatment of covariances between measurements of length and width
● proper treatment of unsuspected sources of error that would emerge if measurements covered a range of operating conditions and a sufficiently long time period
● independence from propagation of error model

Propagation of error approach combines estimates from individual auxiliary measurements: The formal propagation of error approach is to compute:
1. standard deviation from the length measurements
2. standard deviation from the width measurements
and combine the two into a standard deviation for area using the approximation for products of two variables (ignoring a possible covariance between length and width),
    s(area)^2 = (length)^2 * s(width)^2 + (width)^2 * s(length)^2


Exact formula: Goodman (1960) derived an exact formula for the variance of a product of two random variables. Given two random variables, x and y (corresponding to width and length in the above approximate formula), the exact formula for the variance is:

    V(xy) = X^2*V(y) + Y^2*V(x) + 2*X*Y*E(1,1) + 2*X*E(1,2) + 2*Y*E(2,1) + E(2,2) - E(1,1)^2

with
● X = E(x) and Y = E(y) (corresponding to width and length, respectively, in the approximate formula)
● V(x) = variance of x and V(y) = variance of y (corresponding to s^2 for width and length, respectively, in the approximate formula)
● E(i,j) = E{ (x - X)^i (y - Y)^j }

To obtain the standard deviation, simply take the square root of the above formula. Also, an estimate of the statistic is obtained by substituting sample estimates for the corresponding population values on the right hand side of the equation.

Approximate formula assumes independence: The approximate formula assumes that length and width are independent. The exact formula allows for dependence between length and width.

Disadvantages of propagation of error approach: In the ideal case, the propagation of error estimate above will not differ from the estimate made directly from the area measurements. However, in complicated scenarios, they may differ because of:
● unsuspected covariances
● disturbances that affect the reported value and not the elementary measurements (usually a result of mis-specification of the model)
● mistakes in propagating the error through the defining formulas

Propagation of error formula: Sometimes the measurement of interest cannot be replicated directly and it is necessary to estimate its uncertainty via propagation of error formulas (Ku). The propagation of error formula for
    Y = f(X, Z, ... )
a function of one or more variables with measurements, X, Z, ..., gives the following estimate for the standard deviation of Y:

    s_Y = sqrt[ (dY/dX)^2 * s_X^2 + (dY/dZ)^2 * s_Z^2 + ... + 2*(dY/dX)*(dY/dZ)*s_XZ + ... ]

where
● s_X is the standard deviation of the X measurements
● s_Z is the standard deviation of the Z measurements
● s_Y is the standard deviation of Y
● dY/dX is the partial derivative of the function Y with respect to X, etc.
● s_XZ is the estimated covariance between the X,Z measurements

Treatment of covariance terms: Covariance terms can be difficult to estimate if measurements are not made in pairs. Sometimes, these terms are omitted from the formula. Guidance on when this is acceptable practice is given below:
1. If the measurements of X, Z are independent, the associated covariance term is zero.
2. Generally, reported values of test items from calibration designs have non-zero covariances that must be taken into account if Y is a summation such as the mass of two weights, or the length of two gage blocks end-to-end, etc.
3. Practically speaking, covariance terms should be included in the computation only if they have been estimated from sufficient data. See Ku (1966) for guidance on what constitutes sufficient data.

Sensitivity coefficients: The partial derivatives are the sensitivity coefficients for the associated components.

Examples of propagation of error analyses: Examples of propagation of error that are shown in this chapter are:
● Case study of propagation of error for resistivity measurements
● Comparison of check standard analysis and propagation of error for linear calibration
● Propagation of error for quadratic calibration showing effect of covariance terms

Specific formulas: Formulas for specific functions can be found in the following sections:
● functions of a single variable
● functions of two variables
● functions of many variables
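The rectangle example lends itself to a small numerical check. The sketch below (illustrative Python with simulated measurements, not Handbook data) computes the standard deviation of the area both ways: directly from replicate areas (top-down) and from the propagation of error approximation for a product.

    import numpy as np

    rng = np.random.default_rng(seed=1)

    # Simulated replicate measurements of length and width (arbitrary units)
    length = rng.normal(10.0, 0.05, size=50)
    width = rng.normal(5.0, 0.03, size=50)

    # Top-down: compute the area for each replicate and take its standard deviation
    area = length * width
    s_area_topdown = area.std(ddof=1)

    # Propagation of error for a product, ignoring the covariance term:
    # s(area)^2 ~ (mean width)^2 * s(length)^2 + (mean length)^2 * s(width)^2
    s_area_prop = np.sqrt(width.mean()**2 * length.var(ddof=1)
                          + length.mean()**2 * width.var(ddof=1))

    print(f"top-down estimate            : {s_area_topdown:.4f}")
    print(f"propagation of error estimate: {s_area_prop:.4f}")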


2.5.5.1. Formulas for functions of one variable

Case: Y = f(X): Standard deviations of reported values that are functions of a single variable are reproduced from a paper by H. Ku (Ku). The reported value, Y, is a function of the average of N measurements on a single variable.

The formulas are tabulated by: function of X-bar; standard deviation of Y; and notes, where s_X is the standard deviation of X and X-bar is an average of N measurements. For some entries the tabulated approximation could be seriously in error if n is small, or is not directly derived from the formulas; note that we need to assume that the original data follow an approximately normal distribution.
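As an illustration of the single-variable case (my own example with made-up numbers, not one of Ku's tabulated entries), the sketch below propagates uncertainty through Y = exp(X-bar) using the first-order approximation s(Y) = |dY/dX| * s_X / sqrt(N).

    import math

    # Hypothetical summary statistics for N repeated measurements of X
    N = 10
    x_bar = 2.5       # average of the N measurements
    s_x = 0.04        # standard deviation of a single measurement

    # Reported value and its propagated standard deviation for Y = exp(X-bar)
    y = math.exp(x_bar)
    s_y = abs(math.exp(x_bar)) * s_x / math.sqrt(N)   # |dY/dX| evaluated at x_bar

    print(f"Y = {y:.4f}, s(Y) = {s_y:.4f}")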


2.5.5.2. Formulas for functions of two variables

Case: Y = f(X, Z): Standard deviations of reported values that are functions of measurements on two variables are reproduced from a paper by H. Ku (Ku). The reported value, Y, is a function of averages of N measurements on two variables.

The formulas are tabulated by: function of X-bar and Z-bar; standard deviation of Y; and notes, where s_X is the standard deviation of X, s_Z is the standard deviation of Z, s_XZ is the covariance of X,Z, and X-bar and Z-bar are averages of N measurements. Note: the covariance term is to be included only if there is a reliable estimate. For the product function, the tabulated standard deviation is an approximation; the exact result could be obtained starting from the exact formula for the standard deviation of a product derived by Goodman (1960).
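A compact numerical illustration of the two-variable case, including the covariance term (synthetic numbers, not from the Handbook), for the product Y = X-bar * Z-bar:

    import math

    # Hypothetical summary statistics for averages of N paired measurements
    x_bar, z_bar = 10.0, 5.0
    s_xbar, s_zbar = 0.02, 0.015   # standard deviations of the averages
    s_xz_bar = 0.0001              # covariance of the averages (from paired data)

    # Propagation of error for the product Y = X-bar * Z-bar, keeping the covariance term
    var_y = (z_bar**2 * s_xbar**2 + x_bar**2 * s_zbar**2
             + 2 * x_bar * z_bar * s_xz_bar)
    print(f"s(Y) = {math.sqrt(var_y):.4f}")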


2.5.5.3. Propagation of error for many variables

Simplification for dealing with many variables: Propagation of error for several variables can be simplified considerably if:
● The function, Y, is a simple multiplicative function of secondary variables
● Uncertainty is evaluated as a percentage

Example of three variables: For three variables, X, Z, W, the function
    Y = X * Z * W
has a standard deviation in absolute units of
    s(Y) = sqrt( (Z*W)^2 * s(X)^2 + (X*W)^2 * s(Z)^2 + (X*Z)^2 * s(W)^2 )
In % units, the standard deviation can be written as
    ( s(Y)/Y )^2 = ( s(X)/X )^2 + ( s(Z)/Z )^2 + ( s(W)/W )^2
if all covariances are negligible. These formulas are easily extended to more than three variables.

Software can simplify propagation of error: Propagation of error for more complicated functions can be done reliably with software capable of algebraic representations such as Mathematica (Wolfram).

Example from fluid flow of non-linear function: For example, discharge coefficients for fluid flow are computed from the following equation (Whetstone et al.)
    Cd = m (1 - (d/D)^4)^(1/2) / (K d^2 F p^(1/2) delp^(1/2))
where d is the orifice diameter and p is the pressure (the derivatives with respect to these two quantities are shown below).

Representation of the defining equation: The defining equation is input as
    Cd=m(1 - (d/D)^4)^(1/2)/(K d^2 F p^(1/2) delp^(1/2))
and is represented in Mathematica as follows:
    Out[1]=  Sqrt[1 - d^4/D^4] m / (d^2 F K Sqrt[delp] Sqrt[p])

Partial derivatives -- first partial derivative with respect to orifice diameter: Partial derivatives are derived via the function D where, for example,
    D[Cd, {d,1}]
indicates the first partial derivative of the discharge coefficient with respect to orifice diameter, and the result returned by Mathematica is
    Out[2]=  -2 Sqrt[1 - d^4/D^4] m / (d^3 F K Sqrt[delp] Sqrt[p])
             - 2 d m / (Sqrt[1 - d^4/D^4] D^4 F K Sqrt[delp] Sqrt[p])
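An open-source alternative to the Mathematica session is sketched below using the Python library sympy (an illustration, not part of the Handbook's examples); it reproduces the symbolic partial derivatives and shows how they could be combined into a propagated standard deviation for d and p treated as uncorrelated.

    import sympy as sp

    # Symbols appearing in the discharge-coefficient equation (Whetstone et al.)
    m, d, D, K, F, p, delp = sp.symbols("m d D K F p delp", positive=True)

    # Defining equation: Cd = m*(1 - (d/D)^4)^(1/2) / (K*d^2*F*p^(1/2)*delp^(1/2))
    Cd = m * sp.sqrt(1 - (d / D)**4) / (K * d**2 * F * sp.sqrt(p) * sp.sqrt(delp))

    # Partial derivatives (the analogue of Mathematica's D[Cd, {d,1}] and D[Cd, {p,1}])
    dCd_dd = sp.simplify(sp.diff(Cd, d))
    dCd_dp = sp.simplify(sp.diff(Cd, p))
    print(dCd_dd)
    print(dCd_dp)

    # Propagation of error for uncorrelated d and p only (other variables held fixed)
    s_d, s_p = sp.symbols("s_d s_p", positive=True)
    s_Cd = sp.sqrt(dCd_dd**2 * s_d**2 + dCd_dp**2 * s_p**2)
    print(sp.simplify(s_Cd))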


First partial derivative with respect to pressure: Similarly, the first partial derivative of the discharge coefficient with respect to pressure is represented by
    D[Cd, {p,1}]
with the result
    Out[3]=  -(Sqrt[1 - d^4/D^4] m) / (2 d^2 F K Sqrt[delp] p^(3/2))

Comparison of check standard analysis and propagation of error: The software can also be used to combine the partial derivatives with the appropriate standard deviations, and then the standard deviation for the discharge coefficient can be evaluated and plotted for specific values of the secondary variables.

2.5.6. Uncertainty budgets and sensitivity coefficients

Case study showing uncertainty budget: Uncertainty components are listed in a table along with their corresponding sensitivity coefficients, standard deviations and degrees of freedom. A table of typical entries illustrates the concept.

Typical budget of type A and type B uncertainty components

    Type A components                                     Sensitivity coefficient   Standard deviation   Degrees of freedom
    1. Time (repeatability)                               a1                        s1                   v1
    2. Time (reproducibility)                             a2                        s2                   v2
    3. Time (long-term)                                   a3                        s3                   v3
    Type B components
    5. Reference standard (nominal test / nominal ref)    a4                        s4                   v4

Sensitivity coefficients show how components are related to result: The sensitivity coefficient shows the relationship of the individual uncertainty component to the standard deviation of the reported value for a test item. The sensitivity coefficient relates to the result that is being reported and not to the method of estimating uncertainty components, where the uncertainty, u, is
    u = sqrt( a1^2 s1^2 + a2^2 s2^2 + ... )


Sensitivity coefficients for type A components of uncertainty: This section defines sensitivity coefficients that are appropriate for type A components estimated from repeated measurements. The pages on type A evaluations, particularly the pages related to estimation of repeatability and reproducibility components, should be reviewed before continuing on this page. The convention for the notation for sensitivity coefficients for this section is that:
1. a1 refers to the sensitivity coefficient for the repeatability standard deviation,
2. a2 refers to the sensitivity coefficient for the reproducibility standard deviation,
3. a3 refers to the sensitivity coefficient for the stability standard deviation,
with some of the coefficients possibly equal to zero.

Note on long-term errors: Even if no day-to-day nor run-to-run measurements were made in determining the reported value, the sensitivity coefficient is non-zero if that standard deviation proved to be significant in the analysis of data.

Sensitivity coefficients for other type A components of random error: Procedures for estimating differences among instruments, operators, etc., which are treated as random components of uncertainty in the laboratory, show how to estimate the standard deviations so that the sensitivity coefficients = 1.

Sensitivity coefficients for type A components for bias: This Handbook follows the ISO guidelines in that biases are corrected (correction may be zero), and the uncertainty component is the standard deviation of the correction. Procedures for dealing with biases show how to estimate the standard deviation of the correction so that the sensitivity coefficients are equal to one.

Sensitivity coefficients for specific applications: The following pages outline methods for computing sensitivity coefficients where the components of uncertainty are derived in the following manner:
1. From measurements on the test item itself
2. From measurements on a check standard
3. From measurements in a 2-level design
4. From measurements in a 3-level design
and give an example of an uncertainty budget with sensitivity coefficients from a 3-level design.

Sensitivity coefficients for type B evaluations: The majority of sensitivity coefficients for type B evaluations will be one, with a few exceptions. The sensitivity coefficient for the uncertainty of a reference standard is the nominal value of the test item divided by the nominal value of the reference standard.

Case study -- sensitivity coefficients for propagation of error: If the uncertainty of the reported value is calculated from propagation of error, the sensitivity coefficients are the multipliers of the individual variance terms in the propagation of error formula. Formulas are given for selected functions of:
1. functions of a single variable
2. functions of two variables
3. several variables


2.5.6.1. Sensitivity coefficients for measurements on the test item

From data on the test item itself: If the temporal component is estimated from N short-term readings on the test item itself,
    Y1, Y2, ..., YN
and the reported value is the average, the standard deviation of the reported value is
    s = s1 / sqrt(N)
with degrees of freedom v = N - 1, where s1 is the standard deviation of the N readings.

Sensitivity coefficients: The sensitivity coefficient is a1 = sqrt(1/N). The risk in using this method is that it may seriously underestimate the uncertainty.

To improve the reliability of the uncertainty calculation: If possible, the measurements on the test item should be repeated over M days and averaged to estimate the reported value. The standard deviation for the reported value is computed from the daily averages, and the standard deviation for the temporal component is:
    s(days) = sqrt( (1/(M-1)) * SUM over i of [ Ybar(i) - Ygrand ]^2 )
with degrees of freedom v = M - 1, where Ybar(i) are the daily averages and Ygrand is the grand average. The sensitivity coefficients are: a1 = 0; a2 = sqrt(1/M).

Note on long-term errors: Even if no day-to-day nor run-to-run measurements were made in determining the reported value, the sensitivity coefficient is non-zero if that standard deviation proved to be significant in the analysis of data.


2.5.6.2. Sensitivity coefficients for measurements on a check standard

From measurements on check standards: If the temporal component of the measurement process is evaluated from measurements on a check standard, and there are M days (M = 1 is permissible) of measurements on the test item that are structured in the same manner as the measurements on the check standard, the standard deviation for the reported value is based on the standard deviation of the check standard values, with degrees of freedom v = K - 1 from the K entries in the check standard database.

Standard deviation from check standard measurements: The computation of the standard deviation from the check standard values and its relationship to components of instrument precision and day-to-day variability of the process are explained in the section on two-level nested designs using check standards.

Sensitivity coefficients: The sensitivity coefficients are a1 and a2.

2.5.6.3. Sensitivity coefficients for measurements from a 2-level design

Sensitivity coefficients from a 2-level design: If the temporal components are estimated from a 2-level nested design, and the reported value for a test item is an average over
● N short-term repetitions
● M (M = 1 is permissible) days
of measurements on the test item, the standard deviation for the reported value is a combination of the short-term (repeatability) and day-to-day standard deviations. See the relationships in the section on 2-level nested design for definitions of the standard deviations and their respective degrees of freedom.

Problem with estimating degrees of freedom: If degrees of freedom are required for the uncertainty of the reported value, the formula for the standard deviation of the reported value cannot be used directly and must be rewritten in terms of the standard deviations s1 and s2.

Sensitivity coefficients: Specific sensitivity coefficients, a1 and a2, are shown in the table below for selections of N, M.


Sensitivity coefficients for two components of uncertainty

    Number          Number          Short-term        Day-to-day
    short-term      day-to-day      sensitivity       sensitivity
    N               M               coefficient       coefficient
    ---------------------------------------------------------------
    1               1                                 1
    N               1                                 1
    N               M

2.5.6.4. Sensitivity coefficients for measurements from a 3-level design

Sensitivity coefficients from a 3-level design: If the temporal components are estimated from a 3-level nested design and the reported value is an average over
● N short-term repetitions
● M days
● P runs
of measurements on the test item, the standard deviation for the reported value is a combination of the short-term, day-to-day, and run-to-run standard deviations. See the section on analysis of variability for definitions and relationships among these standard deviations.

Problem with estimating degrees of freedom: If degrees of freedom are required for the uncertainty, the formula for the standard deviation of the reported value cannot be used directly and must be rewritten in terms of the standard deviations s1, s2, and s3.

Sensitivity coefficients: The sensitivity coefficients are a1, a2, and a3. Specific sensitivity coefficients are shown in the table below for selections of N, M, P. In addition, the following constraints must be observed:
    J must be > or = N and K must be > or = M


Sensitivity coefficients for three components of uncertainty

    Number          Number          Number          Short-term      Day-to-day      Run-to-run
    short-term      day-to-day      run-to-run      sensitivity     sensitivity     sensitivity
    N               M               P               coefficient     coefficient     coefficient
    ------------------------------------------------------------------------------------------
    1               1               1                                               1
    N               1               1                                               1
    N               M               1                                               1
    N               M               P

2.5.6.5. Example of uncertainty budget

Example of uncertainty budget for three components of temporal uncertainty: An uncertainty budget that illustrates several principles of uncertainty analysis is shown below. The reported value for a test item is the average of N short-term measurements where the temporal components of uncertainty were estimated from a 3-level nested design with J short-term repetitions over K days. The number of measurements made on the test item is the same as the number of short-term measurements in the design; i.e., N = J. Because there were no repetitions over days or runs on the test item, M = 1; P = 1. The sensitivity coefficients for this design are shown on the foregoing page.

Example of instrument bias: This example also illustrates the case where the measuring instrument is biased relative to the other instruments in the laboratory, with a bias correction applied accordingly. The sensitivity coefficient, given that the bias correction is based on measurements on Q artifacts, is defined as a4 = 1, and the standard deviation, s4, is the standard deviation of the correction.

Example of error budget for type A and type B uncertainties

    Type A components       Sensitivity coefficient    Standard deviation    Degrees of freedom
    1. Repeatability        a1 = 0                     s1                    J - 1
    2. Reproducibility      a2                         s2                    K - 1
    3. Stability            a3 = 1                     s3                    L - 1
    4. Instrument bias      a4 = 1                     s4                    Q - 1


2.5.7. Standard and expanded uncertainties

Definition of standard uncertainty: The sensitivity coefficients and standard deviations are combined by root sum of squares to obtain a 'standard uncertainty'. Given R components, the standard uncertainty is:
    u = sqrt( a1^2 s1^2 + a2^2 s2^2 + ... + aR^2 sR^2 )

Expanded uncertainty assures a high level of confidence: If the purpose of the uncertainty statement is to provide coverage with a high level of confidence, an expanded uncertainty is computed as
    U = k * u
where k is chosen to be the critical value from the t-table with v degrees of freedom. For large degrees of freedom, k = 2 approximates 95% coverage.

Interpretation of uncertainty statement: The expanded uncertainty defined above is assumed to provide a high level of coverage for the unknown true value of the measurement of interest, so that for any measurement result, Y, the interval from Y - U to Y + U is believed to cover the true value.

2.5.7.1. Degrees of freedom

Degrees of freedom for individual components of uncertainty: Degrees of freedom for type A uncertainties are the degrees of freedom for the respective standard deviations. Degrees of freedom for Type B evaluations may be available from published reports or calibration certificates. Special cases where the standard deviation must be estimated from fragmentary data or scientific judgment are assumed to have infinite degrees of freedom; for example,
● Worst-case estimate based on a robustness study or other evidence
● Estimate based on an assumed distribution of possible errors
● Type B uncertainty component for which degrees of freedom are not documented

Degrees of freedom for the standard uncertainty: Degrees of freedom for the standard uncertainty, u, which may be a combination of many standard deviations, is not generally known. This is particularly troublesome if there are large components of uncertainty with small degrees of freedom. In this case, the degrees of freedom is approximated by the Welch-Satterthwaite formula (Brownlee):
    v(eff) = u^4 / SUM over i of [ (ai^2 si^2)^2 / vi ]

Case study -- uncertainty and degrees of freedom: A case study of type A uncertainty analysis shows the computations of temporal components of uncertainty; instrument bias; geometrical bias; standard uncertainty; degrees of freedom; and expanded uncertainty.
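The sketch below (illustrative Python; the budget entries are hypothetical, not Handbook values) combines an uncertainty budget into a standard uncertainty, approximates the effective degrees of freedom with the Welch-Satterthwaite formula, and forms the expanded uncertainty with a t-based coverage factor.

    import numpy as np
    from scipy import stats

    # Hypothetical budget rows: (sensitivity coefficient a, standard deviation s, degrees of freedom v)
    budget = [
        (1.0, 0.0060, 5),    # repeatability
        (1.0, 0.0027, 5),    # reproducibility (days)
        (1.0, 0.0043, 1),    # stability (runs)
        (1.0, 0.0052, 4),    # instrument bias correction
    ]

    a = np.array([row[0] for row in budget])
    s = np.array([row[1] for row in budget])
    v = np.array([row[2] for row in budget])

    # Standard uncertainty: root sum of squares of the weighted components
    u = np.sqrt(np.sum(a**2 * s**2))

    # Effective degrees of freedom from the Welch-Satterthwaite formula
    v_eff = u**4 / np.sum((a**2 * s**2)**2 / v)

    # Expanded uncertainty with coverage factor from the t-table (95 % coverage)
    k = stats.t.ppf(0.975, v_eff)
    print(f"u = {u:.5f}, effective dof = {v_eff:.1f}, k = {k:.2f}, U = {k*u:.5f}")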


2.5.8. Treatment of uncorrected bias

Background: The ISO Guide (ISO) for expressing measurement uncertainties assumes that all biases are corrected and that the uncertainty applies to the corrected result. For measurements at the factory floor level, this approach has several disadvantages. It may not be practical, may be expensive and may not be economically sound to correct for biases that do not impact the commercial value of the product (Turgel and Vecchia).

Reasons for not correcting for bias: Corrections may be expensive to implement if they require modifications to existing software, and "paper and pencil" corrections can be both time consuming and prone to error. In the scientific or metrology laboratory, biases may be documented in certain situations, but the mechanism that causes the bias may not be fully understood, or repeatable, which makes it difficult to argue for correction. In these cases, the best course of action is to report the measurement as taken and adjust the uncertainty to account for the "bias".

The question is how to adjust the uncertainty: A method needs to be developed which assures that the resulting uncertainty has the following properties (Phillips and Eberhardt):
1. The final uncertainty must be greater than or equal to the uncertainty that would be quoted if the bias were corrected.
2. The final uncertainty must reduce to the same uncertainty given that the bias correction is applied.
3. The level of coverage that is achieved by the final uncertainty statement should be at least the level obtained for the case of corrected bias.
4. The method should be transferable so that both the uncertainty and the bias can be used as components of uncertainty in another uncertainty statement.
5. The method should be easy to implement.


2.5.8.1. Computation of revised uncertainty

Definition of the bias and corrected measurement: If the bias is known and the corrected measurement is defined by subtracting the bias from the measured value, the corrected value of Y has the usual expanded uncertainty interval, which is symmetric around the unknown true value for the measurement process.

Definition of asymmetric uncertainty interval to account for uncorrected measurement: If no correction is made for the bias, the uncertainty interval is contaminated by the effect of the bias term, and it can be rewritten in terms of upper and lower endpoints that are asymmetric around the true value.

Conditions on the relationship between the bias and U: The definition above can lead to a negative uncertainty limit; e.g., if the bias is positive and greater than U, the upper endpoint becomes negative. The requirement that the uncertainty limits be greater than or equal to zero for all values of the bias guarantees non-negative uncertainty limits and is accepted at the cost of somewhat wider uncertainty intervals. This leads to a set of restrictions on the uncertainty limits.

Situation where bias is not known exactly but must be estimated: If the bias is not known exactly, its magnitude is estimated from repeated measurements, from sparse data or from theoretical considerations, and the standard deviation is estimated from repeated measurements or from an assumed distribution. The standard deviation of the bias then becomes a component in the uncertainty analysis, with the standard uncertainty restructured to include it and the expanded uncertainty limits adjusted accordingly.

Interpretation: The uncertainty intervals described above have the desirable properties outlined on a previous page. For more information on theory and industrial examples, the reader should consult the paper by the authors of this technique (Phillips and Eberhardt).


2.6. Case studies

Contents: The purpose of this section is to illustrate the planning, procedures, and analyses outlined in the various sections of this chapter with data taken from measurement processes at the National Institute of Standards and Technology. A secondary goal is to give the reader an opportunity to run the analyses in real-time using the software package, Dataplot.
1. Gauge study of resistivity probes
2. Check standard study for resistivity measurements
3. Type A uncertainty analysis
4. Type B uncertainty analysis and propagation of error

2.6.1. Gauge study of resistivity probes

Purpose: The purpose of this case study is to outline the analysis of a gauge study that was undertaken to identify the sources of uncertainty in resistivity measurements of silicon wafers.

Outline:
1. Background and data
2. Analysis and interpretation
3. Graphs showing repeatability standard deviations
4. Graphs showing day-to-day variability
5. Graphs showing differences among gauges
6. Run this example yourself with Dataplot
7. Dataplot macros


2.6.1.1. Background and data

Description of measurements: Measurements of resistivity on 100 ohm.cm wafers were made according to an ASTM Standard Test Method (ASTM F84) to assess the sources of uncertainty in the measurement system. Resistivity measurements have been studied over the years, and it is clear from those data that there are sources of variability affecting the process beyond the basic imprecision of the gauges. Changes in measurement results have been noted over days and over months, and the data in this study are structured to quantify these time-dependent changes in the measurement process.

Gauges: The gauges for the study were five probes used to measure resistivity of silicon wafers. The five gauges are assumed to represent a random sample of typical 4-point gauges for making resistivity measurements. There is a question of whether or not the gauges are essentially equivalent or whether biases among them are possible.

Check standards: The check standards for the study were five wafers selected at random from the batch of 100 ohm.cm wafers.

Operators: The effect of operator was not considered to be significant for this study.

Database of measurements: The 3-level nested design consisted of:
● J = 6 measurements at the center of each wafer per day
● K = 6 days
● L = 2 runs
To characterize the probes and the influence of wafers on the measurements, the design was repeated over:
● Q = 5 wafers (check standards 138, 139, 140, 141, 142)
● I = 5 probes (1, 281, 283, 2062, 2362)
The runs were separated by about one month in time. The J = 6 measurements at the center of each wafer are reduced to an average and repeatability standard deviation and recorded in a database with identifications for wafer, probe, and day.



1 138. 283. 3. 17. 1. 23.08 95.1600 0.0998


1 138. 283. 3. 18. 1. 23.13 95.0818 0.1108
1 138. 283. 3. 21. 1. 23.28 95.1620 0.0408
1 138. 283. 3. 22. 1. 23.36 95.1735 0.0501
1 138. 283. 3. 24. 2. 22.97 95.1932 0.0287
2. Measurement Process Characterization
1 138. 2062. 3. 16. 1. 22.97 95.1311 0.1066
2.6. Case studies
2.6.1. Gauge study of resistivity probes
1 138. 2062. 3. 17. 1. 22.98 95.1132 0.0415
2.6.1.1. Background and data 1 138. 2062. 3. 18. 1. 23.16 95.0432 0.0491
1 138. 2062. 3. 21. 1. 23.16 95.1254 0.0603
1 138. 2062. 3. 22. 1. 23.28 95.1322 0.0561
2.6.1.1.1. Database of resistivity measurements 1 138. 2062. 3. 24. 2. 23.19 95.1299 0.0349
1 138. 2362. 3. 15. 1. 23.08 95.1162 0.0480
1 138. 2362. 3. 17. 1. 23.01 95.0569 0.0577
The check standards are Measurements of resistivity (ohm.cm) were made according to an ASTM 1 138. 2362. 3. 18. 1. 22.97 95.0598 0.0516
five wafers chosen at Standard Test Method (F4) at NIST to assess the sources of uncertainty in 1 138. 2362. 3. 22. 1. 23.23 95.1487 0.0386
random from a batch of the measurement system. The gauges for the study were five probes owned 1 138. 2362. 3. 23. 2. 23.28 95.0743 0.0256
wafers by NIST; the check standards for the study were five wafers selected at 1 138. 2362. 3. 24. 2. 23.10 95.1010 0.0420
random from a batch of wafers cut from one silicon crystal doped with 1 139. 1. 3. 15. 1. 23.01 99.3528 0.1424
phosphorous to give a nominal resistivity of 100 ohm.cm. 1 139. 1. 3. 17. 1. 23.00 99.2940 0.0660
1 139. 1. 3. 17. 1. 23.01 99.2340 0.1179
Measurements on the The effect of operator was not considered to be significant for this study; 1 139. 1. 3. 21. 1. 23.20 99.3489 0.0506
check standards are therefore, 'day' replaces 'operator' as a factor in the nested design. Averages 1 139. 1. 3. 23. 2. 23.22 99.2625 0.1111
used to estimate and standard deviations from J = 6 measurements at the center of each wafer 1 139. 1. 3. 23. 1. 23.22 99.3787 0.1103
repeatability, day effect, are shown in the table. 1 139. 281. 3. 16. 1. 22.95 99.3244 0.1134
and run effect ● J = 6 measurements at the center of the wafer per day 1 139. 281. 3. 17. 1. 22.98 99.3378 0.0949
● K = 6 days (one operator) per repetition
1 139. 281. 3. 18. 1. 22.86 99.3424 0.0847
1 139. 281. 3. 22. 1. 23.17 99.4033 0.0801
● L = 2 runs (complete)
1 139. 281. 3. 23. 2. 23.10 99.3717 0.0630
● Q = 5 wafers (check standards 138, 139, 140, 141, 142) 1 139. 281. 3. 23. 1. 23.14 99.3493 0.1157
● R = 5 probes (1, 281, 283, 2062, 2362) 1 139. 283. 3. 16. 1. 22.94 99.3065 0.0381
1 139. 283. 3. 17. 1. 23.09 99.3280 0.1153
1 139. 283. 3. 18. 1. 23.11 99.3000 0.0818
1 139. 283. 3. 21. 1. 23.25 99.3347 0.0972
Run Wafer Probe Month Day Op Temp Average Std Dev 1 139. 283. 3. 22. 1. 23.36 99.3929 0.1189
1 139. 283. 3. 23. 1. 23.18 99.2644 0.0622
1 138. 1. 3. 15. 1. 22.98 95.1772 0.1191 1 139. 2062. 3. 16. 1. 22.94 99.3324 0.1531
1 138. 1. 3. 17. 1. 23.02 95.1567 0.0183 1 139. 2062. 3. 17. 1. 23.08 99.3254 0.0543
1 138. 1. 3. 18. 1. 22.79 95.1937 0.1282 1 139. 2062. 3. 18. 1. 23.15 99.2555 0.1024
1 138. 1. 3. 21. 1. 23.17 95.1959 0.0398 1 139. 2062. 3. 18. 1. 23.18 99.1946 0.0851
1 138. 1. 3. 23. 2. 23.25 95.1442 0.0346 1 139. 2062. 3. 22. 1. 23.27 99.3542 0.1227
1 138. 1. 3. 23. 1. 23.20 95.0610 0.1539 1 139. 2062. 3. 24. 2. 23.23 99.2365 0.1218
1 138. 281. 3. 16. 1. 22.99 95.1591 0.0963 1 139. 2362. 3. 15. 1. 23.08 99.2939 0.0818
1 138. 281. 3. 17. 1. 22.97 95.1195 0.0606 1 139. 2362. 3. 17. 1. 23.02 99.3234 0.0723
1 138. 281. 3. 18. 1. 22.83 95.1065 0.0842 1 139. 2362. 3. 18. 1. 22.93 99.2748 0.0756
1 138. 281. 3. 21. 1. 23.28 95.0925 0.0973 1 139. 2362. 3. 22. 1. 23.29 99.3512 0.0475
1 138. 281. 3. 23. 2. 23.14 95.1990 0.1062 1 139. 2362. 3. 23. 2. 23.25 99.2350 0.0517
1 138. 281. 3. 23. 1. 23.16 95.1682 0.1090 1 139. 2362. 3. 24. 2. 23.05 99.3574 0.0485
1 138. 283. 3. 16. 1. 22.95 95.1252 0.0531 1 140. 1. 3. 15. 1. 23.07 96.1334 0.1052
1 140. 1. 3. 17. 1. 23.08 96.1250 0.0916




1 140. 1. 3. 18. 1. 22.77 96.0665 0.0836 1 141. 2062. 3. 18. 1. 23.19 100.9650 0.0700
1 140. 1. 3. 21. 1. 23.18 96.0725 0.0620 1 141. 2062. 3. 18. 1. 23.18 101.0319 0.1070
1 140. 1. 3. 23. 2. 23.20 96.1006 0.0582 1 141. 2062. 3. 22. 1. 23.34 101.0849 0.0960
1 140. 1. 3. 23. 1. 23.21 96.1131 0.1757 1 141. 2062. 3. 24. 2. 23.21 101.1302 0.0505
1 140. 281. 3. 16. 1. 22.94 96.0467 0.0565 1 141. 2362. 3. 15. 1. 23.08 101.0471 0.0320
1 140. 281. 3. 17. 1. 22.99 96.1081 0.1293 1 141. 2362. 3. 17. 1. 23.01 101.0224 0.1020
1 140. 281. 3. 18. 1. 22.91 96.0578 0.1148 1 141. 2362. 3. 18. 1. 23.05 101.0702 0.0580
1 140. 281. 3. 22. 1. 23.15 96.0700 0.0495 1 141. 2362. 3. 22. 1. 23.22 101.0904 0.1049
1 140. 281. 3. 22. 1. 23.33 96.1052 0.1722 1 141. 2362. 3. 23. 2. 23.29 101.0626 0.0702
1 140. 281. 3. 23. 1. 23.19 96.0952 0.1786 1 141. 2362. 3. 24. 2. 23.15 101.0686 0.0661
1 140. 283. 3. 16. 1. 22.89 96.0650 0.1301 1 142. 1. 3. 15. 1. 23.02 94.3160 0.1372
1 140. 283. 3. 17. 1. 23.07 96.0870 0.0881 1 142. 1. 3. 17. 1. 23.04 94.2808 0.0999
1 140. 283. 3. 18. 1. 23.07 95.8906 0.1842 1 142. 1. 3. 18. 1. 22.73 94.2478 0.0803
1 140. 283. 3. 21. 1. 23.24 96.0842 0.1008 1 142. 1. 3. 21. 1. 23.19 94.2862 0.0700
1 140. 283. 3. 22. 1. 23.34 96.0189 0.0865 1 142. 1. 3. 23. 2. 23.25 94.1859 0.0899
1 140. 283. 3. 23. 1. 23.19 96.1047 0.0923 1 142. 1. 3. 23. 1. 23.21 94.2389 0.0686
1 140. 2062. 3. 16. 1. 22.95 96.0379 0.2190 1 142. 281. 3. 16. 1. 22.98 94.2640 0.0862
1 140. 2062. 3. 17. 1. 22.97 96.0671 0.0991 1 142. 281. 3. 17. 1. 23.00 94.3333 0.1330
1 140. 2062. 3. 18. 1. 23.15 96.0206 0.0648 1 142. 281. 3. 18. 1. 22.88 94.2994 0.0908
1 140. 2062. 3. 21. 1. 23.14 96.0207 0.1410 1 142. 281. 3. 21. 1. 23.28 94.2873 0.0846
1 140. 2062. 3. 22. 1. 23.32 96.0587 0.1634 1 142. 281. 3. 23. 2. 23.07 94.2576 0.0795
1 140. 2062. 3. 24. 2. 23.17 96.0903 0.0406 1 142. 281. 3. 23. 1. 23.12 94.3027 0.0389
1 140. 2362. 3. 15. 1. 23.08 96.0771 0.1024 1 142. 283. 3. 16. 1. 22.92 94.2846 0.1021
1 140. 2362. 3. 17. 1. 23.00 95.9976 0.0943 1 142. 283. 3. 17. 1. 23.08 94.2197 0.0627
1 140. 2362. 3. 18. 1. 23.01 96.0148 0.0622 1 142. 283. 3. 18. 1. 23.09 94.2119 0.0785
1 140. 2362. 3. 22. 1. 23.27 96.0397 0.0702 1 142. 283. 3. 21. 1. 23.29 94.2536 0.0712
1 140. 2362. 3. 23. 2. 23.24 96.0407 0.0627 1 142. 283. 3. 22. 1. 23.34 94.2280 0.0692
1 140. 2362. 3. 24. 2. 23.13 96.0445 0.0622 1 142. 283. 3. 24. 2. 22.92 94.2944 0.0958
1 141. 1. 3. 15. 1. 23.01 101.2124 0.0900 1 142. 2062. 3. 16. 1. 22.96 94.2238 0.0492
1 141. 1. 3. 17. 1. 23.08 101.1018 0.0820 1 142. 2062. 3. 17. 1. 22.95 94.3061 0.2194
1 141. 1. 3. 18. 1. 22.75 101.1119 0.0500 1 142. 2062. 3. 18. 1. 23.16 94.1868 0.0474
1 141. 1. 3. 21. 1. 23.21 101.1072 0.0641 1 142. 2062. 3. 21. 1. 23.11 94.2645 0.0697
1 141. 1. 3. 23. 2. 23.25 101.0802 0.0704 1 142. 2062. 3. 22. 1. 23.31 94.3101 0.0532
1 141. 1. 3. 23. 1. 23.19 101.1350 0.0699 1 142. 2062. 3. 24. 2. 23.24 94.2204 0.1023
1 141. 281. 3. 16. 1. 22.93 101.0287 0.0520 1 142. 2362. 3. 15. 1. 23.08 94.2437 0.0503
1 141. 281. 3. 17. 1. 23.00 101.0131 0.0710 1 142. 2362. 3. 17. 1. 23.00 94.2115 0.0919
1 141. 281. 3. 18. 1. 22.90 101.1329 0.0800 1 142. 2362. 3. 18. 1. 22.99 94.2348 0.0282
1 141. 281. 3. 22. 1. 23.19 101.0562 0.1594 1 142. 2362. 3. 22. 1. 23.26 94.2124 0.0513
1 141. 281. 3. 23. 2. 23.18 101.0891 0.1252 1 142. 2362. 3. 23. 2. 23.27 94.2214 0.0627
1 141. 281. 3. 23. 1. 23.17 101.1283 0.1151 1 142. 2362. 3. 24. 2. 23.08 94.1651 0.1010
1 141. 283. 3. 16. 1. 22.85 101.1597 0.0990 2 138. 1. 4. 13. 1. 23.12 95.1996 0.0645
1 141. 283. 3. 17. 1. 23.09 101.0784 0.0810 2 138. 1. 4. 15. 1. 22.73 95.1315 0.1192
1 141. 283. 3. 18. 1. 23.08 101.0715 0.0460 2 138. 1. 4. 18. 2. 22.76 95.1845 0.0452
1 141. 283. 3. 21. 1. 23.27 101.0910 0.0880 2 138. 1. 4. 19. 1. 22.73 95.1359 0.1498
1 141. 283. 3. 22. 1. 23.34 101.0967 0.0901 2 138. 1. 4. 20. 2. 22.73 95.1435 0.0629
1 141. 283. 3. 24. 2. 23.00 101.1627 0.0888 2 138. 1. 4. 21. 2. 22.93 95.1839 0.0563
1 141. 2062. 3. 16. 1. 22.97 101.1077 0.0970 2 138. 281. 4. 14. 2. 22.46 95.2106 0.1049
1 141. 2062. 3. 17. 1. 22.96 101.0245 0.1210 2 138. 281. 4. 18. 2. 22.80 95.2505 0.0771
2 138. 281. 4. 18. 2. 22.77 95.2648 0.1046




2 138. 281. 4. 20. 2. 22.80 95.2197 0.1779 2 139. 2362. 4. 19. 1. 22.74 99.2991 0.0903
2 138. 281. 4. 20. 2. 22.87 95.2003 0.1376 2 139. 2362. 4. 20. 2. 22.88 99.3049 0.0783
2 138. 281. 4. 21. 2. 22.95 95.0982 0.1611 2 139. 2362. 4. 21. 2. 22.94 99.2782 0.0718
2 138. 283. 4. 18. 2. 22.83 95.1211 0.0794 2 140. 1. 4. 13. 1. 23.10 96.0811 0.0463
2 138. 283. 4. 13. 1. 23.17 95.1327 0.0409 2 140. 1. 4. 15. 2. 22.75 96.1460 0.0725
2 138. 283. 4. 18. 1. 22.67 95.2053 0.1525 2 140. 1. 4. 18. 2. 22.78 96.1582 0.1428
2 138. 283. 4. 19. 2. 23.00 95.1292 0.0655 2 140. 1. 4. 19. 1. 22.70 96.1039 0.1056
2 138. 283. 4. 21. 2. 22.91 95.1669 0.0619 2 140. 1. 4. 20. 2. 22.75 96.1262 0.0672
2 138. 283. 4. 21. 2. 22.96 95.1401 0.0831 2 140. 1. 4. 21. 2. 22.93 96.1478 0.0562
2 138. 2062. 4. 15. 1. 22.64 95.2479 0.2867 2 140. 281. 4. 15. 2. 22.71 96.1153 0.1097
2 138. 2062. 4. 15. 1. 22.67 95.2224 0.1945 2 140. 281. 4. 14. 2. 22.49 96.1297 0.1202
2 138. 2062. 4. 19. 2. 22.99 95.2810 0.1960 2 140. 281. 4. 18. 2. 22.81 96.1233 0.1331
2 138. 2062. 4. 19. 1. 22.75 95.1869 0.1571 2 140. 281. 4. 20. 2. 22.78 96.1731 0.1484
2 138. 2062. 4. 21. 2. 22.84 95.3053 0.2012 2 140. 281. 4. 20. 2. 22.89 96.0872 0.0857
2 138. 2062. 4. 21. 2. 22.92 95.1432 0.1532 2 140. 281. 4. 21. 2. 22.91 96.1331 0.0944
2 138. 2362. 4. 12. 1. 22.74 95.1687 0.0785 2 140. 283. 4. 13. 2. 23.22 96.1135 0.0983
2 138. 2362. 4. 18. 2. 22.75 95.1564 0.0430 2 140. 283. 4. 18. 2. 22.85 96.1111 0.1210
2 138. 2362. 4. 19. 2. 22.88 95.1354 0.0983 2 140. 283. 4. 18. 2. 22.78 96.1221 0.0644
2 138. 2362. 4. 19. 1. 22.73 95.0422 0.0773 2 140. 283. 4. 19. 2. 23.01 96.1063 0.0921
2 138. 2362. 4. 20. 2. 22.86 95.1354 0.0587 2 140. 283. 4. 21. 2. 22.91 96.1155 0.0704
2 138. 2362. 4. 21. 2. 22.94 95.1075 0.0776 2 140. 283. 4. 21. 2. 22.94 96.1308 0.0258
2 139. 1. 4. 13. 2. 23.14 99.3274 0.0220 2 140. 2062. 4. 15. 2. 22.60 95.9767 0.2225
2 139. 1. 4. 15. 2. 22.77 99.5020 0.0997 2 140. 2062. 4. 15. 2. 22.66 96.1277 0.1792
2 139. 1. 4. 18. 2. 22.80 99.4016 0.0704 2 140. 2062. 4. 19. 2. 22.96 96.1858 0.1312
2 139. 1. 4. 19. 1. 22.68 99.3181 0.1245 2 140. 2062. 4. 19. 1. 22.75 96.1912 0.1936
2 139. 1. 4. 20. 2. 22.78 99.3858 0.0903 2 140. 2062. 4. 21. 2. 22.82 96.1650 0.1902
2 139. 1. 4. 21. 2. 22.93 99.3141 0.0255 2 140. 2062. 4. 21. 2. 22.92 96.1603 0.1777
2 139. 281. 4. 14. 2. 23.05 99.2915 0.0859 2 140. 2362. 4. 12. 1. 22.88 96.0793 0.0996
2 139. 281. 4. 15. 2. 22.71 99.4032 0.1322 2 140. 2362. 4. 18. 2. 22.76 96.1115 0.0533
2 139. 281. 4. 18. 2. 22.79 99.4612 0.1765 2 140. 2362. 4. 19. 2. 22.79 96.0803 0.0364
2 139. 281. 4. 20. 2. 22.74 99.4001 0.0889 2 140. 2362. 4. 19. 1. 22.71 96.0411 0.0768
2 139. 281. 4. 20. 2. 22.91 99.3765 0.1041 2 140. 2362. 4. 20. 2. 22.84 96.0988 0.1042
2 139. 281. 4. 21. 2. 22.92 99.3507 0.0717 2 140. 2362. 4. 21. 1. 22.94 96.0482 0.0868
2 139. 283. 4. 13. 2. 23.11 99.3848 0.0792 2 141. 1. 4. 13. 1. 23.07 101.1984 0.0803
2 139. 283. 4. 18. 2. 22.84 99.4952 0.1122 2 141. 1. 4. 15. 2. 22.72 101.1645 0.0914
2 139. 283. 4. 18. 2. 22.76 99.3220 0.0915 2 141. 1. 4. 18. 2. 22.75 101.2454 0.1109
2 139. 283. 4. 19. 2. 23.03 99.4165 0.0503 2 141. 1. 4. 19. 1. 22.69 101.1096 0.1376
2 139. 283. 4. 21. 2. 22.87 99.3791 0.1138 2 141. 1. 4. 20. 2. 22.83 101.2066 0.0717
2 139. 283. 4. 21. 2. 22.98 99.3985 0.0661 2 141. 1. 4. 21. 2. 22.93 101.0645 0.1205
2 139. 2062. 4. 14. 2. 22.43 99.4283 0.0891 2 141. 281. 4. 15. 2. 22.72 101.1615 0.1272
2 139. 2062. 4. 15. 2. 22.70 99.4139 0.2147 2 141. 281. 4. 14. 2. 22.40 101.1650 0.0595
2 139. 2062. 4. 19. 2. 22.97 99.3813 0.1143 2 141. 281. 4. 18. 2. 22.78 101.1815 0.1393
2 139. 2062. 4. 19. 1. 22.77 99.4314 0.1685 2 141. 281. 4. 20. 2. 22.73 101.1106 0.1189
2 139. 2062. 4. 21. 2. 22.79 99.4166 0.2080 2 141. 281. 4. 20. 2. 22.86 101.1420 0.0713
2 139. 2062. 4. 21. 2. 22.94 99.4052 0.2400 2 141. 281. 4. 21. 2. 22.94 101.0116 0.1088
2 139. 2362. 4. 12. 1. 22.82 99.3408 0.1279 2 141. 283. 4. 13. 2. 23.26 101.1554 0.0429
2 139. 2362. 4. 18. 2. 22.77 99.3116 0.1131 2 141. 283. 4. 18. 2. 22.85 101.1267 0.0751
2 139. 2362. 4. 19. 2. 22.82 99.3241 0.0519 2 141. 283. 4. 18. 2. 22.76 101.1227 0.0826
2 141. 283. 4. 19. 2. 22.82 101.0635 0.1715


2 141. 283. 4. 21. 2. 22.89 101.1264 0.1447


2 141. 283. 4. 21. 2. 22.96 101.0853 0.1189
2 141. 2062. 4. 15. 2. 22.65 101.1332 0.2532
2 141. 2062. 4. 15. 1. 22.68 101.1487 0.1413
2 141. 2062. 4. 19. 2. 22.95 101.1778 0.1772
2 141. 2062. 4. 19. 1. 22.77 101.0988 0.0884
2 141. 2062. 4. 21. 2. 22.87 101.1686 0.2940
2 141. 2062. 4. 21. 2. 22.94 101.3289 0.2072
2 141. 2362. 4. 12. 1. 22.83 101.1353 0.0585
2 141. 2362. 4. 18. 2. 22.83 101.1201 0.0868
2 141. 2362. 4. 19. 2. 22.91 101.0946 0.0855
2 141. 2362. 4. 19. 1. 22.71 100.9977 0.0645
2 141. 2362. 4. 20. 2. 22.87 101.0963 0.0638
2 141. 2362. 4. 21. 2. 22.94 101.0300 0.0549
2 142. 1. 4. 13. 1. 23.07 94.3049 0.1197
2 142. 1. 4. 15. 2. 22.73 94.3153 0.0566
2 142. 1. 4. 18. 2. 22.77 94.3073 0.0875
2 142. 1. 4. 19. 1. 22.67 94.2803 0.0376
2 142. 1. 4. 20. 2. 22.80 94.3008 0.0703
2 142. 1. 4. 21. 2. 22.93 94.2916 0.0604
2 142. 281. 4. 14. 2. 22.90 94.2557 0.0619
2 142. 281. 4. 18. 2. 22.83 94.3542 0.1027
2 142. 281. 4. 18. 2. 22.80 94.3007 0.1492
2 142. 281. 4. 20. 2. 22.76 94.3351 0.1059
2 142. 281. 4. 20. 2. 22.88 94.3406 0.1508
2 142. 281. 4. 21. 2. 22.92 94.2621 0.0946
2 142. 283. 4. 13. 2. 23.25 94.3124 0.0534
2 142. 283. 4. 18. 2. 22.85 94.3680 0.1643
2 142. 283. 4. 18. 1. 22.67 94.3442 0.0346
2 142. 283. 4. 19. 2. 22.80 94.3391 0.0616
2 142. 283. 4. 21. 2. 22.91 94.2238 0.0721
2 142. 283. 4. 21. 2. 22.95 94.2721 0.0998
2 142. 2062. 4. 14. 2. 22.49 94.2915 0.2189
2 142. 2062. 4. 15. 2. 22.69 94.2803 0.0690
2 142. 2062. 4. 19. 2. 22.94 94.2818 0.0987
2 142. 2062. 4. 19. 1. 22.76 94.2227 0.2628
2 142. 2062. 4. 21. 2. 22.74 94.4109 0.1230
2 142. 2062. 4. 21. 2. 22.94 94.2616 0.0929
2 142. 2362. 4. 12. 1. 22.86 94.2052 0.0813
2 142. 2362. 4. 18. 2. 22.83 94.2824 0.0605
2 142. 2362. 4. 19. 2. 22.85 94.2396 0.0882
2 142. 2362. 4. 19. 1. 22.75 94.2087 0.0702
2 142. 2362. 4. 20. 2. 22.86 94.2937 0.0591
2 142. 2362. 4. 21. 1. 22.93 94.2330 0.0556

2. Measurement Process Characterization
2.6. Case studies
2.6.1. Gauge study of resistivity probes

2.6.1.2. Analysis and interpretation

Graphs of probe effect on repeatability
A graphical analysis shows repeatability standard deviations plotted by wafer and probe. Probes are coded by numbers with probe #2362 coded as #5. The plots show that for both runs the precision of this probe is better than for the other probes.

Probe #2362, because of its superior precision, was chosen as the tool for measuring all 100 ohm.cm resistivity wafers at NIST. Therefore, the remainder of the analysis focuses on this probe.

Plot of repeatability standard deviations for probe #2362 from the nested design over days, wafers, runs
The precision of probe #2362 is first checked for consistency by plotting the repeatability standard deviations over days, wafers and runs. Days are coded by letter. The plots verify that, for both runs, probe repeatability is not dependent on wafers or days although the standard deviations on days D, E, and F of run 2 are larger in some instances than for the other days. This is not surprising because repeated probing on the wafer surfaces can cause slight degradation. Then the repeatability standard deviations are pooled over:
● K = 6 days for K(J - 1) = 30 degrees of freedom
● L = 2 runs for LK(J - 1) = 60 degrees of freedom
● Q = 5 wafers for QLK(J - 1) = 300 degrees of freedom
The results of pooling are shown below. Intermediate steps are not shown, but the section on repeatability standard deviations shows an example of pooling over wafers.

Pooled level-1 standard deviations (ohm.cm)

 Probe    Run 1    DF    Run 2    DF    Pooled    DF
 2362.   0.0658   150   0.0758   150    0.0710   300
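The pooling used throughout this case study is a degrees-of-freedom-weighted root mean square of the individual standard deviations. The short sketch below is an illustration only, written in Python with NumPy rather than the handbook's Dataplot; the example values are the two run-level entries for probe #2362 from the table above.

import numpy as np

def pool_std_devs(s, df):
    # Pooled standard deviation: sqrt( sum(df_i * s_i**2) / sum(df_i) ),
    # carrying sum(df_i) degrees of freedom.
    s = np.asarray(s, dtype=float)
    df = np.asarray(df, dtype=float)
    return float(np.sqrt(np.sum(df * s**2) / np.sum(df))), float(np.sum(df))

# Run-level values for probe #2362 from the table above
s_pooled, df_total = pool_std_devs([0.0658, 0.0758], [150, 150])
print(round(s_pooled, 4), int(df_total))   # approximately 0.0710 with 300 degrees of freedom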

Graphs of reproducibility and stability for probe #2362
Averages of the 6 center measurements on each wafer are plotted on a single graph for each wafer. The points (connected by lines) on the left side of each graph are averages at the wafer center plotted over 5 days; the points on the right are the same measurements repeated after one month as a check on the stability of the measurement process. The plots show day-to-day variability as well as slight variability from run-to-run.

Earlier work discounts long-term drift in the gauge as the cause of these changes. A reasonable conclusion is that day-to-day and run-to-run variations come from random fluctuations in the measurement process.

Level-2 (reproducibility) standard deviations computed from day averages and pooled over wafers and runs
Level-2 standard deviations (with K - 1 = 5 degrees of freedom each) are computed from the daily averages that are recorded in the database. Then the level-2 standard deviations are pooled over:
● L = 2 runs for L(K - 1) = 10 degrees of freedom
● Q = 5 wafers for QL(K - 1) = 50 degrees of freedom
as shown in the table below. The table shows that the level-2 standard deviations are consistent over wafers and runs.

Level-2 standard deviations (ohm.cm) for 5 wafers

                          Run 1                      Run 2
 Wafer   Probe    Average   Stddev   DF      Average   Stddev   DF
  138.   2362.    95.0928   0.0359    5      95.1243   0.0453    5
  139.   2362.    99.3060   0.0472    5      99.3098   0.0215    5
  140.   2362.    96.0357   0.0273    5      96.0765   0.0276    5
  141.   2362.   101.0602   0.0232    5     101.0790   0.0537    5
  142.   2362.    94.2148   0.0274    5      94.2438   0.0370    5

         2362.    Pooled    0.0333   25                0.0388   25
                          (over 2 runs)                0.0362   50

Level-3 (stability) standard deviations computed from run averages and pooled over wafers
Level-3 standard deviations are computed from the averages of the two runs. Then the level-3 standard deviations are pooled over the five wafers to obtain a standard deviation with 5 degrees of freedom as shown in the table below.

Level-3 standard deviations (ohm.cm) for 5 wafers

                    Run 1       Run 2
 Wafer   Probe    Average     Average       Diff    Stddev   DF
  138.   2362.    95.0928     95.1243    -0.0315    0.0223    1
  139.   2362.    99.3060     99.3098    -0.0038    0.0027    1
  140.   2362.    96.0357     96.0765    -0.0408    0.0289    1
  141.   2362.   101.0602    101.0790    -0.0188    0.0133    1
  142.   2362.    94.2148     94.2438    -0.0290    0.0205    1

         2362.    Pooled                             0.0197    5

Graphs of probe biases
A graphical analysis shows the relative biases among the 5 probes. For each wafer, differences from the wafer average by probe are plotted versus wafer number. The graphs verify that probe #2362 (coded as 5) is biased low relative to the other probes. The bias shows up more strongly after the probes have been in use (run 2).

Formulas for computation of biases for probe #2362
Biases by probe are shown in the following table.

Differences from the mean for each wafer

 Wafer   Probe     Run 1     Run 2
  138.      1.    0.0248   -0.0119
  138.    281.    0.0108    0.0323
  138.    283.    0.0193   -0.0258
  138.   2062.   -0.0175    0.0561
  138.   2362.   -0.0372   -0.0507
  139.      1.   -0.0036   -0.0007
  139.    281.    0.0394    0.0050
  139.    283.    0.0057    0.0239
  139.   2062.   -0.0323    0.0373
  139.   2362.   -0.0094   -0.0657
  140.      1.    0.0400    0.0109
  140.    281.    0.0187    0.0106
  140.    283.   -0.0201    0.0003
  140.   2062.   -0.0126    0.0182
  140.   2362.   -0.0261   -0.0398
  141.      1.    0.0394    0.0324
  141.    281.   -0.0107   -0.0037
  141.    283.    0.0246   -0.0191
  141.   2062.   -0.0280    0.0436
  141.   2362.   -0.0252   -0.0534
  142.      1.    0.0062    0.0093
  142.    281.    0.0376    0.0174
  142.    283.   -0.0044    0.0192
  142.   2062.   -0.0011    0.0008
  142.   2362.   -0.0383   -0.0469

How to deal with bias due to the probe
Probe #2362 was chosen for the certification process because of its superior precision, but its bias relative to the other probes creates a problem. There are two possibilities for handling this problem:
1. Correct all measurements made with probe #2362 to the average of the probes.
2. Include the standard deviation for the difference among probes in the uncertainty budget.
The better choice is (1) if we can assume that the probes in the study represent a random sample of probes of this type. This is particularly true when the unit (resistivity) is defined by a test method.

2. Measurement Process Characterization
2.6. Case studies
2.6.1. Gauge study of resistivity probes

2.6.1.3. Repeatability standard deviations

[Figure: Run 1 - Graph of repeatability standard deviations for probe #2362 -- 6 days and 5 wafers -- showing that repeatability is constant across wafers and days]

[Figure: Run 2 - Graph of repeatability standard deviations for probe #2362 -- 6 days and 5 wafers -- showing that repeatability is constant across wafers and days]

Symbols for codes: 1 = #1; 2 = #281; 3 = #283; 4 = #2062; 5 = #2362

[Figure: Run 1 - Graph showing repeatability standard deviations for five probes as a function of wafers and probes]

[Figure: Run 2 - Graph showing repeatability standard deviations for 5 probes as a function of wafers and probes]

Symbols for probes: 1 = #1; 2 = #281; 3 = #283; 4 = #2062; 5 = #2362

2. Measurement Process Characterization
2.6. Case studies
2.6.1. Gauge study of resistivity probes

2.6.1.4. Effects of days and long-term stability

Effects of days and long-term stability on the measurements
The data points that are plotted in the five graphs shown below are averages of resistivity measurements at the center of each wafer for wafers #138, 139, 140, 141, 142. Data for each of two runs are shown on each graph. The six days of measurements for each run are separated by approximately one month and show, with the exception of wafer #139, that there is a very slight shift upwards between run 1 and run 2. The size of the effect is estimated as a level-3 standard deviation in the analysis of the data.

[Figure: Wafer 138]
[Figure: Wafer 139]
[Figure: Wafer 140]
[Figure: Wafer 141]
[Figure: Wafer 142]
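The run-to-run shift described above is the level-3 standard deviation tabulated in Section 2.6.1.2. As an illustration only (Python with NumPy, not part of the handbook's Dataplot materials), the sketch below reproduces those tabulated values from the run averages: the standard deviation of two run averages reduces to |difference|/sqrt(2), and the five wafer values are then pooled.

import numpy as np

run1 = np.array([95.0928, 99.3060, 96.0357, 101.0602, 94.2148])   # run 1 wafer averages
run2 = np.array([95.1243, 99.3098, 96.0765, 101.0790, 94.2438])   # run 2 wafer averages

# Level-3 standard deviation per wafer: std dev of the two run averages = |diff| / sqrt(2), 1 df each
s3 = np.abs(run1 - run2) / np.sqrt(2.0)

# Pool over the Q = 5 wafers (1 df each) to obtain the value with 5 degrees of freedom
s3_pooled = np.sqrt(np.mean(s3**2))

print(np.round(s3, 4))       # [0.0223 0.0027 0.0289 0.0133 0.0205]
print(round(s3_pooled, 4))   # 0.0197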


2. Measurement Process Characterization
2.6. Case studies
2.6.1. Gauge study of resistivity probes

2.6.1.5. Differences among 5 probes

[Figure: Run 1 - Graph of differences from wafer averages for each of 5 probes, showing that probes #2062 and #2362 are biased low relative to the other probes]

Symbols for probes: 1 = #1; 2 = #281; 3 = #283; 4 = #2062; 5 = #2362

[Figure: Run 2 - Graph of differences from wafer averages for each of 5 probes, showing that probe #2362 continues to be biased low relative to the other probes]

Symbols for probes: 1 = #1; 2 = #281; 3 = #283; 4 = #2062; 5 = #2362
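The differences plotted in these graphs are probe averages minus the wafer average within each run. A minimal sketch of that computation is given below in Python with pandas; it is an illustration only, and the DataFrame layout and column names (run, wafer, probe, avg) are assumptions chosen to mirror the variables read by the Dataplot macros, not part of the handbook.

import pandas as pd

def probe_biases(df):
    # df has one row per (run, wafer, probe, day) with the daily average in column "avg".
    # Average over days for each run/wafer/probe cell.
    cell = df.groupby(["run", "wafer", "probe"], as_index=False)["avg"].mean()
    # Wafer average over the 5 probes, within each run.
    cell["wafer_avg"] = cell.groupby(["run", "wafer"])["avg"].transform("mean")
    # Difference from the wafer average = relative bias of each probe.
    cell["bias"] = cell["avg"] - cell["wafer_avg"]
    return cell.pivot_table(index=["wafer", "probe"], columns="run", values="bias")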


2. Measurement Process Characterization
2.6. Case studies
2.6.1. Gauge study of resistivity probes

2.6.1.6. Run gauge study example using Dataplot

View of Dataplot macros for this case study
This page allows you to repeat the analysis outlined in the case study description on the previous page using Dataplot. It is required that you have already downloaded and installed Dataplot and configured your browser to run Dataplot. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output Window, the Graphics window, the Command History window, and the data sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps / Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step. The links in the Results and Conclusions column will connect you with more detailed information about each analysis step from the case study description.

Graphical analyses of variability. Graphs to test for:
1. Wafer/day effect on repeatability (run 1)
2. Wafer/day effect on repeatability (run 2)
3. Probe effect on repeatability (run 1)
4. Probe effect on repeatability (run 2)
5. Reproducibility and stability

1. and 2. Interpretation: The plots verify that, for both runs, the repeatability of probe #2362 is not dependent on wafers or days, although the standard deviations on days D, E, and F of run 2 are larger in some instances than for the other days.
3. and 4. Interpretation: Probe #2362 appears as #5 in the plots which show that, for both runs, the precision of this probe is better than for the other probes.
5. Interpretation: There is a separate plot for each wafer. The points on the left side of each plot are averages at the wafer center plotted over 5 days; the points on the right are the same measurements repeated after one month to check on the stability of the measurement process. The plots show day-to-day variability as well as slight variability from run-to-run.

Table of estimates for probe #2362
1. Level-1 (repeatability)
2. Level-2 (reproducibility)
3. Level-3 (stability)

1., 2. and 3. Interpretation: The repeatability of the gauge (level-1 standard deviation) dominates the imprecision associated with measurements, and days and runs are less important contributors. Of course, even if the gauge has high precision, biases may contribute substantially to the uncertainty of measurement.

Bias estimates
1. Differences among probes - run 1
2. Differences among probes - run 2

1. and 2. Interpretation: The graphs show the relative biases among the 5 probes. For each wafer, differences from the wafer average by probe are plotted versus wafer number. The graphs verify that probe #2362 (coded as 5) is biased low relative to the other probes. The bias shows up more strongly after the probes have been in use (run 2).


2.6.1.7. Dataplot macros

Plot of
repeatability reset data
standard reset plot control
2. Measurement Process Characterization deviations for reset i/o
2.6. Case studies 5 probes - run dimension 500 30
2.6.1. Gauge study of resistivity probes 1 label size 3
read mpc61.dat run wafer probe mo day op hum y sw
y1label ohm.cm
2.6.1.7. Dataplot macros title GAUGE STUDY
lines blank all
Plot of wafer let z = pattern 1 2 3 4 5 6 for I = 1 1 300
and day effect reset data let z2 = wafer + z/10 -0.25
on reset plot control characters 1 2 3 4 5
repeatability reset i/o X1LABEL WAFERS
standard dimension 500 30 X2LABEL REPEATABILITY STANDARD DEVIATIONS BY WAFER AND PROBE
deviations for label size 3 X3LABEL CODE FOR PROBES: 1= SRM1; 2= 281; 3=283; 4=2062; 5=2362
run 1 read mpc61.dat run wafer probe mo day op hum y sw TITLE RUN 1
y1label ohm.cm plot sw z2 probe subset run 1
title GAUGE STUDY
lines blank all Plot of
let z = pattern 1 2 3 4 5 6 for I = 1 1 300 repeatability reset data
let z2 = wafer + z/10 -0.25 standard reset plot control
characters a b c d e f deviations for reset i/o
X1LABEL WAFERS 5 probes - run dimension 500 30
X2LABEL REPEATABILITY STANDARD DEVIATIONS BY WAFER AND DAY 2 label size 3
X3LABEL CODE FOR DAYS: A, B, C, D, E, F read mpc61.dat run wafer probe mo day op hum y sw
TITLE RUN 1 y1label ohm.cm
plot sw z2 day subset run 1 title GAUGE STUDY
lines blank all
Plot of wafer let z = pattern 1 2 3 4 5 6 for I = 1 1 300
and day effect reset data let z2 = wafer + z/10 -0.25
on reset plot control characters 1 2 3 4 5
repeatability reset i/o X1LABEL WAFERS
standard dimension 500 30 X2LABEL REPEATABILITY STANDARD DEVIATIONS BY WAFER AND PROBE
deviations for label size 3 X3LABEL CODE FOR PROBES: 1= SRM1; 2= 281; 3=283; 4=2062; 5=2362
run 2 read mpc61.dat run wafer probe mo day op hum y sw TITLE RUN 2
y1label ohm.cm plot sw z2 probe subset run 2
title GAUGE STUDY
lines blank all Plot of
let z = pattern 1 2 3 4 5 6 for I = 1 1 300 differences reset data
let z2 = wafer + z/10 -0.25 from the wafer reset plot control
characters a b c d e f mean for 5 reset i/o
X1LABEL WAFERS probes - run 1 dimension 500 30
X2LABEL REPEATABILITY STANDARD DEVIATIONS BY WAFER AND DAY read mpc61a.dat wafer probe d1 d2
X3LABEL CODE FOR DAYS: A, B, C, D, E, F let biasrun1 = mean d1 subset probe 2362
TITLE RUN 2 print biasrun1
plot sw z2 day subset run 2 title GAUGE STUDY FOR 5 PROBES
Y1LABEL OHM.CM
lines dotted dotted dotted dotted dotted solid
characters 1 2 3 4 5 blank


xlimits 137 143


let zero = pattern 0 for I = 1 1 30
x1label DIFFERENCES AMONG PROBES VS WAFER (RUN 1)
plot d1 wafer probe and
plot zero wafer

Plot of differences from the wafer mean for 5 probes - run 2

reset data
reset plot control
reset i/o
dimension 500 30
read mpc61a.dat wafer probe d1 d2
let biasrun2 = mean d2 subset probe 2362
print biasrun2
title GAUGE STUDY FOR 5 PROBES
Y1LABEL OHM.CM
lines dotted dotted dotted dotted dotted solid
characters 1 2 3 4 5 blank
xlimits 137 143
let zero = pattern 0 for I = 1 1 30
x1label DIFFERENCES AMONG PROBES VS WAFER (RUN 2)
plot d2 wafer probe and
plot zero wafer

Plot of averages by day showing reproducibility and stability for measurements made with probe #2362 on 5 wafers

reset data
reset plot control
reset i/o
dimension 300 50
label size 3
read mpc61b.dat wafer probe mo1 day1 y1 mo2 day2 y2 diff
let t = mo1+(day1-1)/31.
let t2= mo2+(day2-1)/31.
x3label WAFER 138
multiplot 3 2
plot y1 t subset wafer 138 and
plot y2 t2 subset wafer 138
x3label WAFER 139
plot y1 t subset wafer 139 and
plot y2 t2 subset wafer 139
x3label WAFER 140
plot y1 t subset wafer 140 and
plot y2 t2 subset wafer 140
x3label WAFER 141
plot y1 t subset wafer 141 and
plot y2 t2 subset wafer 141
x3label WAFER 142
plot y1 t subset wafer 142 and
plot y2 t2 subset wafer 142



2. Measurement Process Characterization
2.6. Case studies

2.6.2. Check standard for resistivity measurements

Purpose
The purpose of this page is to outline the analysis of check standard data with respect to controlling the precision and long-term variability of the process.

Outline
1. Background and data
2. Analysis and interpretation
3. Run this example yourself using Dataplot

2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.1. Background and data

Explanation of check standard measurements
The process involves the measurement of resistivity (ohm.cm) of individual silicon wafers cut from a single crystal (# 51939). The wafers were doped with phosphorous to give a nominal resistivity of 100 ohm.cm. A single wafer (#137), chosen at random from a batch of 130 wafers, was designated as the check standard for this process.

Design of data collection and database
The measurements were carried out according to an ASTM Test Method (F84) with NIST probe #2362. The measurements on the check standard duplicate certification measurements that were being made, during the same time period, on individual wafers from crystal #51939. For the check standard there were:
● J = 6 repetitions at the center of the wafer on each day
● K = 25 days
The K = 25 days cover the time during which the individual wafers were being certified at the National Institute of Standards and Technology.


2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements
2.6.2.1. Background and data

2.6.2.1.1. Database for resistivity check standard

Description of check standard
A single wafer (#137), chosen at random from a batch of 130 wafers, is the check standard for resistivity measurements at the 100 ohm.cm level at the National Institute of Standards and Technology. The average of six measurements at the center of the wafer is the check standard value for one occasion, and the standard deviation of the six measurements is the short-term standard deviation. The columns of the database contain the following:
1. Crystal ID
2. Check standard ID
3. Month
4. Day
5. Hour
6. Minute
7. Operator
8. Humidity
9. Probe ID
10. Temperature
11. Check standard value
12. Short-term standard deviation
13. Degrees of freedom

Database of measurements on check standard

Crystal Waf Mo Da Hr Mn Op Hum Probe Temp Avg Stddev DF
51939 137 03 24 18 01 drr 42 2362 23.003 97.070 0.085 5
51939 137 03 25 12 41 drr 35 2362 23.115 97.049 0.052 5
51939 137 03 25 15 57 drr 33 2362 23.196 97.048 0.038 5
51939 137 03 28 10 10 JMT 47 2362 23.383 97.084 0.036 5
51939 137 03 28 13 31 JMT 44 2362 23.491 97.106 0.049 5
51939 137 03 28 17 33 drr 43 2362 23.352 97.014 0.036 5
51939 137 03 29 14 40 drr 36 2362 23.202 97.047 0.052 5
51939 137 03 29 16 33 drr 35 2362 23.222 97.078 0.117 5
51939 137 03 30 05 45 JMT 32 2362 23.337 97.065 0.085 5
51939 137 03 30 09 26 JMT 33 2362 23.321 97.061 0.052 5
51939 137 03 25 14 59 drr 34 2362 22.993 97.060 0.060 5
51939 137 03 31 10 10 JMT 37 2362 23.164 97.102 0.048 5
51939 137 03 31 13 00 JMT 37 2362 23.169 97.096 0.026 5
51939 137 03 31 15 32 JMT 35 2362 23.156 97.035 0.088 5
51939 137 04 01 13 05 JMT 34 2362 23.097 97.114 0.031 5
51939 137 04 01 15 32 JMT 34 2362 23.127 97.069 0.037 5
51939 137 04 01 10 32 JMT 48 2362 22.963 97.095 0.032 5
51939 137 04 06 14 38 JMT 49 2362 23.454 97.088 0.056 5
51939 137 04 07 10 50 JMT 34 2362 23.285 97.079 0.067 5
51939 137 04 07 15 46 JMT 33 2362 23.123 97.016 0.116 5
51939 137 04 08 09 37 JMT 33 2362 23.373 97.051 0.046 5
51939 137 04 08 12 53 JMT 33 2362 23.296 97.070 0.078 5
51939 137 04 08 15 03 JMT 33 2362 23.218 97.065 0.040 5
51939 137 04 11 09 30 JMT 36 2362 23.415 97.111 0.038 5
51939 137 04 11 11 34 JMT 35 2362 23.395 97.073 0.039 5

2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.2. Analysis and interpretation

Estimates of the repeatability standard deviation and level-2 standard deviation
The level-1 standard deviations (with J - 1 = 5 degrees of freedom each) from the database are pooled over the K = 25 days to obtain a reliable estimate of repeatability. This pooled value is
s1 = 0.04054 ohm.cm
with K(J - 1) = 125 degrees of freedom. The level-2 standard deviation is computed from the daily averages to be
s2 = 0.02680 ohm.cm
with K - 1 = 24 degrees of freedom.

Relationship to uncertainty calculations
These standard deviations are appropriate for estimating the uncertainty of the average of six measurements on a wafer that is of the same material and construction as the check standard. The computations are explained in the section on sensitivity coefficients for check standard measurements. For other numbers of measurements on the test wafer, the computations are explained in the section on sensitivity coefficients for level-2 designs.

Illustrative table showing computations of repeatability and level-2 standard deviations
A tabular presentation of a subset of check standard data (J = 6 repetitions and K = 6 days) illustrates the computations. The pooled repeatability standard deviation with K(J - 1) = 30 degrees of freedom from this limited database is shown in the next to last row of the table. A level-2 standard deviation with K - 1 = 5 degrees of freedom is computed from the center averages and is shown in the last row of the table.
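A minimal sketch of these computations is given below in Python with NumPy and SciPy, as an illustration only (the handbook's own implementation is the Dataplot macro of Section 2.6.2.6). It pools the K daily short-term standard deviations for the level-1 value, takes the standard deviation of the K daily check standard values for the level-2 value, and forms the precision control limit from the F critical value.

import numpy as np
from scipy.stats import f

def check_standard_summary(y, sw, J=6, alpha=0.05):
    # y  : daily check standard values (column 11 of the database), length K
    # sw : daily short-term standard deviations (column 12), each with J - 1 degrees of freedom
    y = np.asarray(y, dtype=float)
    sw = np.asarray(sw, dtype=float)
    K = len(y)
    s1 = np.sqrt(np.mean(sw**2))     # pooled repeatability, K(J - 1) degrees of freedom
    s2 = np.std(y, ddof=1)           # level-2 standard deviation from daily values, K - 1 df
    # Upper control limit for the daily standard deviations, as in the Dataplot macro:
    # UCL = s1 * sqrt(F critical value with J - 1 and K(J - 1) degrees of freedom)
    ucl = s1 * np.sqrt(f.ppf(1.0 - alpha, J - 1, K * (J - 1)))
    return s1, s2, ucl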


Control chart for probe #2362
The control chart for monitoring the precision of probe #2362 is constructed as discussed in the section on control charts for standard deviations. The upper control limit (UCL) for testing for degradation of the probe is computed using the critical value from the F table with numerator degrees of freedom J - 1 = 5 and denominator degrees of freedom K(J - 1) = 125. For a 0.05 significance level,
F0.05(5,125) = 2.29
UCL = sqrt(F0.05(5,125)) * s1 = 0.09238 ohm.cm

Interpretation of control chart for probe #2362
The control chart shows two points exceeding the upper control limit. We expect 5% of the standard deviations to exceed the UCL for a measurement process that is in-control. Two outliers are not indicative of significant problems with the repeatability for the probe, but the probe should be monitored closely in the future.

Control chart for bias and variability
The control limits for monitoring the bias and long-term variability of resistivity with a Shewhart control chart are given by
UCL = Average + 2*s2 = 97.1234 ohm.cm
Centerline = Average = 97.0698 ohm.cm
LCL = Average - 2*s2 = 97.0162 ohm.cm

Interpretation of control chart for bias
The control chart shows that the points scatter randomly about the center line with no serious problems, although one point exceeds the upper control limit and one point exceeds the lower control limit by a small amount. The conclusion is that there is:
● No evidence of bias, change or drift in the measurement process.
● No evidence of long-term lack of control.
Future measurements that exceed the control limits must be evaluated for long-term changes in bias and/or variability.

2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements
2.6.2.2. Analysis and interpretation

2.6.2.2.1. Repeatability and level-2 standard deviations

Example
The table below illustrates the computation of repeatability and level-2 standard deviations from measurements on a check standard. The check standard measurements are resistivities at the center of a 100 ohm.cm wafer. There are J = 6 repetitions per day and K = 6 days for this example.

Table of data, averages, and repeatability standard deviations

Measurements on check standard #137

 Repetition    Day 1    Day 2    Day 3    Day 4    Day 5    Day 6
     1        96.920   97.054   97.057   97.035   97.189   96.965
     2        97.118   96.947   97.110   97.047   96.945   97.013
     3        97.034   97.084   97.023   97.045   97.061   97.074
     4        97.047   97.099   97.087   97.076   97.117   97.070
     5        97.127   97.067   97.106   96.995   97.052   97.121
     6        96.995   96.984   97.053   97.065   96.976   96.997

 Averages     97.040   97.039   97.073   97.044   97.057   97.037
 Repeatability
 Standard
 Deviations   0.0777   0.0602   0.0341   0.0281   0.0896   0.0614

 Pooled Repeatability Standard Deviation   0.0625   (30 df)
 Level-2 Standard Deviation                0.0139   (5 df)
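The last two rows of the table can be reproduced directly from the displayed measurements. The sketch below is an illustration only (Python with NumPy, not part of the handbook); small differences in the level-2 value can arise because the displayed data are rounded.

import numpy as np

# Rows are the J = 6 repetitions; columns are the K = 6 days of measurements on check standard #137.
x = np.array([
    [96.920, 97.054, 97.057, 97.035, 97.189, 96.965],
    [97.118, 96.947, 97.110, 97.047, 96.945, 97.013],
    [97.034, 97.084, 97.023, 97.045, 97.061, 97.074],
    [97.047, 97.099, 97.087, 97.076, 97.117, 97.070],
    [97.127, 97.067, 97.106, 96.995, 97.052, 97.121],
    [96.995, 96.984, 97.053, 97.065, 96.976, 96.997],
])

day_sd = x.std(axis=0, ddof=1)            # repeatability standard deviation per day, J - 1 = 5 df each
s_pooled = np.sqrt(np.mean(day_sd**2))    # pooled repeatability, K(J - 1) = 30 df
s_level2 = x.mean(axis=0).std(ddof=1)     # level-2 standard deviation from daily averages, K - 1 = 5 df

print(np.round(day_sd, 4))                # approximately [0.0777 0.0602 0.0341 0.0281 0.0896 0.0614]
print(round(s_pooled, 4))                 # approximately 0.0625
print(round(s_level2, 4))                 # close to the tabulated 0.0139, up to rounding of the displayed data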


2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.3. Control chart for probe precision

[Figure: Control chart for probe #2362 showing violations of the control limits -- all standard deviations are based on 6 repetitions and the control limits are 95% limits]
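The chart itself is not reproduced here. For readers working outside Dataplot, a minimal matplotlib sketch of the same display is given below as an illustration only; the arrays time and sw and the values s1 and ucl are assumed to come from a computation of the kind described in Section 2.6.2.2.

import matplotlib.pyplot as plt

def precision_control_chart(time, sw, s1, ucl):
    # time : measurement occasions (e.g., day number); sw : daily standard deviations
    plt.plot(time, sw, "o", label="daily standard deviations")
    plt.axhline(s1, linestyle="-", label="pooled s1 (center line)")
    plt.axhline(ucl, linestyle="--", label="95% upper control limit")
    plt.xlabel("Time in days")
    plt.ylabel("ohm.cm")
    plt.title("Control chart for precision of probe #2362")
    plt.legend()
    plt.show()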


2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.4. Control chart for bias and long-term variability

[Figure: Shewhart control chart for measurements on a resistivity check standard showing that the process is in-control -- all measurements are averages of 6 repetitions]

2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.5. Run check standard example yourself

View of Dataplot macros for this case study
This page allows you to repeat the analysis outlined in the case study description on the previous page using Dataplot. It is required that you have already downloaded and installed Dataplot and configured your browser to run Dataplot. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output Window, the Graphics window, the Command History window, and the data sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps / Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step. The links in the Results and Conclusions column will connect you with more detailed information about each analysis step from the case study description.

Graphical tests of assumptions:
Histogram
Normal probability plot
Results: The histogram and normal probability plots show no evidence of non-normality.

Control chart for precision:
Control chart for probe #2362
Computations:
1. Pooled repeatability standard deviation
2. Control limit
Results: The precision control chart shows two points exceeding the upper control limit. We expect 5% of the standard deviations to exceed the UCL even when the measurement process is in-control.
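The Shewhart chart of Section 2.6.2.4 uses the 2-sigma limits given in Section 2.6.2.2. A minimal Python sketch of those limits is shown below as an illustration only; the handbook's own version is the Dataplot macro in Section 2.6.2.6.

import numpy as np

def shewhart_limits(y):
    # y : check standard values, each the average of 6 repetitions on one day
    avg = np.mean(y)                      # center line
    s_process = np.std(y, ddof=1)         # process standard deviation (level-2)
    return avg - 2.0 * s_process, avg, avg + 2.0 * s_process   # LCL, center line, UCL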


Control chart for check standard:
Control chart for check standard #137
Computations:
1. Average check standard value
2. Process standard deviation
3. Upper and lower control limits
Results: The Shewhart control chart shows that the points scatter randomly about the center line with no serious problems, although one point exceeds the upper control limit and one point exceeds the lower control limit by a small amount. The conclusion is that there is no evidence of bias or lack of long-term control.

2. Measurement Process Characterization
2.6. Case studies
2.6.2. Check standard for resistivity measurements

2.6.2.6. Dataplot macros

Histogram for check standard #137 to test assumption of normality

reset data
reset plot control
reset i/o
dimension 500 30
skip 14
read mpc62.dat crystal wafer mo day hour min op hum probe temp y sw df
histogram y

Normal probability plot for check standard #137 to test assumption of normality

reset data
reset plot control
reset i/o
dimension 500 30
skip 14
read mpc62.dat crystal wafer mo day hour min op hum probe temp y sw df
normal probability plot y

Control chart for precision of probe #2362 and computation of control parameter estimates

reset data
reset plot control
reset i/o
dimension 500 30
skip 14
read mpc62.dat crystal wafer mo day hour min op hum probe temp y sw df
let time = mo +(day-1)/31.
let s = sw*sw
let spool = mean s
let spool = spool**.5
print spool
let f = fppf(.95, 5, 125)
let ucl = spool*(f)**.5
print ucl
title Control chart for precision
characters blank blank O
lines solid dashed blank
y1label ohm.cm
x1label Time in days
x2label Standard deviations with probe #2362
x3label 5% upper control limit
let center = sw - sw + spool
let cl = sw - sw + ucl
plot center cl sw vs time


Shewhart control chart for check standard #137 with computation of control chart parameters

reset data
reset plot control
reset i/o
dimension 500 30
skip 14
read mpc62.dat crystal wafer mo day hour min op hum probe temp y sw df
let time = mo +(day-1)/31.
let avg = mean y
let sprocess = standard deviation y
let ucl = avg + 2*sprocess
let lcl = avg - 2*sprocess
print avg
print sprocess
print ucl lcl
title Shewhart control chart
characters O blank blank blank
lines blank dashed solid dashed
y1label ohm.cm
x1label Time in days
x2label Check standard 137 with probe 2362
x3label 2-sigma control limits
let ybar = y - y + avg
let lc1 = y - y + lcl
let lc2 = y - y + ucl
plot y lc1 ybar lc2 vs time

2. Measurement Process Characterization
2.6. Case studies

2.6.3. Evaluation of type A uncertainty

Purpose
The purpose of this case study is to demonstrate the computation of uncertainty for a measurement process with several sources of uncertainty from data taken during a gauge study.

Outline
1. Background and data for the study
2. Graphical and quantitative analyses and interpretations
3. Run this example yourself with Dataplot


2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty

2.6.3.1. Background and data

Description of measurements
The measurements in question are resistivities (ohm.cm) of silicon wafers. The intent is to calculate an uncertainty associated with the resistivity measurements of approximately 100 silicon wafers that were certified with probe #2362 in wiring configuration A, according to ASTM Method F84 (ASTM F84) which is the defined reference for this measurement. The reported value for each wafer is the average of six measurements made at the center of the wafer on a single day. Probe #2362 is one of five probes owned by the National Institute of Standards and Technology that is capable of making the measurements.

Sources of uncertainty in NIST measurements
The uncertainty analysis takes into account the following sources of variability:
● Repeatability of measurements at the center of the wafer
● Day-to-day effects
● Run-to-run effects
● Bias due to probe #2362
● Bias due to wiring configuration

Database of 3-level nested design -- for estimating time-dependent sources of uncertainty
The certification measurements themselves are not the primary source for estimating uncertainty components because they do not yield information on day-to-day effects and long-term effects. The standard deviations for the three time-dependent sources of uncertainty are estimated from a 3-level nested design. The design was replicated on each of Q = 5 wafers which were chosen at random, for this purpose, from the lot of wafers. The certification measurements were made between the two runs in order to check on the long-term stability of the process. The data consist of repeatability standard deviations (with J - 1 = 5 degrees of freedom each) from measurements at the wafer center.


2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty
2.6.3.1. Background and data

2.6.3.1.1. Database of resistivity measurements

Check standards are five wafers chosen at random from a batch of wafers
Measurements of resistivity (ohm.cm) were made according to an ASTM Standard Test Method (F84) at the National Institute of Standards and Technology to assess the sources of uncertainty in the measurement system. The gauges for the study were five probes owned by NIST; the check standards for the study were five wafers selected at random from a batch of wafers cut from one silicon crystal doped with phosphorous to give a nominal resistivity of 100 ohm.cm.

Measurements on the check standards are used to estimate repeatability, day effect, run effect
The effect of operator was not considered to be significant for this study. Averages and standard deviations from J = 6 measurements at the center of each wafer are shown in the table.
● J = 6 measurements at the center of the wafer per day
● K = 6 days (one operator) per repetition
● L = 2 runs (complete)
● Q = 5 wafers (check standards 138, 139, 140, 141, 142)
● I = 5 probes (1, 281, 283, 2062, 2362)

Run Wafer Probe Month Day Operator Temp Average Standard Deviation
1 138. 283. 3. 16. 1. 22.95 95.1252 0.0531
1 138. 283. 3. 17. 1. 23.08 95.1600 0.0998
1 138. 283. 3. 18. 1. 23.13 95.0818 0.1108
1 138. 283. 3. 21. 1. 23.28 95.1620 0.0408
1 138. 283. 3. 22. 1. 23.36 95.1735 0.0501
1 138. 283. 3. 24. 2. 22.97 95.1932 0.0287
1 138. 2062. 3. 16. 1. 22.97 95.1311 0.1066
1 138. 2062. 3. 17. 1. 22.98 95.1132 0.0415
1 138. 2062. 3. 18. 1. 23.16 95.0432 0.0491
1 138. 2062. 3. 21. 1. 23.16 95.1254 0.0603
1 138. 2062. 3. 22. 1. 23.28 95.1322 0.0561
1 138. 2062. 3. 24. 2. 23.19 95.1299 0.0349
1 138. 2362. 3. 15. 1. 23.08 95.1162 0.0480
1 138. 2362. 3. 17. 1. 23.01 95.0569 0.0577
1 138. 2362. 3. 18. 1. 22.97 95.0598 0.0516
1 138. 2362. 3. 22. 1. 23.23 95.1487 0.0386
1 138. 2362. 3. 23. 2. 23.28 95.0743 0.0256
1 138. 2362. 3. 24. 2. 23.10 95.1010 0.0420
1 139. 1. 3. 15. 1. 23.01 99.3528 0.1424
1 139. 1. 3. 17. 1. 23.00 99.2940 0.0660
1 139. 1. 3. 17. 1. 23.01 99.2340 0.1179
1 139. 1. 3. 21. 1. 23.20 99.3489 0.0506
1 139. 1. 3. 23. 2. 23.22 99.2625 0.1111
1 139. 1. 3. 23. 1. 23.22 99.3787 0.1103
1 139. 281. 3. 16. 1. 22.95 99.3244 0.1134
1 139. 281. 3. 17. 1. 22.98 99.3378 0.0949
1 139. 281. 3. 18. 1. 22.86 99.3424 0.0847
1 139. 281. 3. 22. 1. 23.17 99.4033 0.0801
1 139. 281. 3. 23. 2. 23.10 99.3717 0.0630
1 139. 281. 3. 23. 1. 23.14 99.3493 0.1157
1 139. 283. 3. 16. 1. 22.94 99.3065 0.0381
1 139. 283. 3. 17. 1. 23.09 99.3280 0.1153
1 139. 283. 3. 18. 1. 23.11 99.3000 0.0818
1 139. 283. 3. 21. 1. 23.25 99.3347 0.0972
1 139. 283. 3. 22. 1. 23.36 99.3929 0.1189
1 139. 283. 3. 23. 1. 23.18 99.2644 0.0622
1 139. 283. 3. 23. 1. 23.18 99.2644 0.0622
1 138. 1. 3. 15. 1. 22.98 95.1772 0.1191 1 139. 2062. 3. 16. 1. 22.94 99.3324 0.1531
1 138. 1. 3. 17. 1. 23.02 95.1567 0.0183 1 139. 2062. 3. 17. 1. 23.08 99.3254 0.0543
1 138. 1. 3. 18. 1. 22.79 95.1937 0.1282 1 139. 2062. 3. 18. 1. 23.15 99.2555 0.1024
1 138. 1. 3. 21. 1. 23.17 95.1959 0.0398 1 139. 2062. 3. 18. 1. 23.18 99.1946 0.0851
1 138. 1. 3. 23. 2. 23.25 95.1442 0.0346 1 139. 2062. 3. 22. 1. 23.27 99.3542 0.1227
1 138. 1. 3. 23. 1. 23.20 95.0610 0.1539 1 139. 2062. 3. 24. 2. 23.23 99.2365 0.1218
1 138. 281. 3. 16. 1. 22.99 95.1591 0.0963 1 139. 2362. 3. 15. 1. 23.08 99.2939 0.0818
1 138. 281. 3. 17. 1. 22.97 95.1195 0.0606 1 139. 2362. 3. 17. 1. 23.02 99.3234 0.0723
1 138. 281. 3. 18. 1. 22.83 95.1065 0.0842 1 139. 2362. 3. 18. 1. 22.93 99.2748 0.0756
1 138. 281. 3. 21. 1. 23.28 95.0925 0.0973 1 139. 2362. 3. 22. 1. 23.29 99.3512 0.0475
1 138. 281. 3. 23. 2. 23.14 95.1990 0.1062 1 139. 2362. 3. 23. 2. 23.25 99.2350 0.0517
1 138. 281. 3. 23. 1. 23.16 95.1682 0.1090 1 139. 2362. 3. 24. 2. 23.05 99.3574 0.0485
1 140. 1. 3. 15. 1. 23.07 96.1334 0.1052


1 140. 1. 3. 17. 1. 23.08 96.1250 0.0916 1 141. 2062. 3. 17. 1. 22.96 101.0245 0.1210
1 140. 1. 3. 18. 1. 22.77 96.0665 0.0836 1 141. 2062. 3. 18. 1. 23.19 100.9650 0.0700
1 140. 1. 3. 21. 1. 23.18 96.0725 0.0620 1 141. 2062. 3. 18. 1. 23.18 101.0319 0.1070
1 140. 1. 3. 23. 2. 23.20 96.1006 0.0582 1 141. 2062. 3. 22. 1. 23.34 101.0849 0.0960
1 140. 1. 3. 23. 1. 23.21 96.1131 0.1757 1 141. 2062. 3. 24. 2. 23.21 101.1302 0.0505
1 140. 281. 3. 16. 1. 22.94 96.0467 0.0565 1 141. 2362. 3. 15. 1. 23.08 101.0471 0.0320
1 140. 281. 3. 17. 1. 22.99 96.1081 0.1293 1 141. 2362. 3. 17. 1. 23.01 101.0224 0.1020
1 140. 281. 3. 18. 1. 22.91 96.0578 0.1148 1 141. 2362. 3. 18. 1. 23.05 101.0702 0.0580
1 140. 281. 3. 22. 1. 23.15 96.0700 0.0495 1 141. 2362. 3. 22. 1. 23.22 101.0904 0.1049
1 140. 281. 3. 22. 1. 23.33 96.1052 0.1722 1 141. 2362. 3. 23. 2. 23.29 101.0626 0.0702
1 140. 281. 3. 23. 1. 23.19 96.0952 0.1786 1 141. 2362. 3. 24. 2. 23.15 101.0686 0.0661
1 140. 283. 3. 16. 1. 22.89 96.0650 0.1301 1 142. 1. 3. 15. 1. 23.02 94.3160 0.1372
1 140. 283. 3. 17. 1. 23.07 96.0870 0.0881 1 142. 1. 3. 17. 1. 23.04 94.2808 0.0999
1 140. 283. 3. 18. 1. 23.07 95.8906 0.1842 1 142. 1. 3. 18. 1. 22.73 94.2478 0.0803
1 140. 283. 3. 21. 1. 23.24 96.0842 0.1008 1 142. 1. 3. 21. 1. 23.19 94.2862 0.0700
1 140. 283. 3. 22. 1. 23.34 96.0189 0.0865 1 142. 1. 3. 23. 2. 23.25 94.1859 0.0899
1 140. 283. 3. 23. 1. 23.19 96.1047 0.0923 1 142. 1. 3. 23. 1. 23.21 94.2389 0.0686
1 140. 2062. 3. 16. 1. 22.95 96.0379 0.2190 1 142. 281. 3. 16. 1. 22.98 94.2640 0.0862
1 140. 2062. 3. 17. 1. 22.97 96.0671 0.0991 1 142. 281. 3. 17. 1. 23.00 94.3333 0.1330
1 140. 2062. 3. 18. 1. 23.15 96.0206 0.0648 1 142. 281. 3. 18. 1. 22.88 94.2994 0.0908
1 140. 2062. 3. 21. 1. 23.14 96.0207 0.1410 1 142. 281. 3. 21. 1. 23.28 94.2873 0.0846
1 140. 2062. 3. 22. 1. 23.32 96.0587 0.1634 1 142. 281. 3. 23. 2. 23.07 94.2576 0.0795
1 140. 2062. 3. 24. 2. 23.17 96.0903 0.0406 1 142. 281. 3. 23. 1. 23.12 94.3027 0.0389
1 140. 2362. 3. 15. 1. 23.08 96.0771 0.1024 1 142. 283. 3. 16. 1. 22.92 94.2846 0.1021
1 140. 2362. 3. 17. 1. 23.00 95.9976 0.0943 1 142. 283. 3. 17. 1. 23.08 94.2197 0.0627
1 140. 2362. 3. 18. 1. 23.01 96.0148 0.0622 1 142. 283. 3. 18. 1. 23.09 94.2119 0.0785
1 140. 2362. 3. 22. 1. 23.27 96.0397 0.0702 1 142. 283. 3. 21. 1. 23.29 94.2536 0.0712
1 140. 2362. 3. 23. 2. 23.24 96.0407 0.0627 1 142. 283. 3. 22. 1. 23.34 94.2280 0.0692
1 140. 2362. 3. 24. 2. 23.13 96.0445 0.0622 1 142. 283. 3. 24. 2. 22.92 94.2944 0.0958
1 141. 1. 3. 15. 1. 23.01 101.2124 0.0900 1 142. 2062. 3. 16. 1. 22.96 94.2238 0.0492
1 141. 1. 3. 17. 1. 23.08 101.1018 0.0820 1 142. 2062. 3. 17. 1. 22.95 94.3061 0.2194
1 141. 1. 3. 18. 1. 22.75 101.1119 0.0500 1 142. 2062. 3. 18. 1. 23.16 94.1868 0.0474
1 141. 1. 3. 21. 1. 23.21 101.1072 0.0641 1 142. 2062. 3. 21. 1. 23.11 94.2645 0.0697
1 141. 1. 3. 23. 2. 23.25 101.0802 0.0704 1 142. 2062. 3. 22. 1. 23.31 94.3101 0.0532
1 141. 1. 3. 23. 1. 23.19 101.1350 0.0699 1 142. 2062. 3. 24. 2. 23.24 94.2204 0.1023
1 141. 281. 3. 16. 1. 22.93 101.0287 0.0520 1 142. 2362. 3. 15. 1. 23.08 94.2437 0.0503
1 141. 281. 3. 17. 1. 23.00 101.0131 0.0710 1 142. 2362. 3. 17. 1. 23.00 94.2115 0.0919
1 141. 281. 3. 18. 1. 22.90 101.1329 0.0800 1 142. 2362. 3. 18. 1. 22.99 94.2348 0.0282
1 141. 281. 3. 22. 1. 23.19 101.0562 0.1594 1 142. 2362. 3. 22. 1. 23.26 94.2124 0.0513
1 141. 281. 3. 23. 2. 23.18 101.0891 0.1252 1 142. 2362. 3. 23. 2. 23.27 94.2214 0.0627
1 141. 281. 3. 23. 1. 23.17 101.1283 0.1151 1 142. 2362. 3. 24. 2. 23.08 94.1651 0.1010
1 141. 283. 3. 16. 1. 22.85 101.1597 0.0990 2 138. 1. 4. 13. 1. 23.12 95.1996 0.0645
1 141. 283. 3. 17. 1. 23.09 101.0784 0.0810 2 138. 1. 4. 15. 1. 22.73 95.1315 0.1192
1 141. 283. 3. 18. 1. 23.08 101.0715 0.0460 2 138. 1. 4. 18. 2. 22.76 95.1845 0.0452
1 141. 283. 3. 21. 1. 23.27 101.0910 0.0880 2 138. 1. 4. 19. 1. 22.73 95.1359 0.1498
1 141. 283. 3. 22. 1. 23.34 101.0967 0.0901 2 138. 1. 4. 20. 2. 22.73 95.1435 0.0629
1 141. 283. 3. 24. 2. 23.00 101.1627 0.0888 2 138. 1. 4. 21. 2. 22.93 95.1839 0.0563
1 141. 2062. 3. 16. 1. 22.97 101.1077 0.0970 2 138. 281. 4. 14. 2. 22.46 95.2106 0.1049
2 138. 281. 4. 18. 2. 22.80 95.2505 0.0771


2 138. 281. 4. 18. 2. 22.77 95.2648 0.1046 2 139. 2362. 4. 19. 2. 22.82 99.3241 0.0519
2 138. 281. 4. 20. 2. 22.80 95.2197 0.1779 2 139. 2362. 4. 19. 1. 22.74 99.2991 0.0903
2 138. 281. 4. 20. 2. 22.87 95.2003 0.1376 2 139. 2362. 4. 20. 2. 22.88 99.3049 0.0783
2 138. 281. 4. 21. 2. 22.95 95.0982 0.1611 2 139. 2362. 4. 21. 2. 22.94 99.2782 0.0718
2 138. 283. 4. 18. 2. 22.83 95.1211 0.0794 2 140. 1. 4. 13. 1. 23.10 96.0811 0.0463
2 138. 283. 4. 13. 1. 23.17 95.1327 0.0409 2 140. 1. 4. 15. 2. 22.75 96.1460 0.0725
2 138. 283. 4. 18. 1. 22.67 95.2053 0.1525 2 140. 1. 4. 18. 2. 22.78 96.1582 0.1428
2 138. 283. 4. 19. 2. 23.00 95.1292 0.0655 2 140. 1. 4. 19. 1. 22.70 96.1039 0.1056
2 138. 283. 4. 21. 2. 22.91 95.1669 0.0619 2 140. 1. 4. 20. 2. 22.75 96.1262 0.0672
2 138. 283. 4. 21. 2. 22.96 95.1401 0.0831 2 140. 1. 4. 21. 2. 22.93 96.1478 0.0562
2 138. 2062. 4. 15. 1. 22.64 95.2479 0.2867 2 140. 281. 4. 15. 2. 22.71 96.1153 0.1097
2 138. 2062. 4. 15. 1. 22.67 95.2224 0.1945 2 140. 281. 4. 14. 2. 22.49 96.1297 0.1202
2 138. 2062. 4. 19. 2. 22.99 95.2810 0.1960 2 140. 281. 4. 18. 2. 22.81 96.1233 0.1331
2 138. 2062. 4. 19. 1. 22.75 95.1869 0.1571 2 140. 281. 4. 20. 2. 22.78 96.1731 0.1484
2 138. 2062. 4. 21. 2. 22.84 95.3053 0.2012 2 140. 281. 4. 20. 2. 22.89 96.0872 0.0857
2 138. 2062. 4. 21. 2. 22.92 95.1432 0.1532 2 140. 281. 4. 21. 2. 22.91 96.1331 0.0944
2 138. 2362. 4. 12. 1. 22.74 95.1687 0.0785 2 140. 283. 4. 13. 2. 23.22 96.1135 0.0983
2 138. 2362. 4. 18. 2. 22.75 95.1564 0.0430 2 140. 283. 4. 18. 2. 22.85 96.1111 0.1210
2 138. 2362. 4. 19. 2. 22.88 95.1354 0.0983 2 140. 283. 4. 18. 2. 22.78 96.1221 0.0644
2 138. 2362. 4. 19. 1. 22.73 95.0422 0.0773 2 140. 283. 4. 19. 2. 23.01 96.1063 0.0921
2 138. 2362. 4. 20. 2. 22.86 95.1354 0.0587 2 140. 283. 4. 21. 2. 22.91 96.1155 0.0704
2 138. 2362. 4. 21. 2. 22.94 95.1075 0.0776 2 140. 283. 4. 21. 2. 22.94 96.1308 0.0258
2 139. 1. 4. 13. 2. 23.14 99.3274 0.0220 2 140. 2062. 4. 15. 2. 22.60 95.9767 0.2225
2 139. 1. 4. 15. 2. 22.77 99.5020 0.0997 2 140. 2062. 4. 15. 2. 22.66 96.1277 0.1792
2 139. 1. 4. 18. 2. 22.80 99.4016 0.0704 2 140. 2062. 4. 19. 2. 22.96 96.1858 0.1312
2 139. 1. 4. 19. 1. 22.68 99.3181 0.1245 2 140. 2062. 4. 19. 1. 22.75 96.1912 0.1936
2 139. 1. 4. 20. 2. 22.78 99.3858 0.0903 2 140. 2062. 4. 21. 2. 22.82 96.1650 0.1902
2 139. 1. 4. 21. 2. 22.93 99.3141 0.0255 2 140. 2062. 4. 21. 2. 22.92 96.1603 0.1777
2 139. 281. 4. 14. 2. 23.05 99.2915 0.0859 2 140. 2362. 4. 12. 1. 22.88 96.0793 0.0996
2 139. 281. 4. 15. 2. 22.71 99.4032 0.1322 2 140. 2362. 4. 18. 2. 22.76 96.1115 0.0533
2 139. 281. 4. 18. 2. 22.79 99.4612 0.1765 2 140. 2362. 4. 19. 2. 22.79 96.0803 0.0364
2 139. 281. 4. 20. 2. 22.74 99.4001 0.0889 2 140. 2362. 4. 19. 1. 22.71 96.0411 0.0768
2 139. 281. 4. 20. 2. 22.91 99.3765 0.1041 2 140. 2362. 4. 20. 2. 22.84 96.0988 0.1042
2 139. 281. 4. 21. 2. 22.92 99.3507 0.0717 2 140. 2362. 4. 21. 1. 22.94 96.0482 0.0868
2 139. 283. 4. 13. 2. 23.11 99.3848 0.0792 2 141. 1. 4. 13. 1. 23.07 101.1984 0.0803
2 139. 283. 4. 18. 2. 22.84 99.4952 0.1122 2 141. 1. 4. 15. 2. 22.72 101.1645 0.0914
2 139. 283. 4. 18. 2. 22.76 99.3220 0.0915 2 141. 1. 4. 18. 2. 22.75 101.2454 0.1109
2 139. 283. 4. 19. 2. 23.03 99.4165 0.0503 2 141. 1. 4. 19. 1. 22.69 101.1096 0.1376
2 139. 283. 4. 21. 2. 22.87 99.3791 0.1138 2 141. 1. 4. 20. 2. 22.83 101.2066 0.0717
2 139. 283. 4. 21. 2. 22.98 99.3985 0.0661 2 141. 1. 4. 21. 2. 22.93 101.0645 0.1205
2 139. 2062. 4. 14. 2. 22.43 99.4283 0.0891 2 141. 281. 4. 15. 2. 22.72 101.1615 0.1272
2 139. 2062. 4. 15. 2. 22.70 99.4139 0.2147 2 141. 281. 4. 14. 2. 22.40 101.1650 0.0595
2 139. 2062. 4. 19. 2. 22.97 99.3813 0.1143 2 141. 281. 4. 18. 2. 22.78 101.1815 0.1393
2 139. 2062. 4. 19. 1. 22.77 99.4314 0.1685 2 141. 281. 4. 20. 2. 22.73 101.1106 0.1189
2 139. 2062. 4. 21. 2. 22.79 99.4166 0.2080 2 141. 281. 4. 20. 2. 22.86 101.1420 0.0713
2 139. 2062. 4. 21. 2. 22.94 99.4052 0.2400 2 141. 281. 4. 21. 2. 22.94 101.0116 0.1088
2 139. 2362. 4. 12. 1. 22.82 99.3408 0.1279 2 141. 283. 4. 13. 2. 23.26 101.1554 0.0429
2 139. 2362. 4. 18. 2. 22.77 99.3116 0.1131 2 141. 283. 4. 18. 2. 22.85 101.1267 0.0751
2 141. 283. 4. 18. 2. 22.76 101.1227 0.0826


2 141. 283. 4. 19. 2. 22.82 101.0635 0.1715


2 141. 283. 4. 21. 2. 22.89 101.1264 0.1447
2 141. 283. 4. 21. 2. 22.96 101.0853 0.1189
2 141. 2062. 4. 15. 2. 22.65 101.1332 0.2532
2 141. 2062. 4. 15. 1. 22.68 101.1487 0.1413
2 141. 2062. 4. 19. 2. 22.95 101.1778 0.1772
2 141. 2062. 4. 19. 1. 22.77 101.0988 0.0884
2 141. 2062. 4. 21. 2. 22.87 101.1686 0.2940
2 141. 2062. 4. 21. 2. 22.94 101.3289 0.2072
2 141. 2362. 4. 12. 1. 22.83 101.1353 0.0585
2 141. 2362. 4. 18. 2. 22.83 101.1201 0.0868
2 141. 2362. 4. 19. 2. 22.91 101.0946 0.0855
2 141. 2362. 4. 19. 1. 22.71 100.9977 0.0645
2 141. 2362. 4. 20. 2. 22.87 101.0963 0.0638
2 141. 2362. 4. 21. 2. 22.94 101.0300 0.0549
2 142. 1. 4. 13. 1. 23.07 94.3049 0.1197
2 142. 1. 4. 15. 2. 22.73 94.3153 0.0566
2 142. 1. 4. 18. 2. 22.77 94.3073 0.0875
2 142. 1. 4. 19. 1. 22.67 94.2803 0.0376
2 142. 1. 4. 20. 2. 22.80 94.3008 0.0703
2 142. 1. 4. 21. 2. 22.93 94.2916 0.0604
2 142. 281. 4. 14. 2. 22.90 94.2557 0.0619
2 142. 281. 4. 18. 2. 22.83 94.3542 0.1027
2 142. 281. 4. 18. 2. 22.80 94.3007 0.1492
2 142. 281. 4. 20. 2. 22.76 94.3351 0.1059
2 142. 281. 4. 20. 2. 22.88 94.3406 0.1508
2 142. 281. 4. 21. 2. 22.92 94.2621 0.0946
2 142. 283. 4. 13. 2. 23.25 94.3124 0.0534
2 142. 283. 4. 18. 2. 22.85 94.3680 0.1643
2 142. 283. 4. 18. 1. 22.67 94.3442 0.0346
2 142. 283. 4. 19. 2. 22.80 94.3391 0.0616
2 142. 283. 4. 21. 2. 22.91 94.2238 0.0721
2 142. 283. 4. 21. 2. 22.95 94.2721 0.0998
2 142. 2062. 4. 14. 2. 22.49 94.2915 0.2189
2 142. 2062. 4. 15. 2. 22.69 94.2803 0.0690
2 142. 2062. 4. 19. 2. 22.94 94.2818 0.0987
2 142. 2062. 4. 19. 1. 22.76 94.2227 0.2628
2 142. 2062. 4. 21. 2. 22.74 94.4109 0.1230
2 142. 2062. 4. 21. 2. 22.94 94.2616 0.0929
2 142. 2362. 4. 12. 1. 22.86 94.2052 0.0813
2 142. 2362. 4. 18. 2. 22.83 94.2824 0.0605
2 142. 2362. 4. 19. 2. 22.85 94.2396 0.0882
2 142. 2362. 4. 19. 1. 22.75 94.2087 0.0702
2 142. 2362. 4. 20. 2. 22.86 94.2937 0.0591
2 142. 2362. 4. 21. 1. 22.93 94.2330 0.0556



2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty
2.6.3.1. Background and data

2.6.3.1.2. Measurements on wiring configurations

Check wafers were measured with the probe wired in two configurations

Measurements of resistivity (ohm.cm) were made according to an ASTM Standard
Test Method (F4) to identify differences between 2 wiring configurations for probe
#2362. The check standards for the study were five wafers selected at random from
a batch of wafers cut from one silicon crystal doped with phosphorous to give a
nominal resistivity of 100 ohm.cm.

Description of database

The data are averages of K = 6 days' measurements and J = 6 repetitions at the
center of each wafer. There are L = 2 complete runs, separated by two months time,
on each wafer.

The data recorded in the 10 columns are:
1. Wafer
2. Probe
3. Average - configuration A; run 1
4. Standard deviation - configuration A; run 1
5. Average - configuration B; run 1
6. Standard deviation - configuration B; run 1
7. Average - configuration A; run 2
8. Standard deviation - configuration A; run 2
9. Average - configuration B; run 2
10. Standard deviation - configuration B; run 2

Wafer Probe   Config A-run1    Config B-run1    Config A-run2    Config B-run2

138. 2362. 95.1162 0.0480 95.0993 0.0466 95.1687 0.0785 95.1589 0.0642
138. 2362. 95.0569 0.0577 95.0657 0.0450 95.1564 0.0430 95.1705 0.0730
138. 2362. 95.0598 0.0516 95.0622 0.0664 95.1354 0.0983 95.1221 0.0695
138. 2362. 95.1487 0.0386 95.1625 0.0311 95.0422 0.0773 95.0513 0.0840
138. 2362. 95.0743 0.0256 95.0599 0.0488 95.1354 0.0587 95.1531 0.0482
138. 2362. 95.1010 0.0420 95.0944 0.0393 95.1075 0.0776 95.1537 0.0230
139. 2362. 99.2939 0.0818 99.3018 0.0905 99.3408 0.1279 99.3637 0.1025
139. 2362. 99.3234 0.0723 99.3488 0.0350 99.3116 0.1131 99.3881 0.0451
139. 2362. 99.2748 0.0756 99.3571 0.1993 99.3241 0.0519 99.3737 0.0699
139. 2362. 99.3512 0.0475 99.3512 0.1286 99.2991 0.0903 99.3066 0.0709
139. 2362. 99.2350 0.0517 99.2255 0.0738 99.3049 0.0783 99.3040 0.0744
139. 2362. 99.3574 0.0485 99.3605 0.0459 99.2782 0.0718 99.3680 0.0470
140. 2362. 96.0771 0.1024 96.0915 0.1257 96.0793 0.0996 96.1041 0.0890
140. 2362. 95.9976 0.0943 96.0057 0.0806 96.1115 0.0533 96.0774 0.0983
140. 2362. 96.0148 0.0622 96.0244 0.0833 96.0803 0.0364 96.1004 0.0758
140. 2362. 96.0397 0.0702 96.0422 0.0738 96.0411 0.0768 96.0677 0.0663
140. 2362. 96.0407 0.0627 96.0738 0.0800 96.0988 0.1042 96.0585 0.0960
140. 2362. 96.0445 0.0622 96.0557 0.1129 96.0482 0.0868 96.0062 0.0895
141. 2362. 101.0471 0.0320 101.0241 0.0670 101.1353 0.0585 101.1156 0.1027
141. 2362. 101.0224 0.1020 101.0660 0.1030 101.1201 0.0868 101.1077 0.1141
141. 2362. 101.0702 0.0580 101.0509 0.0710 101.0946 0.0855 101.0455 0.1070
141. 2362. 101.0904 0.1049 101.0983 0.0894 100.9977 0.0645 101.0274 0.0666
141. 2362. 101.0626 0.0702 101.0614 0.0849 101.0963 0.0638 101.1106 0.0788
141. 2362. 101.0686 0.0661 101.0811 0.0490 101.0300 0.0549 101.1073 0.0663
142. 2362. 94.2437 0.0503 94.2088 0.0815 94.2052 0.0813 94.2487 0.0719
142. 2362. 94.2115 0.0919 94.2043 0.1176 94.2824 0.0605 94.2886 0.0499
142. 2362. 94.2348 0.0282 94.2324 0.0519 94.2396 0.0882 94.2739 0.1075
142. 2362. 94.2124 0.0513 94.2347 0.0694 94.2087 0.0702 94.2023 0.0416
142. 2362. 94.2214 0.0627 94.2416 0.0757 94.2937 0.0591 94.2600 0.0731
142. 2362. 94.1651 0.1010 94.2287 0.0919 94.2330 0.0556 94.2406 0.0651
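The ten-column table above can be read directly with general-purpose tools as well. The short Python sketch below is not part of the Handbook; it assumes the flat file mpc633k.dat (the file name used by the Dataplot macros later in this case study) holds the table with the column order just described, and it forms the configuration A minus configuration B differences for each run.

# Sketch: read the 10-column wiring-configuration table and form the
# run-by-run differences between configurations A and B.
import pandas as pd

cols = ["wafer", "probe", "a1", "s1", "b1", "s2", "a2", "s3", "b2", "s4"]
df = pd.read_csv("mpc633k.dat", sep=r"\s+", names=cols)

df["diff_run1"] = df["a1"] - df["b1"]   # config A - config B, run 1
df["diff_run2"] = df["a2"] - df["b2"]   # config A - config B, run 2
print(df.groupby("wafer")[["diff_run1", "diff_run2"]].mean())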



2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty

2.6.3.2. Analysis and interpretation

Purpose of this page

The purpose of this page is to outline an analysis of data taken during a
gauge study to quantify the type A uncertainty component for resistivity
(ohm.cm) measurements on silicon wafers made with a gauge that was part
of the initial study.

Summary of standard deviations at three levels

The level-1, level-2, and level-3 standard deviations for the uncertainty
analysis are summarized in the table below from the gauge case study.

Standard deviations for probe #2362

Level     Symbol   Estimate   DF
Level-1   s1       0.0710     300
Level-2   s2       0.0362     50
Level-3   s3       0.0197     5

Calculation of individual components for days and runs

The standard deviation that estimates the day effect is

    s(days) = sqrt( s2^2 - s1^2/J )

Calculation of the standard deviation of the certified value showing sensitivity coefficients

The certified value for each wafer is the average of N = 6 repeatability
measurements at the center of the wafer on M = 1 days and over P = 1 runs.
Notice that N, M and P are not necessarily the same as the number of
measurements in the gauge study per wafer; namely, J, K and L. The
standard deviation of a certified value (for time-dependent sources of
error) is

    s(cert) = sqrt( s(runs)^2/P + s(days)^2/(M P) + s1^2/(N M P) )

Standard deviations for days and runs are included in this calculation, even
though there were no replications over days or runs for the certification
measurements. These factors contribute to the overall uncertainty of the
measurement process even though they are not sampled for the particular
measurements of interest.

The equation must be rewritten to calculate degrees of freedom

Degrees of freedom cannot be calculated from the equation above because
the calculations for the individual components involve differences among
variances. The table of sensitivity coefficients for a 3-level design shows
that for

    N = J, M = 1, P = 1

the equation above can be rewritten in the form

    s(cert) = sqrt( s3^2 + ((K-1)/K) s2^2 )

Then the degrees of freedom can be approximated using the
Welch-Satterthwaite method.

Probe bias - Graphs of probe biases

A graphical analysis shows the relative biases among the 5 probes. For
each wafer, differences from the wafer average by probe are plotted versus
wafer number. The graphs verify that probe #2362 (coded as 5) is biased
low relative to the other probes. The bias shows up more strongly after the
probes have been in use (run 2).

The standard deviation that estimates the run effect is

    s(runs) = sqrt( s3^2 - s2^2/K )
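The decomposition above is easy to check numerically. The following Python sketch is not part of the Handbook; it simply applies the nested-design relations just given to the summary table for probe #2362, with J = K = 6 as stated in the database description.

# Sketch: day and run components and the time-dependent part of the certified
# value, computed from the level-1/2/3 summary for probe #2362.
import math

s1, s2, s3 = 0.0710, 0.0362, 0.0197
J, K = 6, 6

s_days = math.sqrt(s2**2 - s1**2 / J)                 # day effect
s_runs = math.sqrt(s3**2 - s2**2 / K)                 # run effect
s_cert = math.sqrt(s3**2 + (K - 1) / K * s2**2)       # for N = J, M = 1, P = 1

print(f"s_days = {s_days:.4f}  s_runs = {s_runs:.4f}  s_cert = {s_cert:.4f}")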


How to deal with bias due to the probe

Probe #2362 was chosen for the certification process because of its superior
precision, but its bias relative to the other probes creates a problem. There
are two possibilities for handling this problem:
1. Correct all measurements made with probe #2362 to the average of
   the probes.
2. Include the standard deviation for the difference among probes in the
   uncertainty budget.
The best strategy, as followed in the certification process, is to correct all
measurements for the average bias of probe #2362 and take the standard
deviation of the correction as a type A component of uncertainty.

Correction for bias of probe #2362 and uncertainty

Biases by probe and wafer are shown in the gauge case study. Biases for
probe #2362 are summarized in the table below for the two runs. The
correction is taken to be the negative of the average bias. The standard
deviation of the correction is the standard deviation of the average of the
ten biases.

Estimated biases for probe #2362

Wafer   Probe   Run 1     Run 2     All
138     2362    -0.0372   -0.0507
139     2362    -0.0094   -0.0657
140     2362    -0.0261   -0.0398
141     2362    -0.0252   -0.0534
142     2362    -0.0383   -0.0469
Average         -0.0272   -0.0513   -0.0393
Standard deviation (10 values)      0.0162

Configurations; database and plot of differences

Measurements on the check wafers were made with the probe wired in two
different configurations (A, B). A plot of differences between configuration
A and configuration B shows no bias between the two configurations.

Test for difference between configurations

This finding is consistent over runs 1 and 2 and is confirmed by the
t-statistics in the table below where the average differences and standard
deviations are computed from 6 days of measurements on 5 wafers. A
t-statistic < 2 indicates no significant difference. The conclusion is that
there is no bias due to wiring configuration and no contribution to
uncertainty from this source.

Differences between configurations

Status   Average    Std dev   DF   t
Pre      -0.00858   0.0242    29   1.9
Post     -0.0110    0.0354    29   1.7

Error budget showing sensitivity coefficients, standard deviations and degrees of freedom

The error budget showing sensitivity coefficients for computing the
standard uncertainty and degrees of freedom is outlined below.

Error budget for resistivity (ohm.cm)

Source                   Type   Sensitivity       Standard Deviation   DF
Repeatability            A      a1 = 0            0.0710               300
Reproducibility          A      a2 = sqrt(5/6)    0.0362               50
Run-to-run               A      a3 = 1            0.0197               5
Probe #2362              A      a4 = sqrt(1/10)   0.0162               5
Wiring Configuration A   A      a5 = 1            0                    --

Standard uncertainty includes components for repeatability, days, runs and probe

The standard uncertainty is computed from the error budget as

    u = sqrt( sum over i of (ai si)^2 )
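The configuration test summarized in the table above is straightforward to reproduce. The Python sketch below is not from the Handbook; it forms the statistic t = sqrt(n)*|dbar|/s from the quoted summary values, with n = 30 wafer-day differences (6 days x 5 wafers) per run, and recovers t of about 1.9 and 1.7.

# Sketch: t-statistics for the configuration A - configuration B differences.
import math

n = 30
for status, dbar, s in [("Pre", -0.00858, 0.0242), ("Post", -0.0110, 0.0354)]:
    t = math.sqrt(n) * abs(dbar) / s
    print(f"{status}: t = {t:.1f}")   # both < 2, so no significant difference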



Approximate degrees of freedom and expanded uncertainty

The degrees of freedom associated with u are approximated by the
Welch-Satterthwaite formula as:

    df(eff) = u^4 / ( sum over i of (ai si)^4 / df_i )

where the df_i are the degrees of freedom given in the rightmost column of
the table.

The critical value at the 0.05 significance level with 42 degrees of freedom,
from the t-table, is 2.018 so the expanded uncertainty is

    U = 2.018 u = 0.078 ohm.cm
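The arithmetic behind u, the effective degrees of freedom, and U can be reproduced directly. The Python sketch below is not from the Handbook. It uses the type A error budget as reconstructed above; note that the sensitivity values sqrt(5/6) and sqrt(1/10) for the reproducibility and probe-correction rows are inferred from the reported results rather than read from the original table, so they should be treated as assumptions.

# Sketch: standard uncertainty, effective degrees of freedom, and expanded
# uncertainty from the type A error budget.  a2 and a4 are inferred values.
import math
from scipy import stats

budget = [
    # (sensitivity a, standard deviation s, degrees of freedom df)
    (0.0,               0.0710, 300),  # repeatability
    (math.sqrt(5 / 6),  0.0362,  50),  # reproducibility (a2 assumed)
    (1.0,               0.0197,   5),  # run-to-run
    (math.sqrt(1 / 10), 0.0162,   5),  # probe #2362 correction (a4 assumed)
    (1.0,               0.0,      1),  # wiring configuration (no contribution)
]

c2 = [(a * s) ** 2 for a, s, _ in budget]
u = math.sqrt(sum(c2))
df_eff = u**4 / sum(c**2 / df for c, (_, _, df) in zip(c2, budget))
t_crit = stats.t.ppf(0.975, df_eff)
print(f"u = {u:.4f}, df_eff = {df_eff:.0f}, U = {t_crit * u:.3f} ohm.cm")  # ~0.039, ~42, ~0.078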
2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty
2.6.3.2. Analysis and interpretation

2.6.3.2.1. Difference between 2 wiring configurations

Measurements with the probe configured in two ways

The graphs below are constructed from resistivity measurements
(ohm.cm) on five wafers where the probe (#2362) was wired in two
different configurations, A and B. The probe is a 4-point probe with
many possible wiring configurations. For this experiment, only two
configurations were tested as a means of identifying large
discrepancies.

Artifacts for the study

The five wafers, namely #138, #139, #140, #141, and #142, are
coded 1, 2, 3, 4, 5, respectively, in the graphs. These wafers were
chosen at random from a batch of approximately 100 wafers that
were being certified for resistivity.

Interpretation

Differences between measurements in configurations A and B,
made on the same day, are plotted over six days for each wafer. The
two graphs represent two runs separated by approximately two
months time. The dotted line in the center is the zero line. The
pattern of data points scatters fairly randomly above and below the
zero line -- indicating no difference between configurations for
probe #2362. The conclusion applies to probe #2362 and cannot be
extended to all probes of this type.


[Graphs: differences between configurations A and B for probe #2362, plotted by day for each of the five wafers, runs 1 and 2.]



2. Measurement Process Characterization
2.6. Case studies
2.6.3. Evaluation of type A uncertainty

2.6.3.3. Run the type A uncertainty analysis using Dataplot

View of Dataplot macros for this case study

This page allows you to repeat the analysis outlined in the case study
description on the previous page using Dataplot. It is required that you
have already downloaded and installed Dataplot and configured your
browser to run Dataplot. Output from each analysis step below will be
displayed in one or more of the Dataplot windows. The four main
windows are the Output Window, the Graphics window, the Command
History window, and the data sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps / Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself.
Each step may use results from previous steps, so please be patient. Wait
until the software verifies that the current step is complete before clicking
on the next step. The links in the results column will connect you with more
detailed information about each analysis step from the case study description.

Time-dependent components from 3-level nested design
(Database of measurements with probe #2362)

Steps:
1. Pool repeatability standard deviations for run 1
2. Pool repeatability standard deviations for run 2
3. Compute level-2 standard deviations for run 1
4. Compute level-2 standard deviations for run 2
5. Pool level-2 standard deviations
6. Compute level-3 standard deviations over 5 wafers

Results and conclusions:
1. The repeatability standard deviation is 0.0658 ohm.cm for run 1 and
   0.0758 ohm.cm for run 2. This represents the basic precision of the
   measuring instrument.
2. The level-2 standard deviation pooled over 5 wafers and 2 runs is
   0.0362 ohm.cm. This is significant in the calculation of uncertainty.
3. The level-3 standard deviation pooled over 5 wafers is 0.0197 ohm.cm.
   This is small compared to the other components but is included in the
   uncertainty calculation for completeness.

Bias due to probe #2362
(Database of measurements with 5 probes)

Steps:
1. Plot biases for 5 NIST probes
2. Compute wafer bias and average bias for probe #2362
3. Correction for bias and standard deviation

Results and conclusions:
1. The plot shows that probe #2362 is biased low relative to the other
   probes and that this bias is consistent over 5 wafers.
2. The bias correction is the average bias = 0.0393 ohm.cm over the 5
   wafers. The correction is to be subtracted from all measurements made
   with probe #2362.
3. The uncertainty of the bias correction = 0.0051 ohm.cm is computed
   from the standard deviation of the biases for the 5 wafers.

Bias due to wiring configuration A
(Database of wiring configurations A and B)

Steps:
1. Plot differences between wiring configurations
2. Averages, standard deviations and t-statistics

Results and conclusions:
1. The plot of measurements in wiring configurations A and B shows no
   difference between A and B.
2. The statistical test confirms that there is no difference between the
   wiring configurations.

Uncertainty
(Elements of error budget)

Steps:
1. Standard uncertainty, df, t-value and expanded uncertainty

Results and conclusions:
1. The uncertainty is computed from the error budget. The uncertainty for
   an average of 6 measurements on one day with probe #2362 is 0.078
   with 42 degrees of freedom.
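For readers without Dataplot, the pooling arithmetic quoted in the results above can be checked in a few lines. The Python sketch below is not from the Handbook; it pools the two run-level repeatability standard deviations and computes the uncertainty of the probe #2362 bias correction from the values quoted in the results column.

# Sketch: pool the run-level repeatability standard deviations and compute the
# uncertainty of the probe #2362 bias correction.
import math

s_run1, s_run2 = 0.0658, 0.0758                 # repeatability std devs, runs 1 and 2
s1 = math.sqrt((s_run1**2 + s_run2**2) / 2)
print(f"pooled level-1 (repeatability) std dev = {s1:.4f}")          # ~0.0710

sd_biases = 0.0162                              # std dev of the 10 wafer-by-run biases
print(f"uncertainty of bias correction = {sd_biases / math.sqrt(10):.4f}")  # ~0.0051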



2.6.3.4. Dataplot macros
print s11 df11
print s12 df12
let s1 = ((s11**2 + s12**2)/2.)**(1/2)
let df1=df11+df12
. repeatability standard deviation and df for run 2
2. Measurement Process Characterization
print s1 df1
2.6. Case studies
2.6.3. Evaluation of type A uncertainty
. end of calculations

Computes
2.6.3.4. Dataplot macros level-2 reset data
standard reset plot control
deviations from reset i/o
Reads data and
daily averages dimension 500 rows
plots the reset data and pools over label size 3
repeatability reset plot control wafers -- run 1 set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4
standard reset i/o
read mpc633a.dat run wafer probe y sr
deviations for dimension 500 rows
retain run wafer probe y sr subset probe 2362
probe #2362 label size 3
sd plot y wafer subset run 1
and pools set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4
let s21 = yplot
standard read mpc633a.dat run wafer probe y sr
let wafer1 = xplot
deviations over retain run wafer probe y sr subset probe = 2362
retain s21 wafer1 subset tagplot = 1
days, wafers -- let df = sr - sr + 5.
let nwaf = size s21
run 1 y1label ohm.cm
let df21 = 5 for i = 1 1 nwaf
characters * all
. level-2 standard deviations and df for 5 wafers - run 1
lines blank all
print wafer1 s21 df21
x2label Repeatability standard deviations for probe 2362 - run 1
. end of calculations
plot sr subset run 1
let var = sr*sr
Computes
let df11 = sum df subset run 1
level-2 reset data
let s11 = sum var subset run 1
standard reset plot control
. repeatability standard deviation for run 1
deviations from reset i/o
let s11 = (5.*s11/df11)**(1/2)
daily averages dimension 500 rows
print s11 df11
and pools over label size 3
. end of calculations
wafers -- run 2 set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4
read mpc633a.dat run wafer probe y sr
Reads data and
retain run wafer probe y sr subset probe 2362
plots reset data sd plot y wafer subset run 2
repeatability reset plot control let s22 = yplot
standard reset i/o let wafer1 = xplot
deviations for dimension 500 30 retain s22 wafer1 subset tagplot = 1
probe #2362 label size 3 let nwaf = size s22
and pools set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4 let df22 = 5 for i = 1 1 nwaf
standard read mpc633a.dat run wafer probe y sr . level-2 standard deviations and df for 5 wafers - run 1
deviations over retain run wafer probe y sr subset probe 2362 print wafer1 s22 df22
days, wafers -- let df = sr - sr + 5. . end of calculations
run 2 y1label ohm.cm
characters * all
lines blank all
x2label Repeatability standard deviations for probe 2362 - run 2
plot sr subset run 2
let var = sr*sr
let df11 = sum df subset run 1
let df12 = sum df subset run 2
let s11 = sum var subset run 1
let s12 = sum var subset run 2
let s11 = (5.*s11/df11)**(1/2)
let s12 = (5.*s12/df12)**(1/2)


Pools level-2 Plot


standard reset data differences reset data
deviations over reset plot control from the reset plot control
wafers and reset i/o average wafer reset i/o
runs dimension 500 30 value for each dimension 500 30
label size 3 probe showing read mpc61a.dat wafer probe d1 d2
set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4 bias for probe let biasrun1 = mean d1 subset probe 2362
read mpc633a.dat run wafer probe y sr #2362 let biasrun2 = mean d2 subset probe 2362
retain run wafer probe y sr subset probe 2362 print biasrun1 biasrun2
sd plot y wafer subset run 1 title GAUGE STUDY FOR 5 PROBES
let s21 = yplot Y1LABEL OHM.CM
let wafer1 = xplot lines dotted dotted dotted dotted dotted solid
sd plot y wafer subset run 2 characters 1 2 3 4 5 blank
let s22 = yplot xlimits 137 143
retain s21 s22 wafer1 subset tagplot = 1 let zero = pattern 0 for I = 1 1 30
let nwaf = size wafer1 x1label DIFFERENCES AMONG PROBES VS WAFER (RUN 1)
let df21 = 5 for i = 1 1 nwaf plot d1 wafer probe and
let df22 = 5 for i = 1 1 nwaf plot zero wafer
let s2a = (s21**2)/5 + (s22**2)/5 let biasrun2 = mean d2 subset probe 2362
let s2 = sum s2a print biasrun2
let s2 = sqrt(s2/2) title GAUGE STUDY FOR 5 PROBES
let df2a = df21 + df22 Y1LABEL OHM.CM
let df2 = sum df2a lines dotted dotted dotted dotted dotted solid
. pooled level-2 standard deviation and df across wafers and runs characters 1 2 3 4 5 blank
print s2 df2 xlimits 137 143
. end of calculations let zero = pattern 0 for I = 1 1 30
x1label DIFFERENCES AMONG PROBES VS WAFER (RUN 2)
Computes plot d2 wafer probe and
level-3standard reset data plot zero wafer
deviations from reset plot control . end of calculations
run averages reset i/o
and pools over dimension 500 rows Compute bias
wafers label size 3 for probe reset data
set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4 #2362 by wafer reset plot control
read mpc633a.dat run wafer probe y sr reset i/o
retain run wafer probe y sr subset probe 2362 dimension 500 30
. label size 3
mean plot y wafer subset run 1 set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4
let m31 = yplot read mpc633a.dat run wafer probe y sr
let wafer1 = xplot set read format
mean plot y wafer subset run 2 .
let m32 = yplot cross tabulate mean y run wafer
retain m31 m32 wafer1 subset tagplot = 1 retain run wafer probe y sr subset probe 2362
let nwaf = size m31 skip 1
let s31 =(((m31-m32)**2)/2.)**(1/2) read dpst1f.dat runid wafid ybar
let df31 = 1 for i = 1 1 nwaf print runid wafid ybar
. level-3 standard deviations and df for 5 wafers let ngroups = size ybar
print wafer1 s31 df31 skip 0
let s31 = (s31**2)/5 .
let s3 = sum s31 let m3 = y - y
let s3 = sqrt(s3) feedback off
let df3=sum df31 loop for k = 1 1 ngroups
. pooled level-3 std deviation and df over 5 wafers let runa = runid(k)
print s3 df3 let wafera = wafid(k)
. end of calculations let ytemp = ybar(k)

let m3 = ytemp subset run = runa subset wafer = wafera mean plot d wafer subset run 2
end of loop let b2 = yplot
feedback on retain b1 b2 wafer1 subset tagplot = 1
. .
let d = y - m3 extend b1 b2
let bias1 = average d subset run 1 let sd = standard deviation b1
let bias2 = average d subset run 2 let sdcorr = sd/(10**(1/2))
. let correct = -(bias1+bias2)/2.
mean plot d wafer subset run 1 . correction for probe #2362, standard dev, and standard dev of corr
let b1 = yplot print correct sd sdcorr
let wafer1 = xplot . end of calculations
mean plot d wafer subset run 2
let b2 = yplot Plot
retain b1 b2 wafer1 subset tagplot = 1 differences reset data
let nwaf = size b1 between wiring reset plot control
. biases for run 1 and run 2 by wafers configurations reset i/o
print wafer1 b1 b2 A and B dimension 500 30
. average biases over wafers for run 1 and run 2 label size 3
print bias1 bias2 read mpc633k.dat wafer probe a1 s1 b1 s2 a2 s3 b2 s4
. end of calculations let diff1 = a1 - b1
let diff2 = a2 - b2
let t = sequence 1 1 30
Compute lines blank all
correction for reset data characters 1 2 3 4 5
bias for reset plot control y1label ohm.cm
measurements reset i/o x1label Config A - Config B -- Run 1
with probe dimension 500 30 x2label over 6 days and 5 wafers
#2362 and the label size 3 x3label legend for wafers 138, 139, 140, 141, 142: 1, 2, 3, 4, 5
standard set read format f1.0,f6.0,f8.0,32x,f10.4,f10.4 plot diff1 t wafer
deviation of the read mpc633a.dat run wafer probe y sr x1label Config A - Config B -- Run 2
correction set read format plot diff2 t wafer
. . end of calculations
cross tabulate mean y run wafer
retain run wafer probe y sr subset probe 2362 Compute
skip 1 average reset data
read dpst1f.dat runid wafid ybar differences reset plot control
let ngroups = size ybar between reset i/o
skip 0 configuration separator character @
. A and B; dimension 500 rows
let m3 = y - y standard label size 3
feedback off deviations and read mpc633k.dat wafer probe a1 s1 b1 s2 a2 s3 b2 s4
loop for k = 1 1 ngroups t-statistics for let diff1 = a1 - b1
let runa = runid(k) testing let diff2 = a2 - b2
let wafera = wafid(k) significance let d1 = average diff1
let ytemp = ybar(k) let d2 = average diff2
let m3 = ytemp subset run = runa subset wafer = wafera let s1 = standard deviation diff1
end of loop let s2 = standard deviation diff2
feedback on let t1 = (30.)**(1/2)*(d1/s1)
. let t2 = (30.)**(1/2)*(d2/s2)
let d = y - m3 . Average config A-config B; std dev difference; t-statistic for run 1
let bias1 = average d subset run 1 print d1 s1 t1
let bias2 = average d subset run 2 . Average config A-config B; std dev difference; t-statistic for run 2
. print d2 s2 t2
mean plot d wafer subset run 1 separator character ;
let b1 = yplot . end of calculations
let wafer1 = xplot



2.6.3.4. Dataplot macros

Compute standard uncertainty, effective degrees of freedom, t value and expanded uncertainty

reset data
reset plot control
reset i/o
dimension 500 rows
label size 3
read mpc633m.dat sz a df
let c = a*sz*sz
let d = c*c
let e = d/(df)
let sume = sum e
let u = sum c
let u = u**(1/2)
let effdf=(u**4)/sume
let tvalue=tppf(.975,effdf)
let expu=tvalue*u
.
. uncertainty, effective degrees of freedom, tvalue and
. expanded uncertainty
print u effdf tvalue expu
. end of calculations


2. Measurement Process Characterization
2.6. Case studies

2.6.4. Evaluation of type B uncertainty and propagation of error

Focus of this case study

The purpose of this case study is to demonstrate uncertainty analysis using
statistical techniques coupled with type B analyses and propagation of
error. It is a continuation of the case study of type A uncertainties.

Background - description of measurements and constraints

The measurements in question are volume resistivities (ohm.cm) of silicon
wafers which have the following definition:

    rho = X . Ka . FT . t . Ft/s

with explanations of the quantities and their nominal values shown below:

    rho  = resistivity = 0.00128 ohm.cm
    X    = voltage/current (ohm)
    t    = thickness_wafer (cm) = 0.628 cm
    Ka   = factor_electrical = 4.50 ohm.cm
    FT   = correction_temp
    Ft/s = factor_thickness/separation = 1.0

Type A evaluations

The resistivity measurements, discussed in the case study of type A
evaluations, were replicated to cover the following sources of uncertainty
in the measurement process, and the associated uncertainties are reported in
units of resistivity (ohm.cm).
● Repeatability of measurements at the center of the wafer
● Day-to-day effects
● Run-to-run effects
● Bias due to probe #2362
● Bias due to wiring configuration



Need for propagation of error

Not all factors could be replicated during the gauge experiment. Wafer
thickness and measurements required for the scale corrections were
measured off-line. Thus, the type B evaluation of uncertainty is computed
using propagation of error. The propagation of error formula in units of
resistivity is as follows:

    s_rho^2 = (d rho/d X)^2 s_X^2 + (d rho/d Ka)^2 s_Ka^2 + (d rho/d t)^2 s_t^2
              + (d rho/d FT)^2 s_FT^2 + (d rho/d Ft/s)^2 s_Ft/s^2

Standard deviations for type B evaluations

Standard deviations for the type B components are summarized here. For a
complete explanation, see the publication (Ehrstein and Croarkin).

Electrical measurements

There are two basic sources of uncertainty for the electrical measurements.
The first is the least-count of the digital volt meter in the measurement of X
with a maximum bound of

    a = 0.0000534 ohm

which is assumed to be the half-width of a uniform distribution. The
second is the uncertainty of the electrical scale factor. This has two sources
of uncertainty:
1. error in the solution of the transcendental equation for determining
   the factor
2. errors in measured voltages
The maximum bounds to these errors are assumed to be half-widths of

    a = 0.0001 ohm.cm and a = 0.00038 ohm.cm

respectively, from uniform distributions. The corresponding standard
deviations are shown below.

    s_X = 0.0000534/sqrt(3) = 0.0000308 ohm
    s_scale = sqrt( (0.0001^2 + 0.00038^2)/3 ) = 0.000227 ohm.cm

Thickness

The standard deviation for thickness, t, accounts for two sources of
uncertainty:
1. calibration of the thickness measuring tool with precision gauge
   blocks
2. variation in thicknesses of the silicon wafers
The maximum bounds to these errors are assumed to be half-widths of

    a = 0.000015 cm and a = 0.000001 cm

respectively, from uniform distributions. Thus, the standard deviation for
thickness is

    s_t = sqrt( (0.000015^2 + 0.000001^2)/3 ) = 0.00000868 cm

Temperature correction

The standard deviation for the temperature correction is calculated from its
defining equation as shown below. Thus, the standard deviation for the
correction is the standard deviation associated with the measurement of
temperature multiplied by the temperature coefficient, C(t) = 0.0083.
The maximum bound to the error of the temperature measurement is
assumed to be the half-width

    a = 0.13 °C

of a triangular distribution. Thus the standard deviation of the correction
for temperature is

    s_FT = 0.0083 x 0.13/sqrt(6) = 0.000441

Thickness scale factor

The standard deviation for the thickness scale factor is negligible.
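The conversions above all follow the same pattern: divide the half-width by sqrt(3) for a uniform distribution and by sqrt(6) for a triangular one, adding contributions in quadrature when a source has two bounds. A small Python check, not part of the Handbook, reproduces the four type B standard deviations quoted:

# Sketch: type B standard deviations from assumed half-widths.
import math

def uniform_sd(*half_widths):
    # half-widths of uniform distributions -> combined standard deviation
    return math.sqrt(sum(a**2 for a in half_widths) / 3)

s_X     = uniform_sd(0.0000534)            # volt meter least count    ~0.0000308 ohm
s_scale = uniform_sd(0.0001, 0.00038)      # electrical scale factor   ~0.000227 ohm.cm
s_t     = uniform_sd(0.000015, 0.000001)   # wafer thickness           ~0.00000868 cm
s_FT    = 0.0083 * 0.13 / math.sqrt(6)     # temperature correction    ~0.000441

print(s_X, s_scale, s_t, s_FT)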



Associated sensitivity coefficients

Sensitivity coefficients for translating the standard deviations for the type B
components into units of resistivity (ohm.cm) from the propagation of error
equation are listed below and in the error budget. The sensitivity coefficient
for a source is the multiplicative factor associated with the standard
deviation in the formula above; i.e., the partial derivative with respect to
that variable from the propagation of error equation.

    a6  = (rho/X)    = 100/0.111 = 900.901
    a7  = (rho/Ka)   = 100/4.50  = 22.222
    a8  = (rho/t)    = 100/0.628 = 159.24
    a9  = (rho/FT)   = 100
    a10 = (rho/Ft/s) = 100

Sensitivity coefficients and degrees of freedom

Sensitivity coefficients for the type A components are shown in the case
study of type A uncertainty analysis and repeated below. Degrees of
freedom for type B uncertainties based on assumed distributions, according
to the convention, are assumed to be infinite.

Error budget showing sensitivity coefficients, standard deviations and degrees of freedom

The error budget showing sensitivity coefficients for computing the relative
standard uncertainty of volume resistivity (ohm.cm) with degrees of
freedom is outlined below.

Error budget for volume resistivity (ohm.cm)

Source                   Type   Sensitivity       Standard Deviation   DF
Repeatability            A      a1 = 0            0.0710               300
Reproducibility          A      a2 = sqrt(5/6)    0.0362               50
Run-to-run               A      a3 = 1            0.0197               5
Probe #2362              A      a4 = sqrt(1/10)   0.0162               5
Wiring Configuration A   A      a5 = 1            0                    --
Resistance ratio         B      a6 = 900.901      0.0000308            --
Electrical scale         B      a7 = 22.222       0.000227             --
Thickness                B      a8 = 159.20       0.00000868           --
Temperature correction   B      a9 = 100          0.000441             --
Thickness scale          B      a10 = 100         0                    --

Standard uncertainty

The standard uncertainty is computed as:

    u = sqrt( sum over i of (ai si)^2 )

Approximate degrees of freedom and expanded uncertainty

The degrees of freedom associated with u are approximated by the
Welch-Satterthwaite formula as:

    df(eff) = u^4 / ( sum over i of (ai si)^4 / df_i )

This calculation is not affected by components with infinite degrees of
freedom, and therefore, the degrees of freedom for the standard uncertainty
is the same as the degrees of freedom for the type A uncertainty. The
critical value at the 0.05 significance level with 42 degrees of freedom,
from the t-table, is 2.018 so the expanded uncertainty is

    U = 2.018 u = 0.13 ohm.cm
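As with the type A budget, the combined result can be verified numerically. The Python sketch below is not from the Handbook; it combines the type A and type B rows of the budget above (again using the inferred sqrt(5/6) and sqrt(1/10) sensitivities for a2 and a4, which are assumptions) and, following the convention stated in the text, takes the degrees of freedom from the type A components alone.

# Sketch: combined standard uncertainty from the full error budget above.
import math

type_a = [(0.0, 0.0710, 300), (math.sqrt(5/6), 0.0362, 50),
          (1.0, 0.0197, 5), (math.sqrt(1/10), 0.0162, 5), (1.0, 0.0, 1)]
type_b = [(900.901, 0.0000308), (22.222, 0.000227),
          (159.20, 0.00000868), (100.0, 0.000441), (100.0, 0.0)]

c2_a = [(a * s) ** 2 for a, s, _ in type_a]
c2_b = [(a * s) ** 2 for a, s in type_b]

u_a = math.sqrt(sum(c2_a))                              # type A part
u = math.sqrt(sum(c2_a) + sum(c2_b))                    # combined standard uncertainty
df = u_a**4 / sum(c**2 / d for c, (_, _, d) in zip(c2_a, type_a))   # ~42
print(f"u = {u:.4f} ohm.cm, df = {df:.0f}, U = {2.018 * u:.2f} ohm.cm")   # ~0.065, 42, 0.13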



2.7. References

Theory of Churchill Eisenhart (1962). Realistic Evaluation of the Precision


uncertainty and Accuracy of Instrument Calibration SystemsJ Research
analysis National Bureau of Standards-C. Engineering and Instrumentation,
2. Measurement Process Characterization Vol. 67C, No.2, p. 161-187.

Confidence, Gerald J. Hahn and William Q. Meeker (1991). Statistical Intervals:


2.7. References prediction, and A Guide for Practitioners, John Wiley & Sons, Inc., New York.
tolerance
Degrees of K. A. Brownlee (1960). Statistical Theory and Methodology in intervals
freedom Science and Engineering, John Wiley & Sons, Inc., New York, p.
236. Original J. A. Hayford (1893). On the Least Square Adjustment of
calibration Weighings, U.S. Coast and Geodetic Survey Appendix 10, Report for
Calibration J. M. Cameron, M. C. Croarkin and R. C. Raybold (1977). Designs designs for 1892.
designs for the Calibration of Standards of Mass, NBS Technical Note 952, weighings
U.S. Dept. Commerce, 58 pages.
Uncertainties Thomas E. Hockersmith and Harry H. Ku (1993). Uncertainties
Calibration J. M. Cameron and G. E. Hailes (1974). Designs for the Calibration for values from associated with proving ring calibrations, NBS Special Publication
designs for of Small Groups of Standards in the Presence of Drift, Technical a calibration 300: Precision Measurement and Calibration, Statistical Concepts and
eliminating Note 844, U.S. Dept. Commerce, 31 pages. curve Procedures, Vol. 1, pp. 257-263, H. H. Ku, editor.
drift
EWMA control J. Stuart Hunter (1986). The Exponentially Weighted Moving
charts Average, J Quality Technology, Vol. 18, No. 4, pp. 203-207.
Measurement Carroll Croarkin and Ruth Varner (1982). Measurement Assurance
assurance for for Dimensional Measurements on Integrated-circuit Photomasks,
measurements NBS Technical Note 1164, U.S. Dept. Commerce, 44 pages. Fundamentals K. B. Jaeger and R. S. Davis (1984). A Primer for Mass Metrology,
on ICs of mass NBS Special Publication 700-1, 85 pages.
metrology
Calibration Ted Doiron (1993). Drift Eliminating Designs for
designs for Non-Simultaneous Comparison Calibrations, J Research National Fundamentals Harry Ku (1966). Notes on the Use of Propagation of Error
gauge blocks Institute of Standards and Technology, 98, pp. 217-224. of propagation Formulas, J Research of National Bureau of Standards-C.
of error Engineering and Instrumentation, Vol. 70C, No.4, pp. 263-273.
Type A & B J. R. Ehrstein and M. C. Croarkin (1998). Standard Reference
uncertainty Materials: The Certification of 100 mm Diameter Silicon Resistivity Handbook of Mary Gibbons Natrella (1963). Experimental Statistics, NBS
analyses for SRMs 2541 through 2547 Using Dual-Configuration Four-Point statistical Handbook 91, US Deptartment of Commerce.
resistivities Probe Measurements, NIST Special Publication 260-131, Revised, methods
84 pages.
Omnitab Sally T. Peavy, Shirley G. Bremer, Ruth N. Varner, David Hogben
Calibration W. G. Eicke and J. M. Cameron (1967). Designs for Surveillance of (1986). OMNITAB 80: An Interpretive System for Statistical and
designs for the Volt Maintained By a Group of Saturated Standard Cells, NBS Numerical Data Analysis, NBS Special Publication 701, US
electrical Technical Note 430, U.S. Dept. Commerce 19 pages. Deptartment of Commerce.
standards


Uncertainties Steve D. Phillips and Keith R. Eberhardt (1997). Guidelines for Guide to Guide to the Expression of Uncertainty of Measurement (1993).
for Expressing the Uncertainty of Measurement Results Containing uncertainty ISBN 91-67-10188-9, 1st ed. ISO, Case postale 56, CH-1211, Genève
uncorrected Uncorrected Bias, NIST Journal of Research, Vol. 102, No. 5. analysis 20, Switzerland, 101 pages.
bias
ISO 5725 for ISO 5725: 1997. Accuracy (trueness and precision) of measurement
Calibration of Charles P. Reeve (1979). Calibration designs for roundness interlaboratory results, Part 2: Basic method for repeatability and reproducibility of
roundness standards, NBSIR 79-1758, 21 pages. testing a standard measurement method, ISO, Case postale 56, CH-1211,
artifacts Genève 20, Switzerland.

Calibration Charles P. Reeve (1967). The Calibration of Angle Blocks by ISO 11095 on ISO 11095: 1997. Linear Calibration using Reference Materials,
designs for Comparison, NBSIR 80-19767, 24 pages. linear ISO, Case postale 56, CH-1211, Genève 20, Switzerland.
angle blocks calibration

SI units Barry N. Taylor (1991). Interpretation of the SI for the United MSA gauge Measurement Systems Analysis Reference Manual, 2nd ed., (1995).
States and Metric Conversion Policy for Federal Agencies, NIST studies manual Chrysler Corp., Ford Motor Corp., General Motors Corp., 120 pages.
Special Publication 841, U.S. Deptartment of Commerce.
NCSL RP on Determining and Reporting Measurement Uncertainties, National
Uncertainties Raymond Turgel and Dominic Vecchia (1987). Precision Calibration uncertainty Conference of Standards Laboratories RP-12, (1994), Suite 305B,
for calibrated of Phase Meters, IEEE Transactions on Instrumentation and analysis 1800 30th St., Boulder, CO 80301.
values Measurement, Vol. IM-36, No. 4., pp. 918-922.
ISO International Vocabulary of Basic and General Terms in
Example of James R. Whetstone et al. (1989). Measurements of Coefficients of Vocabulary for Metrology, 2nd ed., (1993). ISO, Case postale 56, CH-1211, Genève
propagation of Discharge for Concentric Flange-Tapped Square-Edged Orifice metrology 20, Switzerland, 59 pages.
error for flow Meters in Water Over the Reynolds Number Range 600 to
measurements 2,700,000, NIST Technical Note 1264. pp. 97. Exact variance Leo Goodman (1960). "On the Exact Variance of Products" in
for length and Journal of the American Statistical Association, December, 1960, pp.
Mathematica Stephen Wolfram (1993). Mathematica, A System of Doing width 708-713.
software Mathematics by Computer, 2nd edition, Addison-Wesley Publishing
Co., New York.

Restrained Marvin Zelen (1962). "Linear Estimation and Related Topics" in


least squares Survey of Numerical Analysis edited by John Todd, McGraw-Hill
Book Co. Inc., New York, pp. 558-577.

ASTM F84 for ASTM Method F84-93, Standard Test Method for Measuring
resistivity Resistivity of Silicon Wafers With an In-line Four-Point Probe.
Annual Book of ASTM Standards, 10.05, West Conshohocken, PA
19428.

ASTM E691 ASTM Method E691-92, Standard Practice for Conducting an


for Interlaboratory Study to Determine the Precision of a Test Method.
interlaboratory Annual Book of ASTM Standards, 10.05, West Conshohocken, PA
testing 19428.


3. Production Process Characterization


The goal of this chapter is to learn how to plan and conduct a Production Process
Characterization Study (PPC) on manufacturing processes. We will learn how to model
manufacturing processes and use these models to design a data collection scheme and to
guide data analysis activities. We will look in detail at how to analyze the data collected
in characterization studies and how to interpret and report the results. The accompanying
Case Studies provide detailed examples of several process characterization studies.

1. Introduction 2. Assumptions
1. Definition 1. General Assumptions
2. Uses 2. Specific PPC Models
3. Terminology/Concepts
4. PPC Steps

3. Data Collection 4. Analysis


1. Set Goals 1. First Steps
2. Model the Process 2. Exploring Relationships
3. Define Sampling Plan 3. Model Building
4. Variance Components
5. Process Stability
6. Process Capability
7. Checking Assumptions

5. Case Studies
1. Furnace Case Study
2. Machine Case Study

Detailed Chapter Table of Contents

References


3. Data Collection for PPC [3.3.]


1. Define Goals [3.3.1.]
2. Process Modeling [3.3.2.]
3. Define Sampling Plan [3.3.3.]

3. Production Process Characterization - 1. Identifying Parameters, Ranges and Resolution [3.3.3.1.]


2. Choosing a Sampling Scheme [3.3.3.2.]
Detailed Table of Contents [3.] 3. Selecting Sample Sizes [3.3.3.3.]
4. Data Storage and Retrieval [3.3.3.4.]
5. Assign Roles and Responsibilities [3.3.3.5.]
1. Introduction to Production Process Characterization [3.1.]
1. What is PPC? [3.1.1.] 4. Data Analysis for PPC [3.4.]
2. What are PPC Studies Used For? [3.1.2.] 1. First Steps [3.4.1.]
3. Terminology/Concepts [3.1.3.] 2. Exploring Relationships [3.4.2.]
1. Distribution (Location, Spread and Shape) [3.1.3.1.] 1. Response Correlations [3.4.2.1.]
2. Process Variability [3.1.3.2.] 2. Exploring Main Effects [3.4.2.2.]
1. Controlled/Uncontrolled Variation [3.1.3.2.1.] 3. Exploring First Order Interactions [3.4.2.3.]
3. Propagating Error [3.1.3.3.] 3. Building Models [3.4.3.]
4. Populations and Sampling [3.1.3.4.] 1. Fitting Polynomial Models [3.4.3.1.]
5. Process Models [3.1.3.5.] 2. Fitting Physical Models [3.4.3.2.]
6. Experiments and Experimental Design [3.1.3.6.] 4. Analyzing Variance Structure [3.4.4.]
4. PPC Steps [3.1.4.] 5. Assessing Process Stability [3.4.5.]
6. Assessing Process Capability [3.4.6.]
2. Assumptions / Prerequisites [3.2.]
7. Checking Assumptions [3.4.7.]
1. General Assumptions [3.2.1.]
2. Continuous Linear Model [3.2.2.] 5. Case Studies [3.5.]
3. Analysis of Variance Models (ANOVA) [3.2.3.] 1. Furnace Case Study [3.5.1.]
1. One-Way ANOVA [3.2.3.1.] 1. Background and Data [3.5.1.1.]
1. One-Way Value-Splitting [3.2.3.1.1.] 2. Initial Analysis of Response Variable [3.5.1.2.]
2. Two-Way Crossed ANOVA [3.2.3.2.] 3. Identify Sources of Variation [3.5.1.3.]
1. Two-way Crossed Value-Splitting Example [3.2.3.2.1.] 4. Analysis of Variance [3.5.1.4.]
3. Two-Way Nested ANOVA [3.2.3.3.] 5. Final Conclusions [3.5.1.5.]
1. Two-Way Nested Value-Splitting Example [3.2.3.3.1.] 6. Work This Example Yourself [3.5.1.6.]
4. Discrete Models [3.2.4.] 2. Machine Screw Case Study [3.5.2.]



1. Background and Data [3.5.2.1.]
2. Box Plots by Factors [3.5.2.2.]
3. Analysis of Variance [3.5.2.3.]
4. Throughput [3.5.2.4.]
5. Final Conclusions [3.5.2.5.]
6. Work This Example Yourself [3.5.2.6.]

6. References [3.6.]


3. Production Process Characterization

3.1. Introduction to Production Process Characterization

Overview Section

The goal of this section is to provide an introduction to PPC. We will
define PPC and the terminology used and discuss some of the possible
uses of a PPC study. Finally, we will look at the steps involved in
designing and executing a PPC study.

Contents: Section 1

1. What is PPC?
2. What are PPC studies used for?
3. What terminology is used in PPC?
1. Location, Spread and Shape
2. Process Variability
3. Propagating Error
4. Populations and Sampling
5. Process Models
6. Experiments and Experimental Design
4. What are the steps of a PPC?
1. Plan PPC
2. Collect Data
3. Analyze and Interpret Data
4. Report Conclusions


3. Production Process Characterization


3.1. Introduction to Production Process Characterization

3.1.1. What is PPC?


In PPC, we Process characterization is an activity in which we:
build ● identify the key inputs and outputs of a process
data-based
● collect data on their behavior over the entire operating range
models
● estimate the steady-state behavior at optimal operating conditions

● and build models describing the parameter relationships across


the operating range
The result of this activity is a set of mathematical process models that
we can use to monitor and improve the process.

This is a This activity is typically a three-step process.


three-step The Screening Step
process
In this phase we identify all possible significant process inputs
and outputs and conduct a series of screening experiments in
order to reduce that list to the key inputs and outputs. These
experiments will also allow us to develop initial models of the
relationships between those inputs and outputs.
The Mapping Step
In this step we map the behavior of the key outputs over their
expected operating ranges. We do this through a series of more
detailed experiments called Response Surface experiments.
The Passive Step
In this step we allow the process to run at nominal conditions and
estimate the process stability and capability.

Not all of The first two steps are only needed for new processes or when the
the steps process has undergone some significant engineering change. There are,
need to be however, many times throughout the life of a process when the third
performed step is needed. Examples might be: initial process qualification, control
chart development, after minor process adjustments, after scheduled
equipment maintenance, etc.



3. Production Process Characterization
3.1. Introduction to Production Process Characterization

3.1.2. What are PPC Studies Used For?

PPC is the core of any CI program

Process characterization is an integral part of any continuous
improvement program. There are many steps in that program for
which process characterization is required.

When process characterization is required

These might include:
● when we are bringing a new process or tool into use.
● when we are bringing a tool or process back up after
  scheduled/unscheduled maintenance.
● when we want to compare tools or processes.
● when we want to check the health of our process during the
  monitoring phase.
● when we are troubleshooting a bad process.

Process characterization techniques are applicable in other areas

The techniques described in this chapter are equally applicable to the
other chapters covered in this Handbook. These include:
● calibration
● process monitoring
● process improvement
● process/product comparison
● reliability


3. Production Process Characterization
3.1. Introduction to Production Process Characterization

3.1.3. Terminology/Concepts

There are just a few fundamental concepts needed for PPC.
This section will review these ideas briefly and provide
links to other sections in the Handbook where they are
covered in more detail.

Distribution (location, spread, shape)

For basic data analysis, we will need to understand how to
estimate location, spread and shape from the data. These
three measures comprise what is known as the distribution
of the data. We will look at both graphical and numerical
techniques.

Process variability

We need to thoroughly understand the concept of process
variability. This includes how variation explains the
possible range of expected data values, the various
classifications of variability, and the role that variability
plays in process stability and capability.

Error propagation

We also need to understand how variation propagates
through our manufacturing processes and how to
decompose the total observed variation into components
attributable to the contributing sources.

Populations and sampling

It is important to have an understanding of the various
issues related to sampling. We will define a population and
discuss how to acquire representative random samples from
the population of interest. We will also discuss a useful
formula for estimating the number of observations required
to answer specific questions.

Modeling

For modeling, we will need to know how to identify
important factors and responses. We will also need to know
how to graphically and quantitatively build models of the
relationships between the factors and responses.



Experiments

Finally, we will need to know about the basics of designed
experiments including screening designs and response
surface designs so that we can quantify these relationships.
This topic will receive only a cursory treatment in this
chapter. It is covered in detail in the process improvement
chapter. However, examples of its use are in the case
studies.


3. Production Process Characterization
3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.1. Distribution (Location, Spread and Shape)
Distributions A fundamental concept in representing any of the outputs from a
are production process is that of a distribution. Distributions arise because
characterized any manufacturing process output will not yield the same value every
by location, time it is measured. There will be a natural scattering of the measured
spread and values about some central tendency value. This scattering about a
shape central value is known as a distribution. A distribution is characterized
by three values:

Location
The location is the expected value of the output being measured.
For a stable process, this is the value around which the process
has stabilized.
Spread
The spread is the expected amount of variation associated with
the output. This tells us the range of possible values that we
would expect to see.
Shape
The shape shows how the variation is distributed about the
location. This tells us if our variation is symmetric about the
mean or if it is skewed or possibly multimodal.

A primary One of the primary goals of a PPC study is to characterize our process
goal of PPC outputs in terms of these three measurements. If we can demonstrate
is to estimate that our process is stabilized about a constant location, with a constant
the variance and a known stable shape, then we have a process that is both
distributions predictable and controllable. This is required before we can set up
of the control charts or conduct experiments.
process
outputs



The table below shows the most common numerical and graphical
measures of location, spread and shape.

Parameter   Numerical              Graphical
Location    mean                   scatter plot
            median                 boxplot
                                   histogram
Spread      variance               boxplot
            range                  histogram
            inter-quartile range
Shape       skewness               boxplot
            kurtosis               histogram
                                   probability plot


3. Production Process Characterization
3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.2. Process Variability

Variability is present everywhere

All manufacturing and measurement processes exhibit variation. For example, when we take sample
data on the output of a process, such as critical dimensions, oxide thickness, or resistivity, we
observe that all the values are NOT the same. This results in a collection of observed values
distributed about some location value. This is what we call spread or variability. We represent
variability numerically with the variance calculation and graphically with a histogram.

How does the standard deviation describe the spread of the data?

The standard deviation (square root of the variance) gives insight into the spread of the data through
the use of what is known as the Empirical Rule. This rule (shown in the graph below) is:

Approximately 60-78% of the data are within a distance of one standard deviation from the average
(xbar - s, xbar + s).
Approximately 90-98% of the data are within a distance of two standard deviations from the average
(xbar - 2s, xbar + 2s).
More than 99% of the data are within a distance of three standard deviations from the average
(xbar - 3s, xbar + 3s).
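A quick numerical check of these ideas is easy to run. The Python sketch below is not from the Handbook; it computes the three kinds of summary measures for a sample (simulated here purely for illustration) and reports the fraction of observations within one, two, and three standard deviations of the average, which can be compared against the rule above.

# Sketch: location, spread, and shape measures plus an Empirical Rule check.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
y = rng.normal(loc=990.0, scale=20.0, size=300)   # e.g. film thickness readings

print("location:", np.mean(y), np.median(y))
print("spread:  ", np.var(y, ddof=1), np.ptp(y),
      np.subtract(*np.percentile(y, [75, 25])))   # variance, range, IQR
print("shape:   ", stats.skew(y), stats.kurtosis(y))

xbar, s = np.mean(y), np.std(y, ddof=1)
for k in (1, 2, 3):
    frac = np.mean(np.abs(y - xbar) <= k * s)
    print(f"within {k} standard deviation(s): {frac:.1%}")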


Variability accumulates from many sources

This observed variability is an accumulation of many different sources of variation that have
occurred throughout the manufacturing process. One of the more important activities of process
characterization is to identify and quantify these various sources of variation so that they may be
minimized.

There are also different types

There are not only different sources of variation, but there are also different types of variation. Two
important classifications of variation for the purposes of PPC are controlled variation and
uncontrolled variation.

CONTROLLED VARIATION
Variation that is characterized by a stable and consistent pattern of variation over time. This
type of variation will be random in nature and will be exhibited by a uniform fluctuation
about a constant level.
UNCONTROLLED VARIATION
Variation that is characterized by a pattern of variation that changes over time and hence is
unpredictable. This type of variation will typically contain some structure.

In the course of process characterization we should endeavor to eliminate all sources of uncontrolled
variation.

Stable This concept of controlled/uncontrolled variation is important in determining if a process is stable.


processes A process is deemed stable if it runs in a consistent and predictable manner. This means that the
only exhibit average process value is constant and the variability is controlled. If the variation is uncontrolled,
controlled then either the process average is changing or the process variation is changing or both. The first
variation process in the example above is stable; the second is not.


3. Production Process Characterization
3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts
3.1.3.2. Process Variability

3.1.3.2.1. Controlled/Uncontrolled Variation

Two trend plots

The two figures below are two trend plots from two different oxide growth processes.
Thirty wafers were sampled from each process: one per day over 30 days. Thickness
at the center was measured on each wafer. The x-axis of each graph is the wafer
number and the y-axis is the film thickness in angstroms.

Examples of "in control" and "out of control" processes

The first process is an example of a process that is "in control" with random
fluctuation about a process location of approximately 990. The second process is an
example of a process that is "out of control" with a process location trending upward
after observation 20.

[Figure: trend plot of a process that exhibits controlled variation. Note the random
fluctuation about a constant mean.]

[Figure: trend plot of a process that exhibits uncontrolled variation. Note the
structure in the variation in the form of a linear trend.]
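The two situations described above are easy to mimic with simulated data, which can be handy when trying out trend-detection or control-chart code. The Python sketch below is illustrative only (not from the Handbook); it generates one series fluctuating randomly about 990 and one whose location trends upward after observation 20.

# Sketch: simulate an "in control" and an "out of control" oxide-thickness series.
import numpy as np

rng = np.random.default_rng(1)
n = 30
in_control = 990 + rng.normal(0, 5, n)

out_of_control = 990 + rng.normal(0, 5, n)
drift = np.where(np.arange(n) >= 20, np.arange(n) - 20, 0)
out_of_control = out_of_control + 3.0 * drift     # location trends upward after obs 20

for name, y in [("in control", in_control), ("out of control", out_of_control)]:
    print(f"{name}: mean of first half = {y[:15].mean():.1f}, "
          f"mean of second half = {y[15:].mean():.1f}")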


3. Production Process Characterization


3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.3. Propagating Error


The When we estimate the variance at a particular process step, this variance
variation we is typically not just a result of the current step, but rather is an
see can accumulation of variation from previous steps and from measurement
come from error. Therefore, an important question that we need to answer in PPC is
many how the variation from the different sources accumulates. This will
sources allow us to partition the total variation and assign the parts to the
various sources. Then we can attack the sources that contribute the
most.

How do I partition the error?

Usually we can model the contribution of the various sources of error to
the total error through a simple linear relationship. If we have a simple
linear relationship between two variables, say,

    y = a x1 + b x2,

then the variance associated with y is given by

    Var(y) = a^2 Var(x1) + b^2 Var(x2) + 2ab Cov(x1, x2).

If the variables are not correlated, then there is no covariance and the
last term in the above equation drops off. A good example of this is the
case in which we have both process error and measurement error. Since
these are usually independent of each other, the total observed variance
is just the sum of the variances for process and measurement.
Remember: never add standard deviations; always add variances.
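A small numerical illustration of the additivity of variances (and of why standard deviations must not be added) follows. It is a sketch, not part of the Handbook, and uses made-up process and measurement standard deviations.

# Sketch: independent process and measurement error add in variance,
# not in standard deviation.  Numbers are illustrative only.
import math

sd_process, sd_measurement = 4.0, 3.0
var_total = sd_process**2 + sd_measurement**2     # variances add (no covariance)
sd_total = math.sqrt(var_total)

print(f"total standard deviation  = {sd_total:.2f}")                     # 5.00
print(f"(wrong) sum of std devs   = {sd_process + sd_measurement:.2f}")  # 7.00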

How do I Of course, we rarely have the individual components of variation and


calculate the wish to know the total variation. Usually, we have an estimate of the
individual overall variance and wish to break that variance down into its individual
components? components. This is known as components of variance estimation and is
dealt with in detail in the analysis of variance page later in this chapter.


3. Production Process Characterization


3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.4. Populations and Sampling


We take samples from a target population and make inferences
In survey sampling, if you want to know what everyone thinks about a particular topic, you can just ask everyone and record their answers. Depending on how you define the term, everyone (all the adults in a town, all the males in the USA, etc.), it may be impossible or impractical to survey everyone. The other option is to survey a small group (Sample) of the people whose opinions you are interested in (Target Population), record their opinions and use that information to make inferences about what everyone thinks. Opinion pollsters have developed a whole body of tools for doing just that and many of those tools apply to manufacturing as well. We can use these sampling techniques to take a few measurements from a process and make statements about the behavior of that process.

Facts about a sample are not necessarily facts about a population
If it weren't for process variation we could just take one sample and everything would be known about the target population. Unfortunately this is never the case. We cannot take facts about the sample to be facts about the population. Our job is to reach appropriate conclusions about the population despite this variation. The more observations we take from a population, the more our sample data resembles the population. When we have reached the point at which facts about the sample are reasonable approximations of facts about the population, then we say the sample is adequate.

Four attributes of samples
Adequacy of a sample depends on the following four attributes:
● Representativeness of the sample (is it random?)
● Size of the sample
● Variability in the population
● Desired precision of the estimates
We will learn about choosing representative samples of adequate size in the section on defining sampling plans.


3. Production Process Characterization


3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.5. Process Models


Black box model and fishbone diagram
As we will see in Section 3 of this chapter, one of the first steps in PPC is to model the process that is under investigation. Two very useful tools for doing this are the black-box model and the fishbone diagram.

We use the black-box model to describe our processes
We can use the simple black-box model, shown below, to describe most of the tools and processes we will encounter in PPC. The process will be stimulated by inputs. These inputs can either be controlled (such as recipe or machine settings) or uncontrolled (such as humidity, operators, power fluctuations, etc.). These inputs interact with our process and produce outputs. These outputs are usually some characteristic of our process that we can measure. The measurable inputs and outputs can be sampled in order to observe and understand how they behave and relate to each other.

Diagram of the black box model
These inputs and outputs are also known as Factors and Responses, respectively.
Factors
    Observed inputs used to explain response behavior (also called explanatory variables). Factors may be fixed-level controlled inputs or sampled uncontrolled inputs.
Responses
    Sampled process outputs. Responses may also be functions of sampled outputs such as average thickness or uniformity.

Factors and Responses are further classified by variable type
We further categorize factors and responses according to their Variable Type, which indicates the amount of information they contain. As the name implies, this classification is useful for data modeling activities and is critical for selecting the proper analysis technique. The table below summarizes this categorization. The types are listed in order of the amount of information they contain, with Measurement containing the most information and Nominal containing the least.


Table describing the different variable types

Type          Description                                     Example
Measurement   discrete/continuous, order is important,        particle count, oxide thickness,
              infinite range                                  pressure, temperature
Ordinal       discrete, order is important, finite range      run #, wafer #, site, bin
Nominal       discrete, no order, very few possible values    good/bad, bin, high/medium/low,
                                                              shift, operator

Fishbone diagrams help to decompose complexity
We can use the fishbone diagram to further refine the modeling process. Fishbone diagrams are very useful for decomposing the complexity of our manufacturing processes. Typically, we choose a process characteristic (either Factors or Responses) and list out the general categories that may influence the characteristic (such as material, machine, method, environment, etc.), and then provide more specific detail within each category. Examples of how to do this are given in the section on Case Studies.

Sample fishbone diagram



3. Production Process Characterization
3.1. Introduction to Production Process Characterization
3.1.3. Terminology/Concepts

3.1.3.6. Experiments and Experimental Design

Factors and responses
Besides just observing our processes for evidence of stability and capability, we quite often want to know about the relationships between the various Factors and Responses.

We look for correlations and causal relationships
There are generally two types of relationships that we are interested in for purposes of PPC. They are:
Correlation
    Two variables are said to be correlated if an observed change in the level of one variable is accompanied by a change in the level of another variable. The change may be in the same direction (positive correlation) or in the opposite direction (negative correlation).
Causality
    There is a causal relationship between two variables if a change in the level of one variable causes a change in the other variable.
Note that correlation does not imply causality. It is possible for two variables to be associated with each other without one of them causing the observed behavior in the other. When this is the case it is usually because there is a third (possibly unknown) causal factor.

Our goal is to find causal relationships
Generally, our ultimate goal in PPC is to find and quantify causal relationships. Once this is done, we can then take advantage of these relationships to improve and control our processes.

Find correlations and then try to establish causal relationships
Generally, we first need to find and explore correlations and then try to establish causal relationships. It is much easier to find correlations, as these are just properties of the data. It is much more difficult to prove causality, as this additionally requires sound engineering judgment. There is a systematic procedure we can use to accomplish this in an efficient manner. We do this through the use of designed experiments.

First we screen, then we build models
When we have many potential factors and we want to see which ones are correlated and have the potential to be involved in causal relationships with the responses, we use screening designs to reduce the number of candidates. Once we have a reduced set of influential factors, we can use response surface designs to model the causal relationships with the responses across the operating range of the process factors.

Techniques discussed in process improvement chapter
The techniques are covered in detail in the process improvement section and will not be discussed much in this chapter. Examples of how the techniques are used in PPC are given in the Case Studies.


3. Production Process Characterization
3.1. Introduction to Production Process Characterization

3.1.4. PPC Steps

Follow these 4 steps to ensure efficient use of resources
The primary activity of a PPC is to collect and analyze data so that we may draw conclusions about and ultimately improve our production processes. In many industrial applications, access to production facilities for the purposes of conducting experiments is very limited. Thus we must be very careful in how we go about these activities so that we can be sure of doing them in a cost-effective manner.

Step 1: Plan
The most important step by far is the planning step. By faithfully executing this step, we will ensure that we only collect data in the most efficient manner possible and still support the goals of the PPC. Planning should generate the following:
● a statement of the goals
● a descriptive process model (a list of process inputs and outputs)
● a description of the sampling plan (including a description of the procedure and settings to be used to run the process during the study with clear assignments for each person involved)
● a description of the method of data collection, tasks and responsibilities, formatting, and storage
● an outline of the data analysis
All decisions that affect how the characterization will be conducted should be made during the planning phase. The process characterization should be conducted according to this plan, with all exceptions noted.

Step 2: Collect
Data collection is essentially just the execution of the sampling plan part of the previous step. If a good job were done in the planning step, then this step should be pretty straightforward. It is important to execute to the plan as closely as possible and to note any exceptions.

Step 3: Analyze and interpret
This is the combination of quantitative (regression, ANOVA, correlation, etc.) and graphical (histograms, scatter plots, box plots, etc.) analysis techniques that are applied to the collected data in order to accomplish the goals of the PPC.

Step 4: Report
Reporting is an important step that should not be overlooked. By creating an informative report and archiving it in an accessible place, we can ensure that others have access to the information generated by the PPC. Often, the work involved in a PPC can be minimized by using the results of other, similar studies. Examples of PPC reports can be found in the Case Studies section.

Further information
The planning and data collection steps are described in detail in the data collection section. The analysis and interpretation steps are covered in detail in the analysis section. Examples of the reporting step can be seen in the Case Studies.


3. Production Process Characterization

3.2. Assumptions / Prerequisites

Primary goal is to identify and quantify sources of variation
The primary goal of PPC is to identify and quantify sources of variation. Only by doing this will we be able to define an effective plan for variation reduction and process improvement. Sometimes, in order to achieve this goal, we must first build mathematical/statistical models of our processes. In these models we will identify influential factors and the responses on which they have an effect. We will use these models to understand how the sources of variation are influenced by the important factors. This subsection will review many of the modeling tools we have at our disposal to accomplish these tasks. In particular, the models covered in this section are linear models, Analysis of Variance (ANOVA) models and discrete models.

Contents: Section 2
1. General Assumptions
2. Continuous Linear
3. Analysis of Variance
   1. One-Way
   2. Crossed
   3. Nested
4. Discrete

3. Production Process Characterization
3.2. Assumptions / Prerequisites

3.2.1. General Assumptions

Assumption: process is sum of a systematic component and a random component
In order to employ the modeling techniques described in this section, there are a few assumptions about the process under study that must be made. First, we must assume that the process can adequately be modeled as the sum of a systematic component and a random component. The systematic component is the mathematical model part and the random component is the error or noise present in the system. We also assume that the systematic component is fixed over the range of operating conditions and that the random component has a constant location, spread and distributional form.

Assumption: data used to fit these models are representative of the process being modeled
Finally, we assume that the data used to fit these models are representative of the process being modeled. As a result, we must additionally assume that the measurement system used to collect the data has been studied and proven to be capable of making measurements to the desired precision and accuracy. If this is not the case, refer to the Measurement Capability Section of this Handbook.


3. Production Process Characterization
3.2. Assumptions / Prerequisites

3.2.2. Continuous Linear Model

Description
The continuous linear model (CLM) is probably the most commonly used model in PPC. It is applicable in many instances ranging from simple control charts to response surface models. The CLM is a mathematical function that relates explanatory variables (either discrete or continuous) to a single continuous response variable. It is called linear because the coefficients of the terms are expressed as a linear sum. The terms themselves do not have to be linear.

Model
The general form of the CLM is:

    y = a_0 + a_1*f_1(x_1) + a_2*f_2(x_2) + ... + a_p*f_p(x_p) + e

This equation just says that if we have p explanatory variables then the response is modeled by a constant term plus a sum of functions of those explanatory variables, plus some random error term. This will become clear as we look at some examples below.

Estimation
The coefficients for the parameters in the CLM are estimated by the method of least squares. This is a method that gives estimates which minimize the sum of the squared distances from the observations to the fitted line or plane. See the chapter on Process Modeling for a more complete discussion on estimating the coefficients for these models.

Testing
The tests for the CLM involve testing that the model as a whole is a good representation of the process and whether any of the coefficients in the model are zero or have no effect on the overall fit. Again, the details for testing are given in the chapter on Process Modeling.

Assumptions
For estimation purposes, there are no additional assumptions necessary for the CLM beyond those stated in the assumptions section. For testing purposes, however, it is necessary to assume that the error term is adequately modeled by a Gaussian distribution.

Uses
The CLM has many uses such as building predictive process models over a range of process settings that exhibit linear behavior, control charts, process capability, building models from the data produced by designed experiments, and building response surface models for automated process control applications.

Examples
Shewhart Control Chart - The simplest example of a very common usage of the CLM is the underlying model used for Shewhart control charts. This model assumes that the process parameter being measured is a constant with additive Gaussian noise and is given by:

    y = c + e

Diffusion Furnace - Suppose we want to model the average wafer sheet resistance as a function of the location or zone in a furnace tube, the temperature, and the anneal time. In this case, let there be 3 distinct zones (front, center, back) and temperature and time are continuous explanatory variables. Writing z_1 and z_2 for indicator variables that pick out two of the three zones, this model is given by the CLM:

    y = a_0 + a_1*z_1 + a_2*z_2 + a_3*T + a_4*t + e

Diffusion Furnace (cont.) - Usually, the fitted line for the average wafer sheet resistance is not straight but has some curvature to it. This can be accommodated by adding a quadratic term for the time parameter as follows:

    y = a_0 + a_1*z_1 + a_2*z_2 + a_3*T + a_4*t + a_5*t^2 + e
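
Code sketch
The following is a minimal illustration (not from the Handbook) of least-squares estimation for a CLM of the kind sketched above, using Python/NumPy on simulated data; the coefficient values and operating ranges are hypothetical.

import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical furnace settings: zone (0=front, 1=center, 2=back), temperature, time.
zone = rng.integers(0, 3, n)
temp = rng.uniform(950.0, 1050.0, n)
time = rng.uniform(20.0, 60.0, n)
z1 = (zone == 1).astype(float)   # indicator for the center zone
z2 = (zone == 2).astype(float)   # indicator for the back zone

# Simulated sheet resistance from an assumed "true" model plus Gaussian noise.
y = 50 + 2*z1 - 3*z2 + 0.05*temp + 0.8*time - 0.005*time**2 + rng.normal(0, 1, n)

# Design matrix: constant, zone indicators, temperature, time, time squared.
X = np.column_stack([np.ones(n), z1, z2, temp, time, time**2])

# Least-squares coefficient estimates.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coef)   # approximately [50, 2, -3, 0.05, 0.8, -0.005]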


3. Production Process Characterization
3.2. Assumptions / Prerequisites

3.2.3. Analysis of Variance Models (ANOVA)

ANOVA allows us to compare the effects of multiple levels of multiple factors
One of the most common analysis activities in PPC is comparison. We often compare the performance of similar tools or processes. We also compare the effect of different treatments such as recipe settings. When we compare two things, such as two tools running the same operation, we use comparison techniques. When we want to compare multiple things, like multiple tools running the same operation or multiple tools with multiple operators running the same operation, we turn to ANOVA techniques to perform the analysis.

ANOVA splits the data into components
The easiest way to understand ANOVA is through a concept known as value splitting. ANOVA splits the observed data values into components that are attributable to the different levels of the factors. Value splitting is best explained by example.

Example: Turned Pins
The simplest example of value splitting is when we just have one level of one factor. Suppose we have a turning operation in a machine shop where we are turning pins to a diameter of .125 +/- .005 inches. Throughout the course of a day we take five samples of pins and obtain the following measurements: .125, .127, .124, .126, .128.

We can split these data values into a common value (mean) and residuals (what's left over) as follows:

     .125  .127  .124  .126  .128
   =
     .126  .126  .126  .126  .126
   +
    -.001  .001 -.002  .000  .002

From these tables, also called overlays, we can easily calculate the location and spread of the data as follows:

    mean = .126
    std. deviation = .0016.

Other layouts
While the above example is a trivial structural layout, it illustrates how we can split data values into its components. In the next sections, we will look at more complicated structural layouts for the data. In particular we will look at multiple levels of one factor (One-Way ANOVA) and multiple levels of two factors (Two-Way ANOVA) where the factors are crossed and nested.
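
Code sketch
The turned-pin value splitting above is simple to reproduce; the short Python sketch below (ours, not the Handbook's) verifies the overlays and the summary statistics quoted above.

import numpy as np

pins = np.array([0.125, 0.127, 0.124, 0.126, 0.128])

common = pins.mean()          # common value (mean): 0.126
residuals = pins - common     # what's left over
spread = pins.std(ddof=1)     # sample standard deviation

print(common)                 # 0.126
print(residuals)              # [-0.001  0.001 -0.002  0.     0.002]
print(round(spread, 4))       # 0.0016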


3. Production Process Characterization
3.2. Assumptions / Prerequisites
3.2.3. Analysis of Variance Models (ANOVA)

3.2.3.1. One-Way ANOVA

Description
We say we have a one-way layout when we have a single factor with several levels and multiple observations at each level. With this kind of layout we can calculate the mean of the observations within each level of our factor. The residuals will tell us about the variation within each level. We can also average the means of each level to obtain a grand mean. We can then look at the deviation of the mean of each level from the grand mean to understand something about the level effects. Finally, we can compare the variation within levels to the variation across levels. Hence the name analysis of variance.

Model
It is easy to model all of this with an equation of the form:

    y_ij = m + a_i + e_ij

This equation indicates that the jth data value, from level i, is the sum of three components: the common value (grand mean), the level effect (the deviation of each level mean from the grand mean), and the residual (what's left over).

Estimation (click here to see details of one-way value splitting)
Estimation for the one-way layout can be performed one of two ways. First, we can calculate the total variation, within-level variation and across-level variation. These can be summarized in a table as shown below and tests can be made to determine if the factor levels are significant. The value splitting example illustrates the calculations involved.

ANOVA table for one-way case
In general, the ANOVA table for the one-way case is given by:

Source            Sum of Squares                              Degrees of Freedom   Mean Square
Factor levels     SSa = J * sum_i ( ybar_i - ybar )^2         I-1                  SSa/(I-1)
residuals         SSe = sum_i sum_j ( y_ij - ybar_i )^2       I(J-1)               SSe/I(J-1)
corrected total   SST = sum_i sum_j ( y_ij - ybar )^2         IJ-1

Here I is the number of factor levels, J is the number of observations at each level, ybar_i is the mean of level i and ybar is the grand mean.

Level effects must sum to zero
The other way is through the use of CLM techniques. If you look at the model above you will notice that it is in the form of a CLM. The only problem is that the model is saturated and no unique solution exists. We overcome this problem by applying a constraint to the model. Since the level effects are just deviations from the grand mean, they must sum to zero. By applying the constraint that the level effects must sum to zero, we can now obtain a unique solution to the CLM equations. Most analysis programs will handle this for you automatically. See the chapter on Process Modeling for a more complete discussion on estimating the coefficients for these models.

Testing
The testing we want to do in this case is to see if the observed data support the hypothesis that the levels of the factor are significantly different from each other. The way we do this is by comparing the within-level variances to the between-level variance.

If we assume that the observations within each level have the same variance, we can calculate the variance within each level and pool these together to obtain an estimate of the overall population variance. This works out to be the mean square of the residuals.

Similarly, if there really were no level effect, the mean square across levels would be an estimate of the overall variance. Therefore, if there really were no level effect, these two estimates would be just two different ways to estimate the same parameter and should be close numerically. However, if there is a level effect, the level mean square will be higher than the residual mean square.


It can be shown that given the assumptions about the data stated below, the ratio of the level mean square and the residual mean square follows an F distribution with degrees of freedom as shown in the ANOVA table. If the F-value is significant at a given level of confidence (greater than the cut-off value in an F-table), then there is a level effect present in the data.

Assumptions
For estimation purposes, we assume the data can adequately be modeled as the sum of a deterministic component and a random component. We further assume that the fixed (deterministic) component can be modeled as the sum of an overall mean and some contribution from the factor level. Finally, it is assumed that the random component can be modeled with a Gaussian distribution with fixed location and spread.

Uses
The one-way ANOVA is useful when we want to compare the effect of multiple levels of one factor and we have multiple observations at each level. The factor can be either discrete (different machine, different plants, different shifts, etc.) or continuous (different gas flows, temperatures, etc.).

Example
Let's extend the machining example by assuming that we have five different machines making the same part and we take five random samples from each machine to obtain the following diameter data:

             Machine
       1     2     3     4     5
     .125  .118  .123  .126  .118
     .127  .122  .125  .128  .129
     .125  .120  .125  .126  .127
     .126  .124  .124  .127  .120
     .128  .119  .126  .129  .121

Analyze
Using ANOVA software or the techniques of the value-splitting example, we summarize the data into an ANOVA table as follows:

Source            Sum of Squares   Degrees of Freedom   Mean Square   F-value
Factor levels     .000137          4                    .000034       4.86 > 2.87
residuals         .000132          20                   .000007
corrected total   .000269          24

Test
By dividing the factor-level mean square by the residual mean square, we obtain an F-value of 4.86 which is greater than the cut-off value of 2.87 for the F-distribution at 4 and 20 degrees of freedom and 95% confidence. Therefore, there is sufficient evidence to reject the hypothesis that the levels are all the same.

Conclusion
From the analysis of these data we can conclude that the factor "machine" has an effect. There is a statistically significant difference in the pin diameters across the machines on which they were manufactured.
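
Code sketch
This one-way analysis is easy to reproduce with standard statistical software; the sketch below uses Python/SciPy (our choice of tool; the Handbook itself references Dataplot).

from scipy import stats

machines = [
    [0.125, 0.127, 0.125, 0.126, 0.128],   # machine 1
    [0.118, 0.122, 0.120, 0.124, 0.119],   # machine 2
    [0.123, 0.125, 0.125, 0.124, 0.126],   # machine 3
    [0.126, 0.128, 0.126, 0.127, 0.129],   # machine 4
    [0.118, 0.129, 0.127, 0.120, 0.121],   # machine 5
]

f_value, p_value = stats.f_oneway(*machines)
print(f_value, p_value)
# f_value is about 5.2 when computed from the unrounded mean squares; the table
# above shows 4.86 because it divides the rounded mean squares (.000034/.000007).
# Either way the statistic exceeds the 95% cut-off of 2.87, so the machine effect
# is significant.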


Machine
1 2 3 4 5
-.0012 -.0026 -.0016 -.0012 -.005
3. Production Process Characterization
3.2. Assumptions / Prerequisites
.0008 .0014 .0004 .0008 .006
3.2.3. Analysis of Variance Models (ANOVA) -.0012 -.0006 .0004 -.0012 .004
3.2.3.1. One-Way ANOVA -.0002 .0034 -.0006 -.0002 -.003
.0018 -.0016 .0014 .0018 -.002
3.2.3.1.1. One-Way Value-Splitting Calculate The next step is to calculate the grand mean from the individual
the grand machine means as:
Example Let's use the data from the machining example to illustrate how to use mean
the techniques of value-splitting to break each data value into its
component parts. Once we have the component parts, it is then a trivial
matter to calculate the sums of squares and form the F-value for the Grand
test. Mean
.12432
Machine
1 2 3 4 5 Sweep the Finally, we can sweep the grand mean through the individual level
grand mean means to obtain the level effects:
.125 .118 .123 .126 .118
through the
.127 .122 .125 .128 .129 level means
.125 .120 .125 .126 .127
.126 .124 .124 .127 .120 Machine
.128 .119 .126 .129 .121 1 2 3 4 5
.00188 -.00372 .00028 .00288 -.00132
Calculate Remember from our model, , we say each
level-means It is easy to verify that the original data table can be constructed by
observation is the sum of a common value, a level effect and a residual
value. Value-splitting just breaks each observation into its component adding the overall mean, the machine effect and the appropriate
parts. The first step in value-splitting is to calculate the mean values residual.
(rounding to the nearest thousandth) within each machine to get the
level means. Calculate Now that we have the data values split and the overlays created, the next
ANOVA step is to calculate the various values in the One-Way ANOVA table.
Machine values We have three values to calculate for each overlay. They are the sums of
squares, the degrees of freedom, and the mean squares.
1 2 3 4 5
.1262 .1206 .1246 .1272 .123
Total sum of The total sum of squares is calculated by summing the squares of all the
squares data values and subtracting from this number the square of the grand
Sweep level We can then sweep (subtract the level mean from each associated data mean times the total number of data values. We usually don't calculate
means value) the means through the original data table to get the residuals: the mean square for the total sum of squares because we don't use this
value in any statistical test.
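
Code sketch
The sweeps and sums of squares described on this page can be scripted directly; the following Python/NumPy sketch (ours, not the Handbook's) reproduces the level means, level effects, grand mean, and sums of squares for the machine data.

import numpy as np

# Rows are the five samples, columns are machines 1-5 (the data table above).
data = np.array([
    [.125, .118, .123, .126, .118],
    [.127, .122, .125, .128, .129],
    [.125, .120, .125, .126, .127],
    [.126, .124, .124, .127, .120],
    [.128, .119, .126, .129, .121],
])

level_means = data.mean(axis=0)             # [.1262 .1206 .1246 .1272 .123 ]
residuals = data - level_means              # sweep the level means from the data
grand_mean = level_means.mean()             # .12432
level_effects = level_means - grand_mean    # [ .00188 -.00372 .00028 .00288 -.00132]

ss_residual = (residuals**2).sum()          # .000132, with 20 degrees of freedom
ss_levels = 5 * (level_effects**2).sum()    # .000137, with 4 degrees of freedom
print(level_means, grand_mean, level_effects)
print(ss_residual, ss_levels)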


Residual sum of squares, degrees of freedom and mean square
The residual sum of squares is calculated by summing the squares of the residual values. This is equal to .000132. The degrees of freedom is the number of unconstrained values. Since the residuals for each level of the factor must sum to zero, once we know four of them, the last one is determined. This means we have four unconstrained values for each level, or 20 degrees of freedom. This gives a mean square of .000007.

Level sum of squares, degrees of freedom and mean square
Finally, to obtain the sum of squares for the levels, we sum the squares of each value in the level effect overlay and multiply the sum by the number of observations for each level (in this case 5) to obtain a value of .000137. Since the deviations from the level means must sum to zero, we have only four unconstrained values so the degrees of freedom for level effects is 4. This produces a mean square of .000034.

Calculate F-value
The last step is to calculate the F-value and perform the test of equal level means. The F-value is just the level mean square divided by the residual mean square. In this case the F-value=4.86. If we look in an F-table for 4 and 20 degrees of freedom at 95% confidence, we see that the critical value is 2.87, which means that we have a significant result and that there is thus evidence of a strong machine effect. By looking at the level-effect overlay we see that this is driven by machines 2 and 4.

3. Production Process Characterization
3.2. Assumptions / Prerequisites
3.2.3. Analysis of Variance Models (ANOVA)

3.2.3.2. Two-Way Crossed ANOVA

Description
When we have two factors with at least two levels and one or more observations at each level, we say we have a two-way layout. We say that the two-way layout is crossed when every level of Factor A occurs with every level of Factor B. With this kind of layout we can estimate the effect of each factor (Main Effects) as well as any interaction between the factors.

Model
If we assume that we have K observations at each combination of I levels of Factor A and J levels of Factor B, then we can model the two-way layout with an equation of the form:

    y_ijk = m + a_i + b_j + (ab)_ij + e_ijk

This equation just says that the kth data value for the jth level of Factor B and the ith level of Factor A is the sum of five components: the common value (grand mean), the level effect for Factor A, the level effect for Factor B, the interaction effect, and the residual. Note that (ab) does not mean multiplication; rather that there is interaction between the two factors.

Estimation
Like the one-way case, the estimation for the two-way layout can be done either by calculating the variance components or by using CLM techniques.

Click here for the value splitting example
For the variance components methods we display the data in a two dimensional table with the levels of Factor A in columns and the levels of Factor B in rows. The replicate observations fill each cell. We can sweep out the common value, the row effects, the column effects, the interaction effects and the residuals using value-splitting techniques. Sums of squares can be calculated and summarized in an ANOVA table as shown below.


Degrees Example Let's extend the one-way machining example by assuming that we want
Source Sum of Squares of Mean Square to test if there are any differences in pin diameters due to different types
Freedom of coolant. We still have five different machines making the same part
and we take five samples from each machine for each coolant type to
rows I-1 obtain the following data:
/(I-1) Machine
1 2 3 4 5
columns J-1
.125 .118 .123 .126 .118
/(J-1)
Coolant .127 .122 .125 .128 .129
A .125 .120 .125 .126 .127
interaction (I-1)(J-1)
.126 .124 .124 .127 .120
/(I-1)(J-1)
.128 .119 .126 .129 .121
residuals IJ(K-1) /IJ(K-1) .124 .116 .122 .126 .125
.128 .125 .121 .129 .123
corrected Coolant
IJK-1 .127 .119 .124 .125 .114
total B
.126 .125 .126 .130 .124
.129 .120 .125 .124 .117
We can use CLM techniques to do the estimation. We still have the
problem that the model is saturated and no unique solution exists. We
Analyze For analysis details see the crossed two-way value splitting example.
overcome this problem by applying the constraints to the model that the
two main effects and interaction effects each sum to zero. We can summarize the analysis results in an ANOVA table as follows:

Testing Like testing in the one-way case, we are testing that two main effects Sum of Degrees of
Source Mean Square F-value
and the interaction are zero. Again we just form a ratio of each main Squares Freedom
effect mean square and the interaction mean square to the residual mean machine .000303 4 .000076 8.8 > 2.61
square. If the assumptions stated below are true then those ratios follow coolant .00000392 1 .00000392 .45 < 4.08
an F-distribution and the test is performed by comparing the F-ratios to interaction .00001468 4 .00000367 .42 < 2.61
values in an F-table with the appropriate degrees of freedom and
residuals .000346 40 .0000087
confidence level.
corrected total .000668 49
Assumptions For estimation purposes, we assume the data can be adequately modeled
as described in the model above. It is assumed that the random Test By dividing the mean square for machine by the mean square for
component can be modeled with a Gaussian distribution with fixed residuals we obtain an F-value of 8.8 which is greater than the cut-off
location and spread. value of 2.61 for 4 and 40 degrees of freedom and a confidence of
95%. Likewise the F-values for Coolant and Interaction, obtained by
Uses The two-way crossed ANOVA is useful when we want to compare the dividing their mean squares by the residual mean square, are less than
effect of multiple levels of two factors and we can combine every level their respective cut-off values.
of one factor with every level of the other factor. If we have multiple
observations at each level, then we can also estimate the effects of
interaction between the two factors.
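
Code sketch
The crossed two-way analysis in this example can be reproduced with Python and statsmodels (our choice of tool, not the Handbook's); the sketch below builds the 50-observation data set and prints the ANOVA table.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Pin diameters: each inner list is one sample across machines 1-5.
coolant_a = [
    [.125, .118, .123, .126, .118],
    [.127, .122, .125, .128, .129],
    [.125, .120, .125, .126, .127],
    [.126, .124, .124, .127, .120],
    [.128, .119, .126, .129, .121],
]
coolant_b = [
    [.124, .116, .122, .126, .125],
    [.128, .125, .121, .129, .123],
    [.127, .119, .124, .125, .114],
    [.126, .125, .126, .130, .124],
    [.129, .120, .125, .124, .117],
]

rows = []
for coolant, block in [("A", coolant_a), ("B", coolant_b)]:
    for sample in block:
        for machine, diameter in enumerate(sample, start=1):
            rows.append({"machine": machine, "coolant": coolant, "diameter": diameter})
df = pd.DataFrame(rows)

# Two-way crossed ANOVA with interaction.
model = ols("diameter ~ C(machine) * C(coolant)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
# Expect machine to be significant (F about 8.8) while coolant and the
# interaction fall below their cut-off values, as in the table on this page.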


Conclusion From the ANOVA table we can conclude that machine is the most
important factor and is statistically significant. Coolant is not significant
and neither is the interaction. These results would lead us to believe that
some tool-matching efforts would be useful for improving this process. 3. Production Process Characterization
3.2. Assumptions / Prerequisites
3.2.3. Analysis of Variance Models (ANOVA)
3.2.3.2. Two-Way Crossed ANOVA

3.2.3.2.1. Two-way Crossed Value-Splitting


Example
Example: The data table below is five samples each collected from five different
Coolant is lathes each running two different types of coolant. The measurement is
completely the diameter of a turned pin.
crossed with
Machine
machine
1 2 3 4 5
.125 .118 .123 .126 .118
Coolant .127 .122 .125 .128 .129
A .125 .120 .125 .126 .127
.126 .124 .124 .127 .120
.128 .119 .126 .129 .121
.124 .116 .122 .126 .125
.128 .125 .121 .129 .123
Coolant
.127 .119 .124 .125 .114
B
.126 .125 .126 .130 .124
.129 .120 .125 .124 .117

For the crossed two-way case, the first thing we need to do is to sweep
the cell means from the data table to obtain the residual values. This is
shown in the tables below.


The first Machine What do By looking at the table of residuals, we see that the residuals for coolant
step is to 1 2 3 4 5 these tables B tend to be a little higher than for coolant A. This implies that there
sweep out tell us? may be more variability in diameter when we use coolant B. From the
the cell A .1262 .1206 .1246 .1272 .123 effects table above, we see that machines 2 and 5 produce smaller pin
means to B .1268 .121 .1236 .1268 .1206 diameters than the other machines. There is also a very slight coolant
obtain the -.0012 -.0026 -.0016 -.0012 -.005 effect but the machine effect is larger. Finally, there also appears to be
residuals slight interaction effects. For instance, machines 1 and 2 had smaller
.0008 .0014 .0004 .0008 .006
and means Coolant diameters with coolant A but the opposite was true for machines 3,4 and
A -.0012 -.0006 .0004 -.0012 .004 5.
-.0002 .0034 -.0006 -.0002 -.003
.0018 -.0016 .0014 .0018 -.002 Calculate We can calculate the values for the ANOVA table according to the
-.0028 -.005 -.0016 -.0008 .0044 sums of formulae in the table on the crossed two-way page. This gives the table
.0012 .004 -.0026 .0022 .0024 squares and below. From the F-values we see that the machine effect is significant
Coolant mean but the coolant and the interaction are not.
.0002 -.002 .0004 -.0018 -.0066
B squares
-.0008 .004 .0024 .0032 .0034
.0022 -.001 .0014 -.0028 -.0036
Sums of Degrees of Mean
Source F-value
Squares Freedom Square
Sweep the The next step is to sweep out the row means. This gives the table below. Machine .000303 4 .000076 8.8 > 2.61
row means
Coolant .00000392 1 .00000392 .45 < 4.08
Interaction .00001468 4 .00000367 .42 < 2.61
Machine
Residual .000346 40 .0000087
1 2 3 4 5
Corrected
A .1243 .0019 -.0037 .0003 .0029 -.0013 .000668 49
Total
B .1238 .003 -.0028 -.0002 .003 -.0032

Sweep the Finally, we sweep the column means to obtain the grand mean, row
column (coolant) effects, column (machine) effects and the interaction effects.
means

Machine
1 2 3 4 5
.1241 .0025 -.0033 .00005 .003 -.0023
A .0003 -.0006 -.0005 .00025 .0000 .001
B -.0003 .0006 .0005 -.00025 .0000 -.001


ANOVA Degrees of
table for Source Sum of Squares Mean Square
Freedom
nested case
3. Production Process Characterization rows I-1
3.2. Assumptions / Prerequisites /(I-1)
3.2.3. Analysis of Variance Models (ANOVA)
columns I(J-1)
/I(J-1)
3.2.3.3. Two-Way Nested ANOVA
residuals IJ(K-1) /IJ(K-1)
Description Sometimes, constraints prevent us from crossing every level of one factor
corrected
with every level of the other factor. In these cases we are forced into what IJK-1
total
is known as a nested layout. We say we have a nested layout when fewer
than all levels of one factor occur within each level of the other factor. An
example of this might be if we want to study the effects of different As with the crossed layout, we can also use CLM techniques. We still have
machines and different operators on some output characteristic, but we the problem that the model is saturated and no unique solution exists. We
can't have the operators change the machines they run. In this case, each overcome this problem by applying to the model the constraints that the
operator is not crossed with each machine but rather only runs one two main effects sum to zero.
machine.
Testing We are testing that two main effects are zero. Again we just form a ratio of
Model If Factor B is nested within Factor A, then a level of Factor B can only each main effect mean square to the residual mean square. If the
occur within one level of Factor A and there can be no interaction. This assumptions stated below are true then those ratios follow an F-distribution
gives the following model: and the test is performed by comparing the F-ratios to values in an F-table
with the appropriate degrees of freedom and confidence level.

This equation indicates that each data value is the sum of a common value Assumptions For estimation purposes, we assume the data can be adequately modeled as
(grand mean), the level effect for Factor A, the level effect of Factor B described in the model above. It is assumed that the random component can
nested Factor A, and the residual. be modeled with a Gaussian distribution with fixed location and spread.

Estimation For a nested design we typically use variance components methods to Uses The two-way nested ANOVA is useful when we are constrained from
perform the analysis. We can sweep out the common value, the row combining all the levels of one factor with all of the levels of the other
effects, the column effects and the residuals using value-splitting factor. These designs are most useful when we have what is called a
random effects situation. When the levels of a factor are chosen at random
techniques. Sums of squares can be calculated and summarized in an
rather than selected intentionally, we say we have a random effects model.
ANOVA table as shown below.
An example of this is when we select lots from a production run, then
select units from the lot. Here the units are nested within lots and the effect
Click here It is important to note that with this type of layout, since each level of one of each factor is random.
for nested factor is only present with one level of the other factor, we can't estimate
value- interaction between the two.
splitting
example


Example Let's change the two-way machining example slightly by assuming that we Conclusion From the ANOVA table we can conclude that the Machine is the most
have five different machines making the same part and each machine has important factor and is statistically significant. The effect of Operator
two operators, one for the day shift and one for the night shift. We take five nested within Machine is not statistically significant. Again, any
samples from each machine for each operator to obtain the following data: improvement activities should be focused on the tools.
Machine
1 2 3 4 5
.125 .118 .123 .126 .118
Operator .127 .122 .125 .128 .129
Day .125 .120 .125 .126 .127
.126 .124 .124 .127 .120
.128 .119 .126 .129 .121
.124 .116 .122 .126 .125
.128 .125 .121 .129 .123
Operator
.127 .119 .124 .125 .114
Night
.126 .125 .126 .130 .124
.129 .120 .125 .124 .117

Analyze For analysis details see the nested two-way value splitting example. We
can summarize the analysis results in an ANOVA table as follows:

Sum of Degrees of
Source Mean Square F-value
Squares Freedom
8.77 >
Machine .000303 4 .0000758
2.61
.428 <
Operator(Machine) .0000186 5 .00000372
2.45
Residuals .000346 40 .0000087
Corrected Total .000668 49

Test By dividing the mean square for machine by the mean square for residuals
we obtain an F-value of 8.77, which is greater than the cut-off value of 2.61
for 4 and 40 degrees of freedom and a confidence of 95%. Likewise the
F-value for Operator(Machine), obtained by dividing its mean square by
the residual mean square, is less than the cut-off value of 2.45 for 5 and 40
degrees of freedom and 95% confidence.
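
Code sketch
The nested analysis can also be checked with Python and statsmodels (our tool choice, not the Handbook's). Because each operator runs only one machine, the operator term is written as machine:operator with no separate operator main effect.

import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Same 50 diameters as the crossed example, but the second factor is now the
# operator (Day/Night) nested within machine; columns are machines 1-5.
day = [
    [.125, .118, .123, .126, .118],
    [.127, .122, .125, .128, .129],
    [.125, .120, .125, .126, .127],
    [.126, .124, .124, .127, .120],
    [.128, .119, .126, .129, .121],
]
night = [
    [.124, .116, .122, .126, .125],
    [.128, .125, .121, .129, .123],
    [.127, .119, .124, .125, .114],
    [.126, .125, .126, .130, .124],
    [.129, .120, .125, .124, .117],
]

rows = []
for operator, block in [("Day", day), ("Night", night)]:
    for sample in block:
        for machine, diameter in enumerate(sample, start=1):
            rows.append({"machine": machine, "operator": operator, "diameter": diameter})
df = pd.DataFrame(rows)

# Nested ANOVA (both factors treated as fixed, as in the table above).
model = ols("diameter ~ C(machine) + C(machine):C(operator)", data=df).fit()
print(sm.stats.anova_lm(model))
# Machine: 4 df, F about 8.8 (significant); Operator(Machine): 5 df, not significant.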


5 Night -.00224 -.0012 .0044 .0024 -.0066 .0034 -.0036

What By looking at the residuals we see that machines 2 and 5 have the greatest variability.
does this There does not appear to be much of an operator effect but there is clearly a strong machine
3. Production Process Characterization table tell effect.
3.2. Assumptions / Prerequisites us?
3.2.3. Analysis of Variance Models (ANOVA)
3.2.3.3. Two-Way Nested ANOVA Calculate We can calculate the values for the ANOVA table according to the formulae in the table on
sums of the nested two-way page. This produces the table below. From the F-values we see that the
squares
3.2.3.3.1. Two-Way Nested Value-Splitting Example and
machine effect is significant but the operator effect is not. (Here it is assumed that both
factors are fixed).
mean
Example: The data table below contains data collected from five different lathes, each run by two squares
Operator different operators. Note we are concerned here with the effect of operators, so the layout is
is nested nested. If we were concerned with shift instead of operator, the layout would be crossed. Source Sums of Squares Degrees of Freedom Mean Square F-value
within The measurement is the diameter of a turned pin.
machine. Machine .000303 4 .0000758 8.77 > 2.61
Sample Operator(Machine) .0000186 5 .00000372 .428 < 2.45
Machine Operator
1 2 3 4 5 Residual .000346 40 .0000087
Day .125 .127 .125 .126 .128 Corrected Total .000668 49
1
Night .124 .128 .127 .126 .129
Day .118 .122 .120 .124 .119
2
Night .116 .125 .119 .125 .120
Day .123 .125 .125 .124 .126
3
Night .122 .121 .124 .126 .125
Day .126 .128 .126 .127 .129
4
Night .126 .129 .125 .130 .124
Day .118 .129 .127 .120 .121
5
Night .125 .123 .114 .124 .117

For the nested two-way case, just as in the crossed case, the first thing we need to do is to
sweep the cell means from the data table to obtain the residual values. We then sweep the
nested factor (Operator) and the top level factor (Machine) to obtain the table below.

Machine Operator Sample


Common Machine Operator
1 2 3 4 5
Day -.0003 -.0012 .0008 -.0012 -.0002 .0018
1 .00246
Night .0003 -.0028 .0012 .002 -.0008 .0022
Day -.0002 -.0026 .0014 -.0006 .0034 -.0016
2 -.00324
Night .0002 -.005 .004 -.002 .004 -.001
Day .0005 -.0016 .0004 .0004 -.0006 .0014
3 .12404 .00006
Night -.0005 -.0016 -.0026 .0004 .0024 .0014
Day .0002 -.0012 .0008 -.0012 -.002 .0018
4 .00296
Night -.0002 -.0008 .0022 -.0018 .0032 -.0028
Day .0012 -.005 .006 .004 -.003 -.002


3. Production Process Characterization
3.2. Assumptions / Prerequisites

3.2.4. Discrete Models

Description
There are many instances when we are faced with the analysis of discrete data rather than continuous data. Examples of this are yield (good/bad), speed bins (slow/fast/faster/fastest), survey results (favor/oppose), etc. We then try to explain the discrete outcomes with some combination of discrete and/or continuous explanatory variables. In this situation the modeling techniques we have learned so far (CLM and ANOVA) are no longer appropriate.

Contingency table analysis and log-linear model
There are two primary methods available for the analysis of discrete response data. The first one applies to situations in which we have discrete explanatory variables and discrete responses and is known as Contingency Table Analysis. The model for this is covered in detail in this section. The second model applies when we have both discrete and continuous explanatory variables and is referred to as a Log-Linear Model. That model is beyond the scope of this Handbook, but interested readers should refer to the reference section of this chapter for a list of useful books on the topic.

Model
Suppose we have n individuals that we classify according to two criteria, A and B. Suppose there are r levels of criterion A and s levels of criterion B. These responses can be displayed in an r x s table. For example, suppose we have a box of manufactured parts that we classify as good or bad and whether they came from supplier 1, 2 or 3.

Now, each cell of this table will have a count of the individuals who fall into its particular combination of classification levels. Let's call this count Nij. The sum of all of these counts will be equal to the total number of individuals, N. Also, each row of the table will sum to Ni. and each column will sum to N.j .

Under the assumption that there is no interaction between the two classifying variables (like the number of good or bad parts does not depend on which supplier they came from), we can calculate the counts we would expect to see in each cell. Let's call the expected count for any cell Eij. Then the expected value for a cell is Eij = Ni. * N.j /N . All we need to do then is to compare the expected counts to the observed counts. If there is a considerable difference between the observed counts and the expected values, then the two variables interact in some way.

Estimation
The estimation is very simple. All we do is make a table of the observed counts and then calculate the expected counts as described above.

Testing
The test is performed using a Chi-Square goodness-of-fit test according to the following formula:

    chi-square = sum of (Nij - Eij)^2 / Eij

where the summation is across all of the cells in the table.

Given the assumptions stated below, this statistic has approximately a chi-square distribution and is therefore compared against a chi-square table with (r-1)(s-1) degrees of freedom, with r and s as previously defined. If the value of the test statistic is less than the chi-square value for a given level of confidence, then the classifying variables are declared independent, otherwise they are judged to be dependent.

Assumptions
The estimation and testing results above hold regardless of whether the sample model is Poisson, multinomial, or product-multinomial. The chi-square results start to break down if the counts in any cell are small, say < 5.

Uses
The contingency table method is really just a test of interaction between discrete explanatory variables for discrete responses. The example given below is for two factors. The methods are equally applicable to more factors, but as with any interaction, as you add more factors the interpretation of the results becomes more difficult.

Example
Suppose we are comparing the yield from two manufacturing processes. We want to know if one process has a higher yield.

Make table of counts

              Good   Bad   Totals
  Process A    86     14     100
  Process B    80     20     100
  Totals      166     34     200

Table 1. Yields for two production processes

We obtain the expected values by the formula given above. This gives the table below.

Calculate expected counts

              Good   Bad   Totals
  Process A    83     17     100
  Process B    83     17     100
  Totals      166     34     200

Table 2. Expected values for two production processes

Calculate chi-square statistic and compare to table value
The chi-square statistic is 1.276. This is below the chi-square value for 1 degree of freedom and 90% confidence of 2.71. Therefore, we conclude that there is not a (significant) difference in process yield.

Conclusion
Therefore, we conclude that there is no statistically significant difference between the two processes.

3. Production Process Characterization

3.3. Data Collection for PPC

Start with careful planning
The data collection process for PPC starts with careful planning. The planning consists of the definition of clear and concise goals, developing process models and devising a sampling plan.

Many things can go wrong in the data collection
This activity of course ends with the actual collection of the data, which is usually not as straightforward as it might appear. Many things can go wrong in the execution of the sampling plan. The problems can be mitigated with the use of check lists and by carefully documenting all exceptions to the original sampling plan.

Table of Contents
1. Set Goals
2. Modeling Processes
   1. Black-Box Models
   2. Fishbone Diagrams
   3. Relationships and Sensitivities
3. Define the Sampling Plan
   1. Identify the parameters, ranges and resolution
   2. Design sampling scheme
   3. Select sample sizes
   4. Design data storage formats
   5. Assign roles and responsibilities


3. Production Process Characterization
3.3. Data Collection for PPC

3.3.1. Define Goals

State concise goals
The goal statement is one of the most important parts of the characterization plan. With clearly and concisely stated goals, the rest of the planning process falls naturally into place.

Goals usually defined in terms of key specifications
The goals are usually defined in terms of key specifications or manufacturing indices. We typically want to characterize a process and compare the results against these specifications. However, this is not always the case. We may, for instance, just want to quantify key process parameters and use our estimates of those parameters in some other activity like controller design or process improvement.

Example goal statements
Click on each of the links below to see Goal Statements for each of the case studies.
1. Furnace Case Study (Goal)
2. Machine Case Study (Goal)

3. Production Process Characterization
3.3. Data Collection for PPC

3.3.2. Process Modeling

Identify influential parameters
Process modeling begins by identifying all of the important factors and responses. This is usually best done as a team effort and is limited to the scope set by the goal statement.

Document with black-box models
This activity is best documented in the form of a black-box model as seen in the figure below. In this figure all of the outputs are shown on the right and all of the controllable inputs are shown on the left. Any inputs or factors that may be observable but not controllable are shown on the top or bottom.

http://www.itl.nist.gov/div898/handbook/ppc/section3/ppc31.htm [11/13/2003 5:41:32 PM] http://www.itl.nist.gov/div898/handbook/ppc/section3/ppc32.htm (1 of 3) [11/13/2003 5:41:32 PM]


3.3.2. Process Modeling 3.3.2. Process Modeling

Model relationships using fishbone diagrams
The next step is to model relationships of the previously identified factors and responses. In this step we choose a parameter and identify all of the other parameters that may have an influence on it. This process is easily documented with fishbone diagrams as illustrated in the figure below. The influenced parameter is put on the center line and the influential factors are listed off of the centerline and can be grouped into major categories like Tool, Material, Work Methods and Environment.

Examples
Click on each of the links below to see the process models for each of the case studies.
1. Case Study 1 (Process Model)
2. Case Study 2 (Process Model)

Document relationships and sensitivities
The final step is to document all known information about the relationships and sensitivities between the inputs and outputs. Some of the inputs may be correlated with each other as well as the outputs. There may be detailed mathematical models available from other studies or the information available may be vague such as for a machining process we know that as the feed rate increases, the quality of the finish decreases.

It is best to document this kind of information in a table with all of the inputs and outputs listed both on the left column and on the top row. Then, correlation information can be filled in for each of the appropriate cells. See the case studies for an example.


3. Production Process Characterization 3. Production Process Characterization


3.3. Data Collection for PPC 3.3. Data Collection for PPC
3.3.3. Define Sampling Plan

3.3.3. Define Sampling Plan


3.3.3.1. Identifying Parameters, Ranges and
Sampling
plan is
A sampling plan is a detailed outline of which measurements will be
taken at what times, on which material, in what manner, and by whom.
Resolution
detailed Sampling plans should be designed in such a way that the resulting
outline of data will contain a representative sample of the parameters of interest Our goals and the models we built in the previous steps should
measurements and allow for all questions, as stated in the goals, to be answered. provide all of the information needed for selecting parameters and
to be taken determining the expected ranges and the required measurement
resolution.
Steps in the The steps involved in developing a sampling plan are:
sampling plan Goals will tell The first step is to carefully examine the goals. This will tell you
1. identify the parameters to be measured, the range of possible us what to which response variables need to be sampled and how. For instance, if
values, and the required resolution measure and our goal states that we want to determine if an oxide film can be
2. design a sampling scheme that details how and when samples how grown on a wafer to within 10 Angstroms of the target value with a
will be taken uniformity of <2%, then we know we have to measure the film
thickness on the wafers to an accuracy of at least +/- 3 Angstroms and
3. select sample sizes we must measure at multiple sites on the wafer in order to calculate
4. design data storage formats uniformity.
5. assign roles and responsibilities
The goals and the models we build will also indicate which
explanatory variables need to be sampled and how. Since the fishbone
Verify and Once the sampling plan has been developed, it can be verified and then
diagrams define the known important relationships, these will be our
execute passed on to the responsible parties for execution.
best guide as to which explanatory variables are candidates for
measurement.

Ranges help Defining the expected ranges of values is useful for screening outliers.
screen outliers In the machining example , we would not expect to see many values
that vary more than +/- .005" from nominal. Therefore we know that
any values that are much beyond this interval are highly suspect and
should be remeasured.

http://www.itl.nist.gov/div898/handbook/ppc/section3/ppc33.htm [11/13/2003 5:41:32 PM] http://www.itl.nist.gov/div898/handbook/ppc/section3/ppc331.htm (1 of 2) [11/13/2003 5:41:32 PM]


Resolution helps choose measurement equipment
Finally, the required resolution for the measurements should be specified. This specification will help guide the choice of metrology equipment and help define the measurement procedures. As a rule of thumb, we would like our measurement resolution to be at least 1/10 of our tolerance. For the oxide growth example, this means that we want to measure with an accuracy of 2 Angstroms. Similarly, for the turning operation we would need to measure the diameter within .001". This means that vernier calipers would be adequate as the measurement device for this application.

Examples
Click on each of the links below to see the parameter descriptions for each of the case studies.
1. Case Study 1 (Sampling Plan)
2. Case Study 2 (Sampling Plan)

3.3.3.2. Choosing a Sampling Scheme

A sampling scheme defines what data will be obtained and how
A sampling scheme is a detailed description of what data will be obtained and how this will be done. In PPC we are faced with two different situations for developing sampling schemes. The first is when we are conducting a controlled experiment. There are very efficient and exact methods for developing sampling schemes for designed experiments and the reader is referred to the Process Improvement chapter for details.

Passive data collection
The second situation is when we are conducting a passive data collection (PDC) study to learn about the inherent properties of a process. These types of studies are usually for comparison purposes when we wish to compare properties of processes against each other or against some hypothesis. This is the situation that we will focus on here.

There are two principles that guide our choice of sampling scheme
Once we have selected our response parameters, it would seem to be a rather straightforward exercise to take some measurements, calculate some statistics and draw conclusions. There are, however, many things that can go wrong along the way that can be avoided with careful planning and knowing what to watch for. There are two overriding principles that will guide the design of our sampling scheme.

The first is precision
The first principle is that of precision. If the sampling scheme is properly laid out, the difference between our estimate of some parameter of interest and its true value will be due only to random variation. The size of this random variation is measured by a quantity called the standard error. Precision refers to the size of this standard error: the smaller the standard error, the more precise our estimates are. A minimal numerical sketch is given below.
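To make the idea of precision concrete, the following minimal sketch (the readings are made up, not from the Handbook) computes the standard error of a sample mean; the smaller this number, the more precise the estimate of the mean.

    import math

    # Hypothetical repeated measurements of the same quantity
    measurements = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2]

    n = len(measurements)
    mean = sum(measurements) / n
    # Sample standard deviation (n - 1 in the denominator)
    s = math.sqrt(sum((x - mean) ** 2 for x in measurements) / (n - 1))
    standard_error = s / math.sqrt(n)

    print(f"mean = {mean:.3f}, standard error = {standard_error:.3f}")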


Precision of an estimate depends on several factors
The precision of any estimate will depend on:
● the inherent variability of the process
● the measurement error
● the number of independent replications (sample size)
● the efficiency of the sampling scheme.

The second is systematic sampling error (or confounded effects)
The second principle is the avoidance of systematic errors. Systematic sampling error occurs when the levels of one explanatory variable are the same as some other unaccounted for explanatory variable. This is also referred to as confounded effects. Systematic sampling error is best seen by example.

Example 1: We want to compare the effect of two different coolants on the resulting surface finish from a turning operation. It is decided to run one lot, change the coolant and then run another lot. With this sampling scheme, there is no way to distinguish the coolant effect from the lot effect or from tool wear considerations. There is systematic sampling error in this sampling scheme.

Example 2: We wish to examine the effect of two pre-clean procedures on the uniformity of an oxide growth process. We clean one cassette of wafers with one method and another cassette with the other method. We load one cassette in the front of the furnace tube and the other cassette in the middle. To complete the run, we fill the rest of the tube with other lots. With this sampling scheme, there is no way to distinguish between the effect of the different pre-clean methods and the cassette effect or the tube location effect. Again, we have systematic sampling errors.

Stratification helps overcome systematic error
The way to combat systematic sampling errors (and at the same time increase precision) is through stratification and randomization. Stratification is the process of segmenting our population across levels of some factor so as to minimize variability within those segments or strata. For instance, if we want to try several different process recipes to see which one is best, we may want to be sure to apply each of the recipes to each of the three work shifts. This will ensure that we eliminate any systematic errors caused by a shift effect. This is where the ANOVA designs are particularly useful.

Randomization helps too
Randomization is the process of randomly applying the various treatment combinations. In the above example, we would not want to apply recipe 1, 2 and 3 in the same order for each of the three shifts but would instead randomize the order of the three recipes in each shift. This will avoid any systematic errors caused by the order of the recipes. A minimal sketch of this kind of stratified randomization is given after the links below.

Examples
The issues here are many and complicated. Click on each of the links below to see the sampling schemes for each of the case studies.
1. Case Study 1 (Sampling Plan)
2. Case Study 2 (Sampling Plan)
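The following is a minimal sketch (recipe and shift names are hypothetical) of the stratification-plus-randomization idea described above: every recipe is applied in every shift, but the order of the recipes is randomized within each shift.

    import random

    recipes = ["recipe 1", "recipe 2", "recipe 3"]   # treatments (hypothetical)
    shifts = ["shift 1", "shift 2", "shift 3"]       # strata (hypothetical)

    random.seed(0)  # fixed only so the sketch is reproducible
    plan = {}
    for shift in shifts:
        order = recipes.copy()      # each recipe appears once per shift (stratification)
        random.shuffle(order)       # random run order within the shift (randomization)
        plan[shift] = order

    for shift, order in plan.items():
        print(shift, "->", order)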


3.3.3.3. Selecting Sample Sizes

Consider these things when selecting a sample size
When choosing a sample size, we must consider the following issues:
● What population parameters we want to estimate
● Cost of sampling (importance of information)
● How much is already known
● Spread (variability) of the population
● Practicality: how hard is it to collect data
● How precise we want the final estimates to be

Cost of taking samples
The cost of sampling issue helps us determine how precise our estimates should be. As we will see below, when choosing sample sizes we need to select risk values. If the decisions we will make from the sampling activity are very valuable, then we will want low risk values and hence larger sample sizes.

Prior information
If our process has been studied before, we can use that prior information to reduce sample sizes. This can be done by using prior mean and variance estimates and by stratifying the population to reduce variation within groups.

Inherent variability
We take samples to form estimates of some characteristic of the population of interest. The variance of that estimate is proportional to the inherent variability of the population divided by the sample size:

    \mathrm{Var}(\hat{\theta}) \propto \frac{\sigma^2}{n}

with \hat{\theta} denoting the parameter we are trying to estimate and \sigma^2 the population variance. This means that if the variability of the population is large, then we must take many samples. Conversely, a small population variance means we don't have to take as many samples.

Practicality
Of course the sample size you select must make sense. This is where the trade-offs usually occur. We want to take enough observations to obtain reasonably precise estimates of the parameters of interest, but we also want to do this within a practical resource budget. The important thing is to quantify the risks associated with the chosen sample size.

Sample size determination
In summary, the steps involved in estimating a sample size are:
1. There must be a statement about what is expected of the sample. We must determine what it is we are trying to estimate, how precise we want the estimate to be, and what we are going to do with the estimate once we have it. This should easily be derived from the goals.
2. We must find some equation that connects the desired precision of the estimate with the sample size. This is a probability statement. A couple are given below; see your statistician if these are not appropriate for your situation.
3. This equation may contain unknown properties of the population such as the mean or variance. This is where prior information can help.
4. If you are stratifying the population in order to reduce variation, sample size determination must be performed for each stratum.
5. The final sample size should be scrutinized for practicality. If it is unacceptable, the only way to reduce it is to accept less precision in the sample estimate.

Sampling proportions
When we are sampling proportions we start with a probability statement about the desired precision. This is given by:

    P\left( \left| \hat{p} - P \right| \ge \delta \right) = \alpha

where
● \hat{p} is the estimated proportion
● P is the unknown population parameter
● \delta is the specified precision of the estimate
● \alpha is the probability value (usually low).
This equation simply says that we want the probability that our estimate misses the true proportion by more than the desired precision \delta to be only \alpha. Of course we like to set \alpha low, usually .1 or less. Using the assumption that the proportion is approximately normally distributed, we can obtain an estimate of the required sample size as:


    n = \frac{z^2 \, \hat{p}(1 - \hat{p})}{\delta^2}

where z is the ordinate on the Normal curve corresponding to \alpha.

Example
Let's say we have a new process we want to try. We plan to run the new process and sample the output for yield (good/bad). Our current process has been yielding 65% (p = .65, q = .35). We decide that we want the estimate of the new process yield to be accurate to within \delta = .10 at 95% confidence (\alpha = .05, z = 2). Using the formula above we get a sample size estimate of n = 91. Thus, if we draw 91 random parts from the output of the new process and estimate the yield, then we are 95% sure the yield estimate is within .10 of the true process yield.

Estimating location: relative error
If we are sampling continuous normally distributed variables, quite often we are concerned about the relative error of our estimates rather than the absolute error. The probability statement connecting the desired precision to the sample size is given by:

    P\left( \left| \frac{\bar{x} - \mu}{\mu} \right| \ge \delta \right) = \alpha

where \mu is the (unknown) population mean and \bar{x} is the sample mean. Again, using the normality assumptions we obtain the estimated sample size to be:

    n = \frac{z^2 \sigma^2}{\delta^2 \mu^2}

with \sigma^2 denoting the population variance.

Estimating location: absolute error
If instead of relative error, we wish to use absolute error, the equation for sample size looks a lot like the one for the case of proportions:

    n = \left( \frac{z \, \sigma}{\delta} \right)^2

where \sigma is the population standard deviation (but in practice is usually replaced by an engineering guesstimate).

Example
Suppose we want to sample a stable process that deposits a 500 Angstrom film on a semiconductor wafer in order to determine the process mean so that we can set up a control chart on the process. We want to estimate the mean within 10 Angstroms (\delta = 10) of the true mean with 95% confidence (\alpha = .05, z = 2). Our initial guess regarding the variation in the process is that one standard deviation is about 20 Angstroms. This gives a sample size estimate of n = 16. Thus, if we take at least 16 samples from this process and estimate the mean film thickness, we can be 95% sure that the estimate is within 10 Angstroms of the true mean value. A short computational sketch of both examples follows.
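The two examples above can be reproduced with a few lines of code. This is only a sketch; it uses the rounded value z = 2 for 95% confidence, as in the examples.

    import math

    def n_for_proportion(p, delta, z=2.0):
        """Sample size to estimate a proportion to within +/- delta."""
        return math.ceil(z ** 2 * p * (1 - p) / delta ** 2)

    def n_for_mean_absolute(sigma, delta, z=2.0):
        """Sample size to estimate a mean to within +/- delta (absolute error)."""
        return math.ceil((z * sigma / delta) ** 2)

    print(n_for_proportion(p=0.65, delta=0.10))      # yield example: 91
    print(n_for_mean_absolute(sigma=20, delta=10))   # film-thickness example: 16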


3.3.3.4. Data Storage and Retrieval


Data control depends on facility size
If you are in a small manufacturing facility or a lab, you can simply design a sampling plan, run the material, take the measurements, fill in the run sheet and go back to your computer to analyze the results. There really is not much to be concerned with regarding data storage and retrieval.

In most larger facilities, however, the people handling the material usually have nothing to do with the design. Quite often the measurements are taken automatically and may not even be made in the same country where the material was produced. Your data go through a long chain of automatic acquisition, storage, reformatting, and retrieval before you are ever able to see them. All of these steps are fraught with peril and should be examined closely to ensure that valuable data are not lost or accidentally altered.

Know the process involved
In the planning phase of the PPC, be sure to understand the entire data collection process. Things to watch out for include:
● automatic measurement machines rejecting outliers
● only summary statistics (mean and standard deviation) being saved
● values for explanatory variables (location, operator, etc.) not being saved
● how missing values are handled

Consult with support staff early on
It is important to consult with someone from the organization responsible for maintaining the data system early in the planning phase of the PPC. It can also be worthwhile to perform some "dry runs" of the data collection to ensure you will be able to actually acquire the data in the format defined in the plan.

3.3.3.5. Assign Roles and Responsibilities

PPC is a team effort, get everyone involved early
In today's manufacturing environment, it is unusual for an investigative study to be conducted by a single individual. Most PPC studies will be a team effort. It is important that all individuals who will be involved in the study become a part of the team from the beginning. Many of the various collateral activities will need approvals and sign-offs. Be sure to account for that cycle time in your plan.

Table showing roles and potential responsibilities
A partial list of these individuals along with their roles and potential responsibilities is given in the table below. There may be multiple occurrences of each of these individuals across shifts or process steps, so be sure to include everyone.

Tool Owner (Controls Tool Operations)
● Schedules tool time
● Ensures tool state
● Advises on experimental design

Process Owner (Controls Process Recipe)
● Advises on experimental design
● Controls recipe settings

Tool Operator (Executes Experimental Plan)
● Executes experimental runs
● May take measurements

Metrology (Owns Measurement Tools)
● Maintains metrology equipment
● Conducts gauge studies
● May take measurements

CIM (Owns Enterprise Information System)
● Maintains data collection system
● Maintains equipment interfaces and data formatters
● Maintains databases and information access

Statistician (Consultant)
● Consults on experimental design
● Consults on data analysis

Quality Control (Controls Material)
● Ensures quality of incoming material
● Must approve shipment of outgoing material (especially for recipe changes)


3.4. Data Analysis for PPC

In this section we will learn how to analyze and interpret the data we collected in accordance with our data collection plan.

Click on desired topic to read more
This section discusses the following topics:
1. Initial Data Analysis
   1. Gather Data
   2. Quality Checking the Data
   3. Summary Analysis (Location, Spread and Shape)
2. Exploring Relationships
   1. Response Correlations
   2. Exploring Main Effects
   3. Exploring First-Order Interactions
3. Building Models
   1. Fitting Polynomial Models
   2. Fitting Physical Models
4. Analyzing Variance Structure
5. Assessing Process Stability
6. Assessing Process Capability
7. Checking Assumptions

3.4.1. First Steps

Gather all of the data into one place
After executing the data collection plan for the characterization study, the data must be gathered up for analysis. Depending on the scope of the study, the data may reside in one place or in many different places. They may be in common factory databases, flat files on individual computers, or handwritten on run sheets. Whatever the case, the first step will be to collect all of the data from the various sources and enter them into a single data file. The most convenient format for most data analyses is the variables-in-columns format. This format has the variable names in column headings and the values for the variables in the rows.

Perform a quality check on the data using graphical and numerical techniques
The next step is to perform a quality check on the data. Here we are typically looking for data entry problems, unusual data values, missing data, etc. The two most useful tools for this step are the scatter plot and the histogram. By constructing scatter plots of all of the response variables, any data entry problems will be easily identified. Histograms of response variables are also quite useful for identifying data entry problems. Histograms of explanatory variables help identify problems with the execution of the sampling plan. If the counts for each level of the explanatory variables are not the same as called for in the sampling plan, you know you may have an execution problem. Running numerical summary statistics on all of the variables (both response and explanatory) also helps to identify data problems.

Summarize data by estimating location, spread and shape
Once the data quality problems are identified and fixed, we should estimate the location, spread and shape for all of the response variables. This is easily done with a combination of histograms and numerical summary statistics.
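A minimal sketch of the gathering and quality-checking steps above. The file and column names are hypothetical; any tabular sources could be combined the same way.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Combine data from several (hypothetical) sources into one
    # variables-in-columns table: variable names as column headings, one row per observation.
    sources = ["factory_db_export.csv", "run_sheets.csv"]
    data = pd.concat([pd.read_csv(f) for f in sources], ignore_index=True)

    # Quality checks: numerical summaries, missing-value counts, and level counts
    # to compare against the sampling plan
    print(data.describe())
    print(data.isna().sum())
    print(data["zone"].value_counts())

    # Graphical checks: histograms and scatter plots of the responses
    data.hist(column="thickness")
    data.plot.scatter(x="run", y="thickness")
    plt.show()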


3.4.2. Exploring Relationships

The first analysis of our data is exploration
Once we have a data file created in the desired format, checked the data integrity, and estimated the summary statistics on the response variables, the next step is to start exploring the data and to try to understand the underlying structure. The most useful tools will be various forms of the basic scatter plot and box plot.

These techniques will allow pairwise explorations for examining relationships between any pair of response variables, any pair of explanatory and response variables, or a response variable as a function of any two explanatory variables. Beyond three dimensions we are pretty much limited by our human frailties at visualization.

Graph everything that makes sense
In this exploratory phase, the key is to graph everything that makes sense to graph. These pictures will not only reveal any additional quality problems with the data but will also reveal influential data points and will guide the subsequent modeling activities.

Graph responses, then explanatory versus response, then conditional plots
The order that generally proves most effective for data analysis is to first graph all of the responses against each other in a pairwise fashion. Then we graph responses against the explanatory variables. This will give an indication of the main factors that have an effect on response variables. Finally, we graph response variables, conditioned on the levels of explanatory factors. This is what reveals interactions between explanatory variables. We will use nested box plots and block plots to visualize interactions.

3.4.2.1. Response Correlations

Make scatter plots of all of the response variables
In this first phase of exploring our data, we plot all of the response variables in a pairwise fashion. The individual scatter plots are displayed in a matrix form with the y-axis scaling the same for all plots in a row of the matrix.

Check the slope of the data on the scatter plots
The scatterplot matrix shows how the response variables are related to each other. If there is a linear trend with a positive slope, this indicates that the responses are positively correlated. If there is a linear trend with a negative slope, then the variables are negatively correlated. If the data appear random with no slope, the variables are probably not correlated. This will be important information for subsequent model building steps.

This scatterplot matrix shows examples of both negatively and positively correlated variables
An example of a scatterplot matrix is given below. In this semiconductor manufacturing example, three responses, yield (Bin1), N-channel Id effective (NIDEFF), and P-channel Id effective (PIDEFF), are plotted against each other in a scatterplot matrix. We can see that Bin1 is positively correlated with NIDEFF and negatively correlated with PIDEFF. Also, as expected, NIDEFF is negatively correlated with PIDEFF. This kind of information will prove to be useful when we build models for yield improvement.
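A minimal sketch of producing a scatterplot matrix like the one described above. The data file name is hypothetical; the column names follow the example.

    import pandas as pd
    import matplotlib.pyplot as plt
    from pandas.plotting import scatter_matrix

    data = pd.read_csv("wafer_responses.csv")          # hypothetical file
    responses = data[["Bin1", "NIDEFF", "PIDEFF"]]

    scatter_matrix(responses, diagonal="hist")         # pairwise scatter plots of the responses
    print(responses.corr())                            # numerical view of the same relationships
    plt.show()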


3.4.2.2. Exploring Main Effects

The next step is to look for main effects
The next step in the exploratory analysis of our data is to see which factors have an effect on which response variables and to quantify that effect. Scatter plots and box plots will be the tools of choice here.

Watch out for varying sample sizes across levels
This step is relatively self-explanatory. However, there are two points of caution. First, be cognizant of not only the trends in these graphs but also the amount of data represented in those trends. This is especially true for categorical explanatory variables. There may be many more observations in some levels of the categorical variable than in others. In any event, take unequal sample sizes into account when making inferences.

Graph implicit as well as explicit explanatory variables
The second point is to be sure to graph the responses against implicit explanatory variables (such as observation order) as well as the explicit explanatory variables. There may be interesting insights in these hidden explanatory variables.

Example: wafer processing
In the example below, we have collected data on the particles added to a wafer during a particular processing step. We ran a number of cassettes through the process and sampled wafers from certain slots in the cassette. We also kept track of which load lock the wafers passed through. This was done for two different process temperatures. We measured both small particles (< 2 microns) and large particles (> 2 microns). We plot the responses (particle counts) against each of the explanatory variables.

Cassette does not appear to be an important factor for small or large particles
This first graph is a box plot of the number of small particles added for each cassette type. The "X"s in the plot represent the maximum, median, and minimum number of particles.


The second graph is a box plot of the number of large particles added for each cassette type.

We conclude from these two box plots that cassette does not appear to be an important factor for small or large particles.

There is a difference between slots for small particles, one slot is different for large particles
We next generate box plots of small and large particles for the slot variable. First, the box plot for small particles. Next, the box plot for large particles.

We conclude that there is a difference between slots for small particles. We also conclude that one slot appears to be different for large particles.

Load lock may have a slight effect for small and large particles
We next generate box plots of small and large particles for the load lock variable. First, the box plot for small particles. Next, the box plot for large particles.

We conclude that there may be a slight effect for load lock for small and large particles.

For small particles, temperature has a strong effect on both location and spread. For large particles, there may be a slight temperature effect but this may just be due to the outliers
We next generate box plots of small and large particles for the temperature variable. First, the box plot for small particles. Next, the box plot for large particles.

We conclude that temperature has a strong effect on both location and spread for small particles. We conclude that there might be a small temperature effect for large particles, but this may just be due to outliers.
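A minimal sketch of the main-effects box plots used above (the file and column names are hypothetical): one pair of plots per explanatory variable of interest.

    import pandas as pd
    import matplotlib.pyplot as plt

    data = pd.read_csv("particle_counts.csv")   # hypothetical file with the wafer data

    # Box plots of each response for each level of an explanatory variable
    for factor in ["cassette", "slot", "load_lock", "temperature"]:
        data.boxplot(column="small_particles", by=factor)
        data.boxplot(column="large_particles", by=factor)
    plt.show()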


3.4.2.3. Exploring First Order Interactions

It is important to identify interactions
The final step (and perhaps the most important one) in the exploration phase is to find any first order interactions. When the difference in the response between the levels of one factor is not the same for all of the levels of another factor, we say we have an interaction between those two factors. When we are trying to optimize responses based on factor settings, interactions provide for compromise.

The eyes can be deceiving - be careful
Interactions can be seen visually by using nested box plots. However, caution should be exercised when identifying interactions through graphical means alone. Any graphically identified interactions should be verified by numerical methods as well.

Previous example continued
To continue the previous example, given below are nested box plots of the small and large particles. The load lock is nested within the two temperature values. There is some evidence of possible interaction between these two factors. The effect of load lock is stronger at the lower temperature than at the higher one. This effect is stronger for the smaller particles than for the larger ones. As this example illustrates, when you have significant interactions the main effects must be interpreted conditionally. That is, the main effects do not tell the whole story by themselves.

For small particles, the load lock effect is not as strong for high temperature as it is for low temperature
The following is the box plot of small particles for load lock nested within temperature. We conclude from this plot that for small particles, the load lock effect is not as strong for high temperature as it is for low temperature.

The same may be true for large particles but not as strongly
The following is the box plot of large particles for load lock nested within temperature. We conclude from this plot that for large particles, the load lock effect may not be as strong for high temperature as it is for low temperature. However, this effect is not as strong as it is for small particles.

3.4.3. Building Models

Black box models; in our data collection plan we drew process model pictures
When we develop a data collection plan we build black box models of the process we are studying, like the one below:


Numerical models are explicit representations of our process model pictures
In the Exploring Relationships section, we looked at how to identify the input/output relationships through graphical methods. However, if we want to quantify the relationships and test them for statistical significance, we must resort to building mathematical models.

Polynomial models are generic descriptors of our output surface
There are two cases that we will cover for building mathematical models. If our goal is to develop an empirical prediction equation or to identify statistically significant explanatory variables and quantify their influence on output responses, we typically build polynomial models. As the name implies, these are polynomial functions (typically linear or quadratic functions) that describe the relationships between the explanatory variables and the response variable.

Physical models describe the underlying physics of our processes
On the other hand, if our goal is to fit an existing theoretical equation, then we want to build physical models. Again, as the name implies, this pertains to the case when we already have equations representing the physics involved in the process and we want to estimate specific parameter values.

3.4.3.1. Fitting Polynomial Models

Polynomial models are a great tool for determining which input factors drive responses and in what direction
We use polynomial models to estimate and predict the shape of response values over a range of input parameter values. Polynomial models are a great tool for determining which input factors drive responses and in what direction. These are also the most common models used for analysis of designed experiments. A quadratic (second-order) polynomial model for two explanatory variables has the form of the equation below. The single x-terms are called the main effects. The squared terms are called the quadratic effects and are used to model curvature in the response surface. The cross-product terms are used to model interactions between the explanatory variables.

    y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \beta_{12} x_1 x_2 + \varepsilon

We generally don't need more than second-order equations
In most engineering and manufacturing applications we are concerned with at most second-order polynomial models. Polynomial equations obviously could become much more complicated as we increase the number of explanatory variables and hence the number of cross-product terms. Fortunately, we rarely see significant interaction terms above the two-factor level. This helps to keep the equations at a manageable level.

Use multiple regression to fit polynomial models
When the number of factors is small (less than 5), the complete polynomial equation can be fitted using the technique known as multiple regression. When the number of factors is large, we should use a technique known as stepwise regression. Most statistical analysis programs have a stepwise regression capability. We just enter all of the terms of the polynomial models and let the software choose which terms best describe the data. For a more thorough discussion of this topic and some examples, refer to the process improvement chapter.
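A minimal sketch of fitting the quadratic model above by multiple regression (ordinary least squares), using simulated data; in practice the x's and y would come from the characterization study.

    import numpy as np

    rng = np.random.default_rng(0)
    x1 = rng.uniform(-1, 1, 30)
    x2 = rng.uniform(-1, 1, 30)
    # Simulated response with main effects, an interaction, and noise
    y = 5 + 2 * x1 - 3 * x2 + 1.5 * x1 * x2 + rng.normal(0, 0.2, 30)

    # Design matrix: intercept, main effects, quadratic effects, cross-product term
    X = np.column_stack([np.ones_like(x1), x1, x2, x1**2, x2**2, x1 * x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(dict(zip(["b0", "b1", "b2", "b11", "b22", "b12"], coef.round(2))))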


3.4.3.2. Fitting Physical Models

Sometimes we want to use a physical model
Sometimes, rather than approximating response behavior with polynomial models, we know and can model the physics behind the underlying process. In these cases we would want to fit physical models to our data. This kind of modeling allows for better prediction and is less subject to variation than polynomial models (as long as the underlying process doesn't change).

We will use a CMP process to illustrate
We will illustrate this concept with an example. We have collected data on a chemical/mechanical planarization (CMP) process at a particular semiconductor processing step. In this process, wafers are polished using a combination of chemicals in a polishing slurry and polishing pads. We polished a number of wafers for differing periods of time in order to calculate material removal rates.

CMP removal rate can be modeled with a non-linear equation
From first principles we know that removal rate changes with time. Early on, removal rate is high, and as the wafer becomes more planar the removal rate declines. This is easily modeled with an exponential function of the form:

    removal rate = p1 + p2 * exp(p3 * time)

where p1, p2, and p3 are the parameters we want to estimate.

A non-linear regression routine was used to fit the data to the equation
The equation was fit to the data using a non-linear regression routine. A plot of the original data and the fitted line is given in the image below. The fit is quite good. This fitted equation was subsequently used in process optimization work.
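A minimal sketch of the non-linear fit described above, using made-up removal-rate data and a standard non-linear least-squares routine.

    import numpy as np
    from scipy.optimize import curve_fit

    def removal_rate(t, p1, p2, p3):
        return p1 + p2 * np.exp(p3 * t)

    # Hypothetical polish times (minutes) and removal rates (Angstroms/minute)
    time = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
    rate = np.array([310.0, 262.0, 232.0, 216.0, 206.0, 201.0])

    params, _ = curve_fit(removal_rate, time, rate, p0=(200.0, 150.0, -1.0))
    print("p1, p2, p3 =", params.round(2))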


3.4.4. Analyzing Variance Structure

Studying variation is important in PPC
One of the most common activities in process characterization work is to study the variation associated with the process and to try to determine the important sources of that variation. This is called analysis of variance. Refer to the section of this chapter on ANOVA models for a discussion of the theory behind this kind of analysis.

The key is to know the structure
The key to performing an analysis of variance is identifying the structure represented by the data. In the ANOVA models section we discussed one-way layouts and two-way layouts where the factors are either crossed or nested. Review these sections if you want to learn more about ANOVA structural layouts.

To perform the analysis, we just identify the structure, enter the data for each of the factors and levels into a statistical analysis program and then interpret the ANOVA table and other output. This is all illustrated in the example below.

Example: furnace oxide thickness with a 1-way layout
The example is a furnace operation in semiconductor manufacture where we are growing an oxide layer on a wafer. Each lot of wafers is placed in quartz containers (boats) and then placed in a long tube-furnace. They are then raised to a certain temperature and held for a period of time in a gas flow. We want to understand the important factors in this operation. The furnace is broken down into four sections (zones) and two wafers from each lot in each zone are measured for the thickness of the oxide layer.

Look at effect of zone location on oxide thickness
The first thing to look at is the effect of zone location on the oxide thickness. This is a classic one-way layout. The factor is furnace zone and we have four levels. A plot of the data and an ANOVA table are given below.


The zone effect is masked by the lot-to-lot variation

ANOVA table
Analysis of Variance
Source      DF     SS          Mean Square    F Ratio     Prob > F
Zone         3     912.6905    304.23         0.467612    0.70527
Within     164     106699.1    650.604

Let's account for lot with a nested layout
From the graph there does not appear to be much of a zone effect; in fact, the ANOVA table indicates that it is not significant. The problem is that variation due to lots is so large that it is masking the zone effect. We can fix this by adding a factor for lot. By treating this as a nested two-way layout, we obtain the ANOVA table below.

Now both lot and zone are revealed as important
Analysis of Variance
Source        DF     SS          Mean Square    F Ratio    Prob > F
Lot           20     61442.29    3072.11        5.37404    1.39e-7
Zone[lot]     63     36014.5     571.659        4.72864    3.9e-11
Within        84     10155       120.893

Conclusions
Since the "Prob > F" is less than .05 for both lot and zone, we know that these factors are statistically significant at the 95% level of confidence.

3.4.5. Assessing Process Stability

A process is stable if it has a constant mean and a constant variance over time
A manufacturing process cannot be released to production until it has been proven to be stable. Also, we cannot begin to talk about process capability until we have demonstrated stability in our process. A process is said to be stable when all of the response parameters that we use to measure the process have both constant means and constant variances over time, and also have a constant distribution. This is equivalent to our earlier definition of controlled variation.

The graphical tool we use to assess stability is the scatter plot or the control chart
The graphical tool we use to assess process stability is the scatter plot. We collect a sufficient number of independent samples (greater than 100) from our process over a sufficiently long period of time (this can be specified in days, hours of processing time or number of parts processed) and plot them on a scatter plot with sample order on the x-axis and the sample value on the y-axis. The plot should look like constant random variation about a constant mean. Sometimes it is helpful to calculate control limits and plot them on the scatter plot along with the data. The two plots in the controlled variation example are good illustrations of stable and unstable processes.

Numerically, we assess stationarity using the autocovariance function
Numerically, we evaluate process stability through a time series analysis concept known as stationarity. This is just another way of saying that the process has a constant mean and a constant variance. The numerical technique used to assess stationarity is the autocovariance function.

Graphical methods usually good enough
Typically, graphical methods are good enough for evaluating process stability. The numerical methods are generally only used for modeling purposes.
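A minimal sketch of the graphical stability check described above, using simulated measurements and simple 3-sigma limits.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    values = rng.normal(loc=560.0, scale=25.0, size=150)   # simulated film thicknesses

    mean = values.mean()
    sigma = values.std(ddof=1)

    plt.plot(np.arange(1, values.size + 1), values, "o")
    plt.axhline(mean)
    plt.axhline(mean + 3 * sigma)   # upper control limit
    plt.axhline(mean - 3 * sigma)   # lower control limit
    plt.xlabel("sample order")
    plt.ylabel("sample value")
    plt.show()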


3.4.6. Assessing Process Capability

Capability compares a process against its specification
Process capability analysis entails comparing the performance of a process against its specifications. We say that a process is capable if virtually all of the possible variable values fall within the specification limits.

Use a capability chart
Graphically, we assess process capability by plotting the process specification limits on a histogram of the observations. If the histogram falls within the specification limits, then the process is capable. This is illustrated in the graph below. Note how the process is shifted below target and the process variation is too large. This is an example of an incapable process.

Notice how the process is off target and has too much variation

Numerically, we measure capability with a capability index. The general equation for the capability index, Cp, is:


    C_p = \frac{USL - LSL}{6\sigma}

where USL and LSL are the upper and lower specification limits and \sigma is the process standard deviation.

Interpretation of the Cp index
This equation just says that the measure of our process capability is how much of our observed process variation is covered by the process specifications. In this case the process variation is measured by 6 standard deviations (+/- 3 on each side of the mean). Clearly, if Cp > 1.0, then the process specification covers almost all of our process observations.

Cp does not account for a process that is off center
The only problem with the Cp index is that it does not account for a process that is off-center. We can modify this equation slightly to account for off-center processes to obtain the Cpk index as follows:

    C_{pk} = \frac{\min(USL - \mu,\ \mu - LSL)}{3\sigma}

Cpk accounts for a process being off center
This equation just says to take the minimum distance between our specification limits and the process mean and divide it by 3 standard deviations to arrive at the measure of process capability. This is all covered in more detail in the process capability section of the process monitoring chapter. For the example above, note how the Cpk value is less than the Cp value. This is because the process distribution is not centered between the specification limits.

3.4.7. Checking Assumptions

Check the normality of the data
Many of the techniques discussed in this chapter, such as hypothesis tests, control charts and capability indices, assume that the underlying structure of the data can be adequately modeled by a normal distribution. Many times we encounter data where this is not the case.

Some causes of non-normality
There are several things that could cause the data to appear non-normal, such as:
● The data come from two or more different sources. This type of data will often have a multi-modal distribution. This can be solved by identifying the reason for the multiple sets of data and analyzing the data separately.
● The data come from an unstable process. This type of data is nearly impossible to analyze because the results of the analysis will have no credibility due to the changing nature of the process.
● The data were generated by a stable, yet fundamentally non-normal mechanism. For example, particle counts are non-normal by the very nature of the particle generation process. Data of this type can be handled using transformations.

We can sometimes transform the data to make it look normal
For the last case, we could try transforming the data using what is known as a power transformation. The power transformation is given by the equation:

    Y^{(\lambda)} = Y^{\lambda}, with Y^{(0)} defined as \ln(Y)

where Y represents the data and lambda is the transformation value. Lambda is typically any value between -2 and 2. Some of the more common values for lambda are 0, 1/2, and -1, which give the following transformations: ln(Y), sqrt(Y), and 1/Y, respectively.

General algorithm for trying to make non-normal data approximately normal
The general algorithm for trying to make non-normal data appear to be approximately normal is to:
1. Determine if the data are non-normal. (Use normal probability plot and histogram).
2. Find a transformation that makes the data look approximately normal, if possible. Some data sets may include zeros (i.e., particle data). If the data set does include zeros, you must first add a constant value to the data and then transform the results.
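A minimal sketch of step 2 of the algorithm above: apply several candidate power transformations to (simulated) skewed count data and compare the histograms.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(2)
    counts = rng.lognormal(mean=2.0, sigma=0.8, size=200) + 1   # simulated counts, no zeros

    transforms = {
        "raw": counts,
        "ln (lambda = 0)": np.log(counts),
        "sqrt (lambda = 1/2)": np.sqrt(counts),
        "inverse (lambda = -1)": 1.0 / counts,
    }

    fig, axes = plt.subplots(1, 4, figsize=(12, 3))
    for ax, (name, values) in zip(axes, transforms.items()):
        ax.hist(values, bins=20)
        ax.set_title(name)
    plt.show()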


Example: particle count data
As an example, let's look at some particle count data from a semiconductor processing step. Count data are inherently non-normal. Below are histograms and normal probability plots for the original data and the ln, sqrt and inverse of the data. You can see that the log transform does the best job of making the data appear as if it is normal. All analyses can be performed on the log-transformed data and the assumptions will be approximately satisfied.

The original data is non-normal, the log transform looks fairly normal.

Neither the square root nor the inverse transformation looks normal.


3.5. Case Studies

Summary
This section presents several case studies that demonstrate the application of production process characterizations to specific problems.

Table of Contents
The following case studies are available.
1. Furnace Case Study
2. Machine Case Study

3.5.1. Furnace Case Study

Introduction
This case study analyzes a furnace oxide growth process.

Table of Contents
The case study is broken down into the following steps.
1. Background and Data
2. Initial Analysis of Response Variable
3. Identify Sources of Variation
4. Analysis of Variance
5. Final Conclusions
6. Work This Example Yourself


3.5.1.1. Background and Data

Introduction
In a semiconductor manufacturing process flow, we have a step whereby we grow an oxide film on the silicon wafer using a furnace. In this step, a cassette of wafers is placed in a quartz "boat" and the boats are placed in the furnace. The furnace can hold four boats. A gas flow is created in the furnace and it is brought up to temperature and held there for a specified period of time (which corresponds to the desired oxide thickness). This study was conducted to determine if the process was stable and to characterize sources of variation so that a process control strategy could be developed.

Goal
The goal of this study is to determine if this process is capable of consistently growing oxide films with a thickness of 560 Angstroms +/- 100 Angstroms. An additional goal is to determine important sources of variation for use in the development of a process control strategy.

Process Model
In the picture below we are modeling this process with one output (film thickness) that is influenced by four controlled factors (gas flow, pressure, temperature and time) and two uncontrolled factors (run and zone). The four controlled factors are part of our recipe and will remain constant throughout this study. We know that there is run-to-run variation that is due to many different factors (input material variation, variation in consumables, etc.). We also know that the different zones in the furnace have an effect. A zone is a region of the furnace tube that holds one boat. There are four zones in these tubes. The zones in the middle of the tube grow oxide a little bit differently from the ones on the ends. In fact, there are temperature offsets in the recipe to help minimize this problem.

Sensitivity Model
The sensitivity model for this process is fairly straightforward and is given in the figure below. The effects of the machine are mostly related to the preventative maintenance (PM) cycle. We want to make sure the quartz tube has been cleaned recently, the mass flow controllers are in good shape and the temperature controller has been calibrated recently. The same is true of the measurement equipment where the thickness readings will be taken. We want to make sure a gauge study has been performed. For material, the incoming wafers will certainly have an effect on the outgoing thickness, as will the quality of the gases used. Finally, the recipe will have an effect, including gas flow, temperature offset for the different zones, and temperature profile (how quickly we raise the temperature, how long we hold it and how quickly we cool it off).


Sampling Plan
Given our goal statement and process modeling, we can now define a sampling plan. The primary goal is to determine if the process is capable. This just means that we need to monitor the process over some period of time and compare the estimates of process location and spread to the specifications. An additional goal is to identify sources of variation to aid in setting up a process control strategy. Some obvious sources of variation are incoming wafers, run-to-run variability, variation due to operators or shift, and variation due to zones within a furnace tube. One additional constraint that we must work under is that this study should not have a significant impact on normal production operations.

Given these constraints, the following sampling plan was selected. It was decided to monitor the process for one day (three shifts). Because this process is operator independent, we will not keep shift or operator information but just record run number. For each run, we will randomly assign cassettes of wafers to a zone. We will select two wafers from each zone after processing and measure two sites on each wafer. This plan should give reasonable estimates of run-to-run variation and within zone variability as well as good overall estimates of process location and spread.

We are expecting readings around 560 Angstroms. We would not expect many readings above 700 or below 400. The measurement equipment is accurate to within 0.5 Angstroms which is well within the accuracy needed for this study.

Data
The following are the data that were collected for this study.

RUN ZONE WAFER THICKNESS
--------------------------------
1 1 1 546
1 1 2 540
1 2 1 566
1 2 2 564
1 3 1 577
1 3 2 546
1 4 1 543
1 4 2 529
2 1 1 561
2 1 2 556
2 2 1 577
2 2 2 553
2 3 1 563
2 3 2 577
2 4 1 556
2 4 2 540
3 1 1 515
3 1 2 520
3 2 1 548
3 2 2 542
3 3 1 505
3 3 2 487
3 4 1 506
3 4 2 514
4 1 1 568
4 1 2 584
4 2 1 570
4 2 2 545
4 3 1 589
4 3 2 562
4 4 1 569
4 4 2 571
5 1 1 550
5 1 2 550
5 2 1 562
5 2 2 580
5 3 1 560
5 3 2 554
5 4 1 545
5 4 2 546
6 1 1 584
6 1 2 581
6 2 1 567
6 2 2 558
6 3 1 556
6 3 2 560
6 4 1 591
6 4 2 599

7 1 1 593
7 1 2 626
7 2 1 584
7 2 2 559
7 3 1 634
7 3 2 598
7 4 1 569
7 4 2 592
8 1 1 522
8 1 2 535
8 2 1 535
8 2 2 581
8 3 1 527
8 3 2 520
8 4 1 532
8 4 2 539
9 1 1 562
9 1 2 568
9 2 1 548
9 2 2 548
9 3 1 533
9 3 2 553
9 4 1 533
9 4 2 521
10 1 1 555
10 1 2 545
10 2 1 584
10 2 2 572
10 3 1 546
10 3 2 552
10 4 1 586
10 4 2 584
11 1 1 565
11 1 2 557
11 2 1 583
11 2 2 585
11 3 1 582
11 3 2 567
11 4 1 549
11 4 2 533
12 1 1 548
12 1 2 528
12 2 1 563
12 2 2 588
12 3 1 543
12 3 2 540
12 4 1 585
12 4 2 586
13 1 1 580
13 1 2 570
13 2 1 556
13 2 2 569
13 3 1 609
13 3 2 625
13 4 1 570
13 4 2 595
14 1 1 564
14 1 2 555
14 2 1 585
14 2 2 588
14 3 1 564
14 3 2 583
14 4 1 563
14 4 2 558
15 1 1 550
15 1 2 557
15 2 1 538
15 2 2 525
15 3 1 556
15 3 2 547
15 4 1 534
15 4 2 542
16 1 1 552
16 1 2 547
16 2 1 563
16 2 2 578
16 3 1 571
16 3 2 572
16 4 1 575
16 4 2 584
17 1 1 549
17 1 2 546
17 2 1 584
17 2 2 593
17 3 1 567
17 3 2 548
17 4 1 606
17 4 2 607
18 1 1 539
18 1 2 554
18 2 1 533
18 2 2 535
18 3 1 522
18 3 2 521
18 4 1 547
18 4 2 550
19 1 1 610
19 1 2 592
19 2 1 587
19 2 2 587
19 3 1 572
19 3 2 612
19 4 1 566
19 4 2 563
20 1 1 569
20 1 2 609
20 2 1 558
20 2 2 555


20 3 1 577
20 3 2 579
20 4 1 552
20 4 2 558
21 1 1 595
21 1 2 583 3. Production Process Characterization
21 2 1 599 3.5. Case Studies
21 2 2 602 3.5.1. Furnace Case Study
21 3 1 598
21 3 2 616
21 4 1 580 3.5.1.2. Initial Analysis of Response Variable
21 4 2 575
Initial Plots The initial step is to assess data quality and to look for anomalies. This is done by generating a
of Response normal probability plot, a histogram, and a boxplot. For convenience, these are generated on a
Variable single page.

Conclusions We can make the following conclusions based on these initial plots.
From the ● The box plot indicates one outlier. However, this outlier is only slightly smaller than the
Plots other numbers.
● The normal probability plot and the histogram (with an overlaid normal density) indicate
that this data set is reasonably approximated by a normal distribution.
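Illustrative Python Sketch
The handbook generates these plots with Dataplot. The following is a minimal sketch of the same three data-quality plots in Python; the data file name and column names are assumptions made for illustration only.

    # Minimal sketch (not the handbook's Dataplot macro): normal probability plot,
    # histogram, and box plot of the film thickness response.
    # The file name "furnace.csv" and column name "thickness" are assumed.
    import pandas as pd
    import matplotlib.pyplot as plt
    from scipy import stats

    df = pd.read_csv("furnace.csv")          # assumed columns: run, zone, wafer, thickness
    thickness = df["thickness"]

    fig, axes = plt.subplots(1, 3, figsize=(12, 4))
    stats.probplot(thickness, dist="norm", plot=axes[0])   # normal probability plot
    axes[0].set_title("Normal probability plot")
    axes[1].hist(thickness, bins=15)                        # histogram
    axes[1].set_title("Histogram")
    axes[2].boxplot(thickness)                              # box plot
    axes[2].set_title("Box plot")
    plt.tight_layout()
    plt.show()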


Parameter Estimates
Parameter estimates for the film thickness are summarized in the following table.

    Parameter Estimates
    Type         Parameter            Estimate   Lower (95%) Confidence Bound   Upper (95%) Confidence Bound
    Location     Mean                 563.0357   559.1692                       566.9023
    Dispersion   Standard Deviation   25.3847    22.9297                        28.4331

Quantiles
Quantiles for the film thickness are summarized in the following table.

    Quantiles for Film Thickness
    100.0%   Maximum          634.00
     99.5%                    634.00
     97.5%                    615.10
     90.0%                    595.00
     75.0%   Upper Quartile   582.75
     50.0%   Median           562.50
     25.0%   Lower Quartile   546.25
     10.0%                    532.90
      2.5%                    514.23
      0.5%                    487.00
      0.0%   Minimum          487.00

Capability Analysis
From the above preliminary analysis, it looks reasonable to proceed with the capability analysis.

Dataplot generated the following capability analysis.

    ****************************************************
    *              CAPABILITY ANALYSIS                 *
    * NUMBER OF OBSERVATIONS          =    168         *
    * MEAN                            =    563.03571   *
    * STANDARD DEVIATION              =     25.38468   *
    ****************************************************
    * LOWER SPEC LIMIT (LSL)          =    460.00000   *
    * UPPER SPEC LIMIT (USL)          =    660.00000   *
    * TARGET (TARGET)                 =    560.00000   *
    * USL COST (USLCOST)              =    UNDEFINED   *
    ****************************************************
    * CP                              =      1.31313   *
    * CP LOWER 95% CI                 =      1.17234   *
    * CP UPPER 95% CI                 =      1.45372   *
    * CPL                             =      1.35299   *
    * CPL LOWER 95% CI                =      1.21845   *
    * CPL UPPER 95% CI                =      1.48753   *
    * CPU                             =      1.27327   *
    * CPU LOWER 95% CI                =      1.14217   *
    * CPU UPPER 95% CI                =      1.40436   *
    * CPK                             =      1.27327   *
    * CPK LOWER 95% CI                =      1.12771   *
    * CPK UPPER 95% CI                =      1.41882   *
    * CNPK                            =      1.35762   *
    * CPM                             =      1.30384   *
    * CPM LOWER 95% CI                =      1.16405   *
    * CPM UPPER 95% CI                =      1.44344   *
    * CC                              =      0.00460   *
    * ACTUAL % DEFECTIVE              =      0.00000   *
    * THEORETICAL % DEFECTIVE         =      0.00915   *
    * ACTUAL (BELOW) % DEFECTIVE      =      0.00000   *
    * THEORETICAL(BELOW) % DEFECTIVE  =      0.00247   *
    * ACTUAL (ABOVE) % DEFECTIVE      =      0.00000   *
    * THEORETICAL(ABOVE) % DEFECTIVE  =      0.00668   *
    * EXPECTED LOSS                   =    UNDEFINED   *
    ****************************************************

Summary of Percent Defective
From the above capability analysis output, we can summarize the percent defective (i.e., the number of items outside the specification limits) in the following table.

    Percentage Outside Specification Limits
    Specification               Value      Percent                                        Actual   Theoretical (% Based On Normal)
    Lower Specification Limit   460        Percent Below LSL = 100*Φ((LSL - x̄)/s)         0.0000   0.0025%
    Upper Specification Limit   660        Percent Above USL = 100*(1 - Φ((USL - x̄)/s))   0.0000   0.0067%
    Specification Target        560        Combined Percent Below LSL and Above USL       0.0000   0.0091%
    Standard Deviation          25.38468

with Φ denoting the normal cumulative distribution function, x̄ the sample mean, and s the sample standard deviation.
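Illustrative Python Sketch
The capability indices and theoretical percent defective reported above follow directly from the sample mean, standard deviation, and specification limits. A minimal sketch of those computations (the handbook itself uses Dataplot):

    # Reproduces the basic capability numbers from the summary statistics above.
    from scipy.stats import norm

    mean, s = 563.03571, 25.38468
    lsl, usl = 460.0, 660.0

    cp  = (usl - lsl) / (6 * s)
    cpu = (usl - mean) / (3 * s)
    cpl = (mean - lsl) / (3 * s)
    cpk = min(cpu, cpl)

    pct_below = 100 * norm.cdf((lsl - mean) / s)         # theoretical % below LSL
    pct_above = 100 * (1 - norm.cdf((usl - mean) / s))   # theoretical % above USL

    print(cp, cpl, cpu, cpk)       # approximately 1.313, 1.353, 1.273, 1.273
    print(pct_below, pct_above)    # approximately 0.0025 and 0.0067 (percent)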


Summary of Capability Index Statistics
From the above capability analysis output, we can summarize various capability index statistics in the following table.

    Capability Index Statistics
    Capability Statistic   Index   Lower CI   Upper CI
    CP                     1.313   1.172      1.454
    CPK                    1.273   1.128      1.419
    CPM                    1.304   1.165      1.442
    CPL                    1.353   1.218      1.488
    CPU                    1.273   1.142      1.404

Conclusions
The above capability analysis indicates that the process is capable and we can proceed with the analysis.

3. Production Process Characterization
3.5. Case Studies
3.5.1. Furnace Case Study

3.5.1.3. Identify Sources of Variation

The next part of the analysis is to break down the sources of variation.

Box Plot by Run
The following is a box plot of the thickness by run number.

Conclusions From Box Plot
We can make the following conclusions from this box plot.
1. There is significant run-to-run variation.
2. Although the means of the runs are different, there is no discernable trend due to run.
3. In addition to the run-to-run variation, there is significant within-run variation as well. This suggests that a box plot by furnace location may be useful as well.


Box Plot by Furnace Location
The following is a box plot of the thickness by furnace location.

Conclusions From Box Plot
We can make the following conclusions from this box plot.
1. There is considerable variation within a given furnace location.
2. The variation between furnace locations is small. That is, the locations and scales of each of the four furnace locations are fairly comparable (although furnace location 3 seems to have a few mild outliers).

Box Plot by Wafer
The following is a box plot of the thickness by wafer.

Conclusion From Box Plot
From this box plot, we conclude that wafer does not seem to be a significant factor.

Block Plot
In order to show the combined effects of run, furnace location, and wafer, we draw a block plot of the thickness. Note that for aesthetic reasons, we have used connecting lines rather than enclosing boxes.


3. Production Process Characterization


3.5. Case Studies
3.5.1. Furnace Case Study

3.5.1.4. Analysis of Variance


Analysis of The next step is to confirm our interpretation of the plots in the previous
Variance section by running an analysis of variance.
In this case, we want to run a nested analysis of variance. Although
Dataplot does not perform a nested analysis of variance directly, in this
case we can use the Dataplot ANOVA command with some additional
computations to generate the needed analysis.
The basic steps are to use a one-way ANOVA to compute the appropriate
values for the run variable. We then run a one-way ANOVA with all the
combinations of run and furnace location to compute the "within"
values. The values for furnace location nested within run are then
computed as the difference between the previous two ANOVA runs.
Conclusions From Block Plot
We can draw the following conclusions from this block plot.
1. There is significant variation both between runs and between furnace locations. The between-run variation appears to be greater.
2. Run 3 seems to be an outlier.

The Dataplot macro provides the details of this computation. This computation can be summarized in the following table.

    Analysis of Variance
    Source                   Degrees of Freedom   Sum of Squares   Mean Square Error   F Ratio   Prob > F
    Run                      20                   61,442.29        3,072.11            5.37404   0.0000001
    Furnace Location [Run]   63                   36,014.5         571.659             4.72864   3.85e-11
    Within                   84                   10,155           120.893
    Total                    167                  107,611.8        644.382
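Illustrative Python Sketch
A minimal sketch of the sum-of-squares bookkeeping described above, computed directly from group means rather than with the Dataplot ANOVA command; the data file and column names are assumptions.

    # Nested sums of squares via the two one-way groupings described in the text:
    #   SS(zone within run) = SS(run x zone cells) - SS(run)
    #   SS(within)          = SS(total) - SS(run x zone cells)
    import pandas as pd

    df = pd.read_csv("furnace.csv")          # assumed columns: run, zone, wafer, thickness
    grand = df["thickness"].mean()

    def between_ss(grouped):
        # sum over groups of n_g * (group mean - grand mean)^2
        return sum(len(g) * (g.mean() - grand) ** 2 for _, g in grouped)

    ss_total = ((df["thickness"] - grand) ** 2).sum()
    ss_run   = between_ss(df.groupby("run")["thickness"])
    ss_cells = between_ss(df.groupby(["run", "zone"])["thickness"])

    ss_zone_within_run = ss_cells - ss_run
    ss_within          = ss_total - ss_cells
    print(ss_run, ss_zone_within_run, ss_within)   # approximately 61442.3, 36014.5, 10155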


Components of Variance
From the above analysis of variance table, we can compute the components of variance. Recall that for this data set we have 2 wafers measured at 4 furnace locations for 21 runs. This leads to the following set of equations.

    3072.11 = (4*2)*Var(Run) + 2*Var(Furnace Location) + Var(Within)
    571.659 = 2*Var(Furnace Location) + Var(Within)
    120.893 = Var(Within)
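Illustrative Python Sketch
A minimal sketch of solving these three equations for the variance components (the handbook carries this out with a Dataplot macro):

    # Expected-mean-square equations above, written as a linear system.
    import numpy as np

    # unknowns: Var(Run), Var(Furnace Location), Var(Within)
    A = np.array([[8.0, 2.0, 1.0],
                  [0.0, 2.0, 1.0],
                  [0.0, 0.0, 1.0]])
    b = np.array([3072.11, 571.659, 120.893])

    var_run, var_loc, var_within = np.linalg.solve(A, b)
    print(var_run, var_loc, var_within)   # approximately 312.56, 225.38, 120.89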
Solving these equations yields the following components of variance table.

    Components of Variance
    Component               Variance Component   Percent of Total   Sqrt(Variance Component)
    Run                     312.55694            47.44              17.679
    Furnace Location[Run]   225.38294            34.21              15.013
    Within                  120.89286            18.35              10.995

3. Production Process Characterization
3.5. Case Studies
3.5.1. Furnace Case Study

3.5.1.5. Final Conclusions

Final Conclusions
This simple study of a furnace oxide growth process indicated that the process is capable and showed that both run-to-run and zone-within-run are significant sources of variation. We should take this into account when designing the control strategy for this process. The results also pointed to where we should look when we perform process improvement activities.


2. Analyze the response variable.

1. Normal probability plot, 1. Initial plots indicate that the


3. Production Process Characterization
3.5. Case Studies box plot, and histogram of film thickness is reasonably
3.5.1. Furnace Case Study film thickness. approximated by a normal
distribution with no significant
outliers.
3.5.1.6. Work This Example Yourself
View This page allows you to repeat the analysis outlined in the case study 2. Compute summary statistics 2. Mean is 563.04 and standard
Dataplot description on the previous page using Dataplot, if you have and quantiles of film deviation is 25.38. Data range
Macro for downloaded and installed it. Output from each analysis step below will thickness. from 487 to 634.
this Case be displayed in one or more of the Dataplot windows. The four main
Study windows are the Output window, the Graphics window, the Command
History window and the Data Sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the 3. Capability analysis indicates
3. Perform a capability analysis. that the process is capable.
bottom is a command entry window where commands can be typed in.

Data Analysis Steps Results and Conclusions


3. Identify Sources of Variation.

Click on the links below to start Dataplot and run


1. Generate a box plot by run. 1. The box plot shows significant
this case study yourself. Each step may use results The links in this column will connect you with more
from previous steps, so please be patient. Wait until detailed information about each analysis step from the variation both between runs and
the software verifies that the current step is complete case study description. within runs.
before clicking on the next step.

2. Generate a box plot by furnace 2. The box plot shows significant


location. variation within furnace location
1. Get set up and started. but not between furnace location.

1. Read in the data. 1. You have read 4 columns of numbers


into Dataplot, variables run, zone, 3. Generate a box plot by wafer. 3. The box plot shows no significant
wafer, and filmthic. effect for wafer.

4. Generate a block plot.


4. The block plot shows both run
and furnace location are
significant.


4. Perform an Analysis of Variance

1. Perform the analysis of 1. The results of the ANOVA are


variance and compute the summarized in an ANOVA table 3. Production Process Characterization
components of variance. and a components of variance 3.5. Case Studies
table.

3.5.2. Machine Screw Case Study


Introduction This case study analyzes three automatic screw machines with the intent
of replacing one of them.

Table of The case study is broken down into the following steps.
Contents 1. Background and Data
2. Box Plots by Factor
3. Analysis of Variance
4. Throughput
5. Final Conclusions
6. Work This Example Yourself



3. Production Process Characterization


3.5. Case Studies
3.5.2. Machine Screw Case Study

3.5.2.1. Background and Data


Introduction A machine shop has three automatic screw machines that produce
various parts. The shop has enough capital to replace one of the
machines. The quality control department has been asked to conduct a
study and make a recommendation as to which machine should be
replaced. It was decided to monitor one of the most commonly
produced parts (an 1/8th inch diameter pin) on each of the machines
and see which machine is the least stable.

Goal The goal of this study is to determine which machine is least stable in
manufacturing a steel pin with a diameter of .125 +/- .003 inches.
Stability will be measured in terms of a constant variance about a
constant mean. If all machines are stable, the decision will be based on
process variability and throughput. Namely, the machine with the
highest variability and lowest throughput will be selected for
replacement.

Process Model
The process model for this operation is trivial and need not be addressed.

Sensitivity Model
The sensitivity model, however, is important and is given in the figure below. The material is not very important. All machines will receive barstock from the same source and the coolant will be the same. The method is important. Each machine is slightly different and the operator must make adjustments to the speed (how fast the part rotates), feed (how quickly the cut is made) and stops (where cuts are finished) for each machine. The same operator will be running all three machines simultaneously. Measurement is not too important. An experienced QC engineer will be collecting the samples and making the measurements. Finally, the machine condition is really what this study is all about. The wear on the ways and the lead screws will largely determine the stability of the machining process. Also, tool wear is important. The same type of tool inserts will be used on all three machines. The tool insert wear will be monitored by the operator and they will be changed as needed.

Sampling Plan
Given our goal statement and process modeling, we can now define a sampling plan. The primary goal is to determine if the process is stable and to compare the variances of the three machines. We also need to monitor throughput so that we can compare the productivity of the three machines.

There is an upcoming three-day run of the particular part of interest, so this study will be conducted on that run. There is a suspected time-of-day effect that we must account for. It is sometimes the case that the machines do not perform as well in the morning, when they are first started up, as they do later in the day. To account for this we will sample parts in the morning and in the afternoon. So as not to impact other QC operations too severely, it was decided to sample 10 parts, twice a day, for three days from each of the three machines. Daily throughput will be recorded as well.

We are expecting readings around .125 +/- .003 inches. The parts will be measured using a standard micrometer with readings recorded to 0.0001 of an inch. Throughput will be measured by reading the part counters on the machines at the end of each day.


1 2 2 9 0.1252
Data The following are the data that were collected for this study. 1 2 2 10 0.1243
1 3 1 1 0.1255
1 3 1 2 0.1237
    MACHINE   DAY     TIME     SAMPLE   DIAMETER
    (1-3)     (1-3)   1 = AM   (1-10)   (inches)
                      2 = PM
    ------------------------------------------------------
    1 3 1 1 0.1255
    1 3 1 2 0.1237
    1 3 1 3 0.1235
1 3 1 6 0.1266
1 1 1 1 0.1247
1 3 1 7 0.1242
1 1 1 2 0.1264
1 3 1 8 0.1231
1 1 1 3 0.1252
1 3 1 9 0.1232
1 1 1 4 0.1253
1 3 1 10 0.1244
1 1 1 5 0.1263
1 3 2 1 0.1233
1 1 1 6 0.1251
1 3 2 2 0.1237
1 1 1 7 0.1254
1 3 2 3 0.1244
1 1 1 8 0.1239
1 3 2 4 0.1254
1 1 1 9 0.1235
1 3 2 5 0.1247
1 1 1 10 0.1257
1 3 2 6 0.1254
1 1 2 1 0.1271
1 3 2 7 0.1258
1 1 2 2 0.1253
1 3 2 8 0.126
1 1 2 3 0.1265
1 3 2 9 0.1235
1 1 2 4 0.1254
1 3 2 10 0.1273
1 1 2 5 0.1243
2 1 1 1 0.1239
1 1 2 6 0.124
2 1 1 2 0.1239
1 1 2 7 0.1246
2 1 1 3 0.1239
1 1 2 8 0.1244
2 1 1 4 0.1231
1 1 2 9 0.1271
2 1 1 5 0.1221
1 1 2 10 0.1241
2 1 1 6 0.1216
1 2 1 1 0.1251
2 1 1 7 0.1233
1 2 1 2 0.1238
2 1 1 8 0.1228
1 2 1 3 0.1255
2 1 1 9 0.1227
1 2 1 4 0.1234
2 1 1 10 0.1229
1 2 1 5 0.1235
2 1 2 1 0.122
1 2 1 6 0.1266
2 1 2 2 0.1239
1 2 1 7 0.125
2 1 2 3 0.1237
1 2 1 8 0.1246
2 1 2 4 0.1216
1 2 1 9 0.1243
2 1 2 5 0.1235
1 2 1 10 0.1248
2 1 2 6 0.124
1 2 2 1 0.1248
2 1 2 7 0.1224
1 2 2 2 0.1235
2 1 2 8 0.1236
1 2 2 3 0.1243
2 1 2 9 0.1236
1 2 2 4 0.1265
2 1 2 10 0.1217
1 2 2 5 0.127
2 2 1 1 0.1247
1 2 2 6 0.1229
2 2 1 2 0.122
1 2 2 7 0.125
2 2 1 3 0.1218
1 2 2 8 0.1248
2 2 1 4 0.1237


2 2 1 5 0.1234 3 1 2 1 0.1228
2 2 1 6 0.1229 3 1 2 2 0.126
2 2 1 7 0.1235 3 1 2 3 0.1242
2 2 1 8 0.1237 3 1 2 4 0.1236
2 2 1 9 0.1224 3 1 2 5 0.1248
2 2 1 10 0.1224 3 1 2 6 0.1243
2 2 2 1 0.1239 3 1 2 7 0.126
2 2 2 2 0.1226 3 1 2 8 0.1231
2 2 2 3 0.1224 3 1 2 9 0.1234
2 2 2 4 0.1239 3 1 2 10 0.1246
2 2 2 5 0.1237 3 2 1 1 0.1207
2 2 2 6 0.1227 3 2 1 2 0.1279
2 2 2 7 0.1218 3 2 1 3 0.1268
2 2 2 8 0.122 3 2 1 4 0.1222
2 2 2 9 0.1231 3 2 1 5 0.1244
2 2 2 10 0.1244 3 2 1 6 0.1225
2 3 1 1 0.1219 3 2 1 7 0.1234
2 3 1 2 0.1243 3 2 1 8 0.1244
2 3 1 3 0.1231 3 2 1 9 0.1207
2 3 1 4 0.1223 3 2 1 10 0.1264
2 3 1 5 0.1218 3 2 2 1 0.1224
2 3 1 6 0.1218 3 2 2 2 0.1254
2 3 1 7 0.1225 3 2 2 3 0.1237
2 3 1 8 0.1238 3 2 2 4 0.1254
2 3 1 9 0.1244 3 2 2 5 0.1269
2 3 1 10 0.1236 3 2 2 6 0.1236
2 3 2 1 0.1231 3 2 2 7 0.1248
2 3 2 2 0.1223 3 2 2 8 0.1253
2 3 2 3 0.1241 3 2 2 9 0.1252
2 3 2 4 0.1215 3 2 2 10 0.1237
2 3 2 5 0.1221 3 3 1 1 0.1217
2 3 2 6 0.1236 3 3 1 2 0.122
2 3 2 7 0.1229 3 3 1 3 0.1227
2 3 2 8 0.1205 3 3 1 4 0.1202
2 3 2 9 0.1241 3 3 1 5 0.127
2 3 2 10 0.1232 3 3 1 6 0.1224
3 1 1 1 0.1255 3 3 1 7 0.1219
3 1 1 2 0.1215 3 3 1 8 0.1266
3 1 1 3 0.1219 3 3 1 9 0.1254
3 1 1 4 0.1253 3 3 1 10 0.1258
3 1 1 5 0.1232 3 3 2 1 0.1236
3 1 1 6 0.1266 3 3 2 2 0.1247
3 1 1 7 0.1271 3 3 2 3 0.124
3 1 1 8 0.1209 3 3 2 4 0.1235
3 1 1 9 0.1212 3 3 2 5 0.124
3 1 1 10 0.1249 3 3 2 6 0.1217


3 3 2 7 0.1235
3 3 2 8 0.1242
3 3 2 9 0.1247
3 3 2 10 0.125
3. Production Process Characterization
3.5. Case Studies
3.5.2. Machine Screw Case Study

3.5.2.2. Box Plots by Factors


Initial Steps The initial step is to plot box plots of the measured diameter for each of the explanatory variables.

Box Plot by The following is a box plot of the diameter by machine.


Machine

Conclusions We can make the following conclusions from this box plot.
From Box 1. The location appears to be significantly different for the three machines, with machine 2
Plot having the smallest median diameter and machine 1 having the largest median diameter.
2. Machines 1 and 2 have comparable variability while machine 3 has somewhat larger
variability.
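Illustrative Python Sketch
A minimal sketch of the grouped box plots used in this section, with assumed file and column names (the original analysis was done in Dataplot):

    # Box plots of the measured diameter by each explanatory variable.
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("machine_screw.csv")   # assumed columns: machine, day, time, sample, diameter

    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    for ax, factor in zip(axes.flat, ["machine", "day", "time", "sample"]):
        df.boxplot(column="diameter", by=factor, ax=ax)
        ax.set_title("diameter by " + factor)
    plt.tight_layout()
    plt.show()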


Box Plot by The following is a box plot of the diameter by day.


Day

Conclusions From Box Plot
We can draw the following conclusion from this box plot. Neither the location nor the spread seem to differ significantly by day.

Box Plot by Time of Day
The following is a box plot of the time of day.

Conclusion From Box Plot
We can draw the following conclusion from this box plot. Neither the location nor the spread seem to differ significantly by time of day.

Box Plot by Sample Number
The following is a box plot of the sample number.


3. Production Process Characterization


3.5. Case Studies
3.5.2. Machine Screw Case Study

3.5.2.3. Analysis of Variance


Analysis of We can confirm our interpretation of the box plots by running an
Variance analysis of variance. Dataplot generated the following analysis of
using All variance output when all four factors were included.
Factors

**********************************
**********************************
** 4-WAY ANALYSIS OF VARIANCE **
**********************************
**********************************

NUMBER OF OBSERVATIONS = 180


NUMBER OF FACTORS = 4
NUMBER OF LEVELS FOR FACTOR 1 = 3
Conclusion From Box Plot
We can draw the following conclusion from this box plot. Although there are some minor differences in location and spread between the samples, these differences do not show a noticeable pattern and do not seem significant.

NUMBER OF LEVELS FOR FACTOR 2 = 3
NUMBER OF LEVELS FOR FACTOR 3 = 2
NUMBER OF LEVELS FOR FACTOR 4 = 10
BALANCED CASE
RESIDUAL STANDARD DEVIATION = 0.13743976597E-02
RESIDUAL DEGREES OF FREEDOM = 165
NO REPLICATION CASE
NUMBER OF DISTINCT CELLS = 180

*****************
* ANOVA TABLE *
*****************

SOURCE DF SUM OF SQUARES MEAN SQUARE F STATISTIC F CDF SIG


-------------------------------------------------------------------------------
TOTAL (CORRECTED) 179 0.000437 0.000002
-------------------------------------------------------------------------------
FACTOR 1 2 0.000111 0.000055 29.3159 100.000% **
FACTOR 2 2 0.000004 0.000002 0.9884 62.565%
FACTOR 3 1 0.000002 0.000002 1.2478 73.441%
FACTOR 4 9 0.000009 0.000001 0.5205 14.172%
-------------------------------------------------------------------------------
RESIDUAL 165 0.000312 0.000002

RESIDUAL STANDARD DEVIATION = 0.00137439766


RESIDUAL DEGREES OF FREEDOM = 165


last column of the ANOVA table prints a "**" for statistically


**************** significant factors. Only factor 1 (the machine) is statistically
* ESTIMATION * significant. This confirms what the box plots in the previous section
**************** had indicated graphically.

GRAND MEAN = 0.12395893037E+00 Analysis of The previous analysis of variance indicated that only the machine
GRAND STANDARD DEVIATION = 0.15631503193E-02 Variance factor was statistically significant. The following shows the ANOVA
Using Only output using only the machine factor.
Machine
LEVEL-ID NI MEAN EFFECT SD(EFFECT)
--------------------------------------------------------------------
FACTOR 1-- 1.00000 60. 0.12489 0.00093 0.00014
-- 2.00000 60. 0.12297 -0.00099 0.00014
-- 3.00000 60. 0.12402 0.00006 0.00014
FACTOR 2-- 1.00000 60. 0.12409 0.00013 0.00014 **********************************
-- 2.00000 60. 0.12403 0.00007 0.00014 **********************************
-- 3.00000 60. 0.12376 -0.00020 0.00014 ** 1-WAY ANALYSIS OF VARIANCE **
FACTOR 3-- 1.00000 90. 0.12384 -0.00011 0.00010 **********************************
-- 2.00000 90. 0.12407 0.00011 0.00010 **********************************
FACTOR 4-- 1.00000 18. 0.12371 -0.00025 0.00031
-- 2.00000 18. 0.12405 0.00009 0.00031 NUMBER OF OBSERVATIONS = 180
-- 3.00000 18. 0.12398 0.00002 0.00031 NUMBER OF FACTORS = 1
-- 4.00000 18. 0.12382 -0.00014 0.00031 NUMBER OF LEVELS FOR FACTOR 1 = 3
-- 5.00000 18. 0.12426 0.00030 0.00031 BALANCED CASE
-- 6.00000 18. 0.12379 -0.00016 0.00031 RESIDUAL STANDARD DEVIATION = 0.13584237313E-02
-- 7.00000 18. 0.12406 0.00010 0.00031 RESIDUAL DEGREES OF FREEDOM = 177
-- 8.00000 18. 0.12376 -0.00020 0.00031 REPLICATION CASE
-- 9.00000 18. 0.12376 -0.00020 0.00031 REPLICATION STANDARD DEVIATION = 0.13584237313E-02
-- 10.00000 18. 0.12440 0.00044 0.00031 REPLICATION DEGREES OF FREEDOM = 177
NUMBER OF DISTINCT CELLS = 3

MODEL RESIDUAL STANDARD DEVIATION *****************


------------------------------------------------------- * ANOVA TABLE *
CONSTANT ONLY-- 0.0015631503 *****************
CONSTANT & FACTOR 1 ONLY-- 0.0013584237
CONSTANT & FACTOR 2 ONLY-- 0.0015652323 SOURCE DF SUM OF SQUARES MEAN SQUARE F STATISTIC F CDF SIG
CONSTANT & FACTOR 3 ONLY-- 0.0015633047 -------------------------------------------------------------------------------
CONSTANT & FACTOR 4 ONLY-- 0.0015876852 TOTAL (CORRECTED) 179 0.000437 0.000002
CONSTANT & ALL 4 FACTORS -- 0.0013743977 -------------------------------------------------------------------------------
FACTOR 1 2 0.000111 0.000055 30.0094 100.000% **
-------------------------------------------------------------------------------
RESIDUAL 177 0.000327 0.000002
RESIDUAL STANDARD DEVIATION = 0.00135842373
RESIDUAL DEGREES OF FREEDOM = 177
REPLICATION STANDARD DEVIATION = 0.00135842373
REPLICATION DEGREES OF FREEDOM = 177

****************
*  ESTIMATION  *
****************

GRAND MEAN = 0.12395893037E+00
GRAND STANDARD DEVIATION = 0.15631503193E-02

Interpretation of ANOVA Output
The first thing to note is that Dataplot fits an overall mean when performing the ANOVA. That is, it fits the model

    response = overall mean + factor effects + random error

as opposed to the model

    response = factor effects + random error

These models are mathematically equivalent. The effect estimates in the first model are relative to the overall mean. The effect estimates for the second model can be obtained by simply adding the overall mean to effect estimates from the first model.

We are primarily interested in identifying the significant factors. The


LEVEL-ID NI MEAN EFFECT SD(EFFECT)


--------------------------------------------------------------------
FACTOR 1-- 1.00000 60. 0.12489 0.00093 0.00014
-- 2.00000 60. 0.12297 -0.00099 0.00014
-- 3.00000 60. 0.12402 0.00006 0.00014

MODEL RESIDUAL STANDARD DEVIATION


-------------------------------------------------------
CONSTANT ONLY-- 0.0015631503
CONSTANT & FACTOR 1 ONLY-- 0.0013584237

Interpretation At this stage, we are interested in the effect estimates for the machine variable. These can be
of ANOVA summarized in the following table.
Output
Means for Oneway Anova
Level Number Mean Standard Error Lower 95% CI Upper 95% CI
1 60 0.124887 0.00018 0.12454 0.12523
2 60 0.122968 0.00018 0.12262 0.12331
3 60 0.124022 0.00018 0.12368 0.12437

The Dataplot macro file shows the computations required to go from the Dataplot ANOVA
output to the numbers in the above table.

Model As a final step, we validate the model by generating a 4-plot of the residuals. The 4-plot does not indicate any significant problems with the ANOVA model.
Validation
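Illustrative Python Sketch
A minimal sketch of the single-factor fit and its diagnostics using statsmodels instead of Dataplot; the data file and variable names are assumptions.

    # One-way ANOVA on the machine factor, level means, and residuals for validation plots.
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("machine_screw.csv")
    fit = smf.ols("diameter ~ C(machine)", data=df).fit()

    print(anova_lm(fit))                              # ANOVA table for the machine factor
    print(df.groupby("machine")["diameter"].mean())   # level means (~0.12489, 0.12297, 0.12402)

    residuals = fit.resid                             # inputs to run-order, lag, histogram,
                                                      # and normal probability plots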


3. Production Process Characterization
3.5. Case Studies
3.5.2. Machine Screw Case Study

3.5.2.4. Throughput

Summary of Throughput
The throughput is summarized in the following table (this was part of the original data collection, not the result of analysis).

    Machine   Day 1   Day 2   Day 3
    1         576     604     583
    2         657     604     586
    3         510     546     571

This table shows that machine 3 had significantly lower throughput.

Analysis of Variance for Throughput
We can confirm the statistical significance of the lower throughput of machine 3 by running an analysis of variance.

**********************************
**********************************
**  1-WAY ANALYSIS OF VARIANCE  **
**********************************
**********************************

NUMBER OF OBSERVATIONS = 9
NUMBER OF FACTORS = 1
NUMBER OF LEVELS FOR FACTOR 1 = 3
BALANCED CASE
RESIDUAL STANDARD DEVIATION = 0.28953985214E+02
RESIDUAL DEGREES OF FREEDOM = 6
REPLICATION CASE
Graphical We can show the throughput graphically.
REPLICATION STANDARD DEVIATION = 0.28953985214E+02
Representation
REPLICATION DEGREES OF FREEDOM = 6
of Throughput
NUMBER OF DISTINCT CELLS = 3

*****************
* ANOVA TABLE *
*****************

SOURCE DF SUM OF SQUARES MEAN SQUARE F STATISTIC F CDF SIG


-------------------------------------------------------------------------------
TOTAL (CORRECTED) 8 13246.888672 1655.861084
-------------------------------------------------------------------------------
FACTOR 1 2 8216.898438 4108.449219 4.9007 94.525%
-------------------------------------------------------------------------------
RESIDUAL 6 5030.000000 838.333313

RESIDUAL STANDARD DEVIATION = 28.95398521423


RESIDUAL DEGREES OF FREEDOM = 6
REPLICATION STANDARD DEVIATION = 28.95398521423
REPLICATION DEGREES OF FREEDOM = 6

****************
* ESTIMATION *
****************

GRAND MEAN = 0.58188891602E+03


GRAND STANDARD DEVIATION = 0.40692272186E+02

The graph clearly shows the lower throughput for machine 3.


LEVEL-ID NI MEAN EFFECT SD(EFFECT)
--------------------------------------------------------------------
FACTOR 1-- 1.00000 3. 587.66669 5.77777 13.64904
-- 2.00000 3. 615.66669 33.77777 13.64904


-- 3.00000 3. 542.33331 -39.55560 13.64904

MODEL RESIDUAL STANDARD DEVIATION


-------------------------------------------------------
CONSTANT ONLY-- 40.6922721863
CONSTANT & FACTOR 1 ONLY-- 28.9539852142 3. Production Process Characterization
3.5. Case Studies
3.5.2. Machine Screw Case Study
Interpretation of ANOVA Output
We summarize the effect estimates in the following table.

    Means for Oneway Anova
    Level   Number   Mean      Standard Error   Lower 95% CI   Upper 95% CI
    1       3        587.667   16.717           546.76         628.57
    2       3        615.667   16.717           574.76         656.57
    3       3        542.33    16.717           501.43         583.24

The Dataplot macro file shows the computations required to go from the Dataplot ANOVA output to the numbers in the above table.

3.5.2.5. Final Conclusions

Final Conclusions
The analysis shows that machines 1 and 2 had about the same variability but significantly different locations. The throughput for machine 2 was also higher with greater variability than for machine 1. An interview with the operator revealed that he realized the second machine was not set correctly. However, he did not want to change the settings because he knew a study was being conducted and was afraid he might impact the results by making changes. Machine 3 had significantly more variation and lower throughput. The operator indicated that the machine had to be taken down several times for minor repairs. Given the preceding analysis results, the team recommended replacing machine 3.


2. Box Plots by Factor Variables

1. Generate a box plot by machine. 1. The box plot shows significant


3. Production Process Characterization
3.5. Case Studies variation for both location and
3.5.2. Machine Screw Case Study spread.

3.5.2.6. Work This Example Yourself 2. Generate a box plot by day. 2. The box plot shows no significant
location or spread effects for
View This page allows you to repeat the analysis outlined in the case study day.
Dataplot description on the previous page using Dataplot, if you have
Macro for downloaded and installed it. Output from each analysis step below will
this Case be displayed in one or more of the Dataplot windows. The four main 3. Generate a box plot by time of
3. The box plot shows no significant
Study windows are the Output window, the Graphics window, the Command day.
location or spread effects for
History window and the Data Sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the time of day.
bottom is a command entry window where commands can be typed in.
4. Generate a box plot by
4. The box plot shows no significant
sample.
Data Analysis Steps Results and Conclusions location or spread effects for
sample.

Click on the links below to start Dataplot and run this


case study yourself. Each step may use results from The links in this column will connect you with more
previous steps, so please be patient. Wait until the detailed information about each analysis step from the 3. Analysis of Variance
software verifies that the current step is complete case study description.
before clicking on the next step. 1. Perform an analysis of variance 1. The analysis of variance shows
with all factors. that only the machine factor
is statistically significant.

1. Get set up and started.


2. Perform an analysis of variance 2. The analysis of variance shows
1. Read in the data. 1. You have read 5 columns of numbers with only the machine factor. the overall mean and the
into Dataplot, variables machine,
effect estimates for the levels
day, time, sample, and diameter.
of the machine variable.

3. Perform model validation by 3. The 4-plot of the residuals does


generating a 4-plot of the not indicate any significant
residuals. problems with the model.


4. Graph of Throughput

1. Generate a graph of the throughput.              1. The graph shows the throughput for machine 3
                                                       is lower than the other machines.

2. Perform an analysis of variance of               2. The effect estimates from the ANOVA are given.
   the throughput.

3. Production Process Characterization

3.6. References
Box, G.E.P., Hunter, W.G., and Hunter, J.S. (1978), Statistics for Experimenters, John
Wiley and Sons, New York.
Cleveland, W.S. (1993), Visualizing Data, Hobart Press, New Jersey.
Hoaglin, D.C., Mosteller, F., and Tukey, J.W. (1985), Exploring Data Tables, Trends,
and Shapes, John Wiley and Sons, New York.
Hoaglin, D.C., Mosteller, F., and Tukey, J.W. (1991), Fundamentals of Exploratory
Analysis of Variance, John Wiley and Sons, New York.


4. Process Modeling
The goal for this chapter is to present the background and specific analysis techniques
needed to construct a statistical model that describes a particular scientific or
engineering process. The types of models discussed in this chapter are limited to those
based on an explicit mathematical function. These types of models can be used for
prediction of process outputs, for calibration, or for process optimization.

1. Introduction 2. Assumptions
1. Definition 1. Assumptions
2. Terminology
3. Uses
4. Methods

3. Design 4. Analysis
1. Definition 1. Modeling Steps
2. Importance 2. Model Selection
3. Design Principles 3. Model Fitting
4. Optimal Designs 4. Model Validation
5. Assessment 5. Model Improvement

5. Interpretation & Use 6. Case Studies


1. Prediction 1. Load Cell Output
2. Calibration 2. Alaska Pipeline
3. Optimization 3. Ultrasonic Reference Block
4. Thermal Expansion of Copper

Detailed Table of Contents: Process Modeling

References: Process Modeling

Appendix: Some Useful Functions for Process Modeling


6. The explanatory variables are observed without error. [4.2.1.6.]

3. Data Collection for Process Modeling [4.3.]


1. What is design of experiments (aka DEX or DOE)? [4.3.1.]
2. Why is experimental design important for process modeling? [4.3.2.]
3. What are some general design principles for process modeling? [4.3.3.]
4. Process Modeling - Detailed Table of 4. I've heard some people refer to "optimal" designs, shouldn't I use those? [4.3.4.]
Contents [4.] 5. How can I tell if a particular experimental design is good for my
application? [4.3.5.]
The goal for this chapter is to present the background and specific analysis techniques needed to
construct a statistical model that describes a particular scientific or engineering process. The types 4. Data Analysis for Process Modeling [4.4.]
of models discussed in this chapter are limited to those based on an explicit mathematical 1. What are the basic steps for developing an effective process model? [4.4.1.]
function. These types of models can be used for prediction of process outputs, for calibration, or
2. How do I select a function to describe my process? [4.4.2.]
for process optimization.
1. Introduction to Process Modeling [4.1.] 1. Incorporating Scientific Knowledge into Function Selection [4.4.2.1.]
1. What is process modeling? [4.1.1.] 2. Using the Data to Select an Appropriate Function [4.4.2.2.]
2. What terminology do statisticians use to describe process models? [4.1.2.] 3. Using Methods that Do Not Require Function Specification [4.4.2.3.]
3. What are process models used for? [4.1.3.] 3. How are estimates of the unknown parameters obtained? [4.4.3.]
1. Estimation [4.1.3.1.] 1. Least Squares [4.4.3.1.]
2. Prediction [4.1.3.2.] 2. Weighted Least Squares [4.4.3.2.]
3. Calibration [4.1.3.3.] 4. How can I tell if a model fits my data? [4.4.4.]
4. Optimization [4.1.3.4.] 1. How can I assess the sufficiency of the functional part of the model? [4.4.4.1.]
4. What are some of the different statistical methods for model building? [4.1.4.] 2. How can I detect non-constant variation across the data? [4.4.4.2.]
1. Linear Least Squares Regression [4.1.4.1.] 3. How can I tell if there was drift in the measurement process? [4.4.4.3.]
2. Nonlinear Least Squares Regression [4.1.4.2.] 4. How can I assess whether the random errors are independent from one to the
next? [4.4.4.4.]
3. Weighted Least Squares Regression [4.1.4.3.]
5. How can I test whether or not the random errors are distributed
4. LOESS (aka LOWESS) [4.1.4.4.]
normally? [4.4.4.5.]
6. How can I test whether any significant terms are missing or misspecified in the
2. Underlying Assumptions for Process Modeling [4.2.]
functional part of the model? [4.4.4.6.]
1. What are the typical underlying assumptions in process modeling? [4.2.1.]
7. How can I test whether all of the terms in the functional part of the model are
1. The process is a statistical process. [4.2.1.1.] necessary? [4.4.4.7.]
2. The means of the random errors are zero. [4.2.1.2.] 5. If my current model does not fit the data well, how can I improve it? [4.4.5.]
3. The random errors have a constant standard deviation. [4.2.1.3.] 1. Updating the Function Based on Residual Plots [4.4.5.1.]
4. The random errors follow a normal distribution. [4.2.1.4.] 2. Accounting for Non-Constant Variation Across the Data [4.4.5.2.]
5. The data are randomly sampled from the process. [4.2.1.5.] 3. Accounting for Errors with a Non-Normal Distribution [4.4.5.3.]


2. Initial Non-Linear Fit [4.6.3.2.]


5. Use and Interpretation of Process Models [4.5.] 3. Transformations to Improve Fit [4.6.3.3.]
1. What types of predictions can I make using the model? [4.5.1.] 4. Weighting to Improve Fit [4.6.3.4.]
1. How do I estimate the average response for a particular set of predictor 5. Compare the Fits [4.6.3.5.]
variable values? [4.5.1.1.] 6. Work This Example Yourself [4.6.3.6.]
2. How can I predict the value and and estimate the uncertainty of a single 4. Thermal Expansion of Copper Case Study [4.6.4.]
response? [4.5.1.2.]
1. Background and Data [4.6.4.1.]
2. How can I use my process model for calibration? [4.5.2.]
2. Rational Function Models [4.6.4.2.]
1. Single-Use Calibration Intervals [4.5.2.1.]
3. Initial Plot of Data [4.6.4.3.]
3. How can I optimize my process using the process model? [4.5.3.]
4. Quadratic/Quadratic Rational Function Model [4.6.4.4.]

6. Case Studies in Process Modeling [4.6.] 5. Cubic/Cubic Rational Function Model [4.6.4.5.]
1. Load Cell Calibration [4.6.1.] 6. Work This Example Yourself [4.6.4.6.]

1. Background & Data [4.6.1.1.]


7. References For Chapter 4: Process Modeling [4.7.]
2. Selection of Initial Model [4.6.1.2.]
3. Model Fitting - Initial Model [4.6.1.3.] 8. Some Useful Functions for Process Modeling [4.8.]
4. Graphical Residual Analysis - Initial Model [4.6.1.4.] 1. Univariate Functions [4.8.1.]
5. Interpretation of Numerical Output - Initial Model [4.6.1.5.] 1. Polynomial Functions [4.8.1.1.]
6. Model Refinement [4.6.1.6.] 1. Straight Line [4.8.1.1.1.]
7. Model Fitting - Model #2 [4.6.1.7.] 2. Quadratic Polynomial [4.8.1.1.2.]
8. Graphical Residual Analysis - Model #2 [4.6.1.8.] 3. Cubic Polynomial [4.8.1.1.3.]
9. Interpretation of Numerical Output - Model #2 [4.6.1.9.] 2. Rational Functions [4.8.1.2.]
10. Use of the Model for Calibration [4.6.1.10.] 1. Constant / Linear Rational Function [4.8.1.2.1.]
11. Work This Example Yourself [4.6.1.11.] 2. Linear / Linear Rational Function [4.8.1.2.2.]
2. Alaska Pipeline [4.6.2.] 3. Linear / Quadratic Rational Function [4.8.1.2.3.]
1. Background and Data [4.6.2.1.] 4. Quadratic / Linear Rational Function [4.8.1.2.4.]
2. Check for Batch Effect [4.6.2.2.] 5. Quadratic / Quadratic Rational Function [4.8.1.2.5.]
3. Initial Linear Fit [4.6.2.3.] 6. Cubic / Linear Rational Function [4.8.1.2.6.]
4. Transformations to Improve Fit and Equalize Variances [4.6.2.4.] 7. Cubic / Quadratic Rational Function [4.8.1.2.7.]
5. Weighting to Improve Fit [4.6.2.5.] 8. Linear / Cubic Rational Function [4.8.1.2.8.]
6. Compare the Fits [4.6.2.6.] 9. Quadratic / Cubic Rational Function [4.8.1.2.9.]
7. Work This Example Yourself [4.6.2.7.] 10. Cubic / Cubic Rational Function [4.8.1.2.10.]
3. Ultrasonic Reference Block Study [4.6.3.] 11. Determining m and n for Rational Function Models [4.8.1.2.11.]
1. Background and Data [4.6.3.1.]


4. Process Modeling

4.1. Introduction to Process Modeling


Overview of The goal for this section is to give the big picture of function-based
Section 4.1 process modeling. This includes a discussion of what process modeling
is, the goals of process modeling, and a comparison of the different
statistical methods used for model building. Detailed information on
how to collect data, construct appropriate models, interpret output, and
use process models is covered in the following sections. The final
section of the chapter contains case studies that illustrate the general
information presented in the first five sections using data from a variety
of scientific and engineering applications.

Contents of 1. What is process modeling?


Section 4.1 2. What terminology do statisticians use to describe process models?
3. What are process models used for?
1. Estimation
2. Prediction
3. Calibration
4. Optimization
4. What are some of the statistical methods for model building?
1. Linear Least Squares Regression
2. Nonlinear Least Squares Regression
3. Weighted Least Squares Regression
4. LOESS (aka LOWESS)



4. Process Modeling
4.1. Introduction to Process Modeling

4.1.1. What is process modeling?


Basic Process modeling is the concise description of the total variation in one quantity, y, by
Definition partitioning it into
1. a deterministic component given by a mathematical function of one or more other
quantities, x1, x2, ..., plus

2. a random component that follows a particular probability distribution.

Example For example, the total variation of the measured pressure of a fixed amount of a gas in a tank can
be described by partitioning the variability into its deterministic part, which is a function of the
temperature of the gas, plus some left-over random error. Charles' Law states that the pressure of
a gas is proportional to its temperature under the conditions described here, and in this case most
of the variation will be deterministic. However, due to measurement error in the pressure gauge,
the relationship will not be purely deterministic. The random errors cannot be characterized
individually, but will follow some probability distribution that will describe the relative
frequencies of occurrence of different-sized errors.
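Illustrative Python Sketch
A small simulation of the partition just described, with made-up numbers: a deterministic straight-line dependence of pressure on temperature plus random measurement error.

    # All values here are invented for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)
    temperature = np.linspace(20, 60, 40)
    deterministic = 10.0 + 4.0 * temperature                       # deterministic component
    measured = deterministic + rng.normal(0, 5, temperature.size)  # plus random error

    slope, intercept = np.polyfit(temperature, measured, 1)        # recover the straight line
    residuals = measured - (intercept + slope * temperature)       # left-over random variation
    print(slope, intercept, residuals.std())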

Graphical Using the example above, the definition of process modeling can be graphically depicted like The top left plot in the figure shows pressure data that vary deterministically with temperature
Interpretation this: except for a small amount of random error. The relationship between pressure and temperature is
a straight line, but not a perfect straight line. The top row plots on the right-hand side of the
equals sign show a partitioning of the data into a perfect straight line and the remaining
"unexplained" random variation in the data (note the different vertical scales of these plots). The
plots in the middle row of the figure show the deterministic structure in the data again and a
histogram of the random variation. The histogram shows the relative frequencies of observing
different-sized random errors. The bottom row of the figure shows how the relative frequencies of
the random errors can be summarized by a (normal) probability distribution.

An Example Of course, the straight-line example is one of the simplest functions used for process modeling.
from a More Another example is shown below. The concept is identical to the straight-line example, but the
Complex structure in the data is more complex. The variation in y is partitioned into a deterministic part,
Process which is a function of another variable, x, plus some left-over random variation. (Again note the
difference in the vertical axis scales of the two plots in the top right of the figure.) A probability
distribution describes the leftover random variation.


An Example The examples of process modeling shown above have only one explanatory variable but the
with Multiple concept easily extends to cases with more than one explanatory variable. The three-dimensional
Explanatory perspective plots below show an example with two explanatory variables. Examples with three or
Variables more explanatory variables are exactly analogous, but are difficult to show graphically.


4. Process Modeling
4.1. Introduction to Process Modeling

4.1.2. What terminology do statisticians use to describe process models?

Model Components
There are three main parts to every process model. These are
1. the response variable, usually denoted by y,
2. the mathematical function, usually denoted as f(x; β), and
3. the random errors, usually denoted by ε.

Form of Model
The general form of the model is

    y = f(x; β) + ε.

All process models discussed in this chapter have this general form. As alluded to earlier, the random errors that are included in the model make the relationship between the response variable and the predictor variables a "statistical" one, rather than a perfect deterministic one. This is because the functional relationship between the response and predictors holds only on average, not for each data point.

Some of the details about the different parts of the model are discussed below, along with alternate terminology for the different components of the model.

Response Variable
The response variable, y, is a quantity that varies in a way that we hope to be able to summarize and exploit via the modeling process. Generally it is known that the variation of the response variable is systematically related to the values of one or more other variables before the modeling process is begun, although testing the existence and nature of this dependence is part of the modeling process itself.

Mathematical Function
The mathematical function consists of two parts. These parts are the predictor variables, x1, x2, ..., and the parameters, β0, β1, .... The predictor variables are observed along with the response variable. They are the quantities described on the previous page as inputs to the mathematical function, f(x; β). The collection of all of the predictor variables is denoted by x for short.

The parameters are the quantities that will be estimated during the modeling process. Their true values are unknown and unknowable, except in simulation experiments. As for the predictor variables, the collection of all of the parameters is denoted by β for short.

The parameters and predictor variables are combined in different forms to give the function used to describe the deterministic variation in the response variable. For a straight line with an unknown intercept and slope, for example, there are two parameters and one predictor variable

    f(x; β) = β0 + β1*x.

For a straight line with a known slope of one, but an unknown intercept, there would only be one parameter

    f(x; β) = β0 + x.

For a quadratic surface with two predictor variables, there are six parameters for the full model.

    f(x; β) = β0 + β1*x1 + β2*x2 + β11*x1^2 + β12*x1*x2 + β22*x2^2.
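Illustrative Python Sketch
A minimal sketch of the notation above, writing the three example regression functions as ordinary Python functions:

    # General form: y = f(x; beta) + epsilon
    import random

    def straight_line(x, b0, b1):
        return b0 + b1 * x                      # two parameters, one predictor variable

    def line_with_known_slope_one(x, b0):
        return b0 + x                           # one parameter

    def quadratic_surface(x1, x2, b0, b1, b2, b11, b12, b22):
        # six parameters, two predictor variables
        return b0 + b1*x1 + b2*x2 + b11*x1**2 + b12*x1*x2 + b22*x2**2

    # one simulated observation under the general form (illustrative numbers only)
    y = straight_line(5.0, b0=2.0, b1=0.5) + random.gauss(0.0, 0.1)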


Random Error
Like the parameters in the mathematical function, the random errors are unknown. They are simply the difference between the data and the mathematical function. They are assumed to follow a particular probability distribution, however, which is used to describe their aggregate behavior. The probability distribution that describes the errors has a mean of zero and an unknown standard deviation, denoted by σ, that is another parameter in the model, like the β's.

Alternate Terminology
Unfortunately, there are no completely standardized names for the parts of the model discussed above. Other publications or software may use different terminology. For example, another common name for the response variable is "dependent variable". The response variable is also simply called "the response" for short. Other names for the predictor variables include "explanatory variables", "independent variables", "predictors" and "regressors". The mathematical function used to describe the deterministic variation in the response variable is sometimes called the "regression function", the "regression equation", the "smoothing function", or the "smooth".

Scope of "Model"
In its correct usage, the term "model" refers to the equation above and also includes the underlying assumptions made about the probability distribution used to describe the variation of the random errors. Often, however, people will also use the term "model" when referring specifically to the mathematical function describing the deterministic variation in the data. Since the function is part of the model, the more limited usage is not wrong, but it is important to remember that the term "model" might refer to more than just the mathematical function.

4. Process Modeling
4.1. Introduction to Process Modeling

4.1.3. What are process models used for?

Four Main Purposes
Process models are used for four main purposes:
1. estimation,
2. prediction,
3. calibration, and
4. optimization.
The rest of this page lists brief explanations of the different uses of process models. More detailed explanations of the uses for process models are given in the subsections of this section listed at the bottom of this page.

Estimation
The goal of estimation is to determine the value of the regression function (i.e., the average value of the response variable), for a particular combination of the values of the predictor variables. Regression function values can be estimated for any combination of predictor variable values, including values for which no data have been measured or observed. Function values estimated for points within the observed space of predictor variable values are sometimes called interpolations. Estimation of regression function values for points outside the observed space of predictor variable values, called extrapolations, are sometimes necessary, but require caution.

Prediction The goal of prediction is to determine either


1. the value of a new observation of the response variable, or
2. the values of a specified proportion of all future observations of
the response variable
for a particular combination of the values of the predictor variables.
Predictions can be made for any combination of predictor variable
values, including values for which no data have been measured or
observed. As in the case of estimation, predictions made outside the
observed space of predictor variable values are sometimes necessary,
but require caution.


Calibration: The goal of calibration is to quantitatively relate measurements made using one measurement system to those of another measurement system. This is done so that measurements can be compared in common units or to tie results from a relative measurement method to absolute units.

Optimization: Optimization is performed to determine the values of process inputs that should be used to obtain the desired process output. Typical optimization goals might be to maximize the yield of a process, to minimize the processing time required to fabricate a product, or to hit a target product specification with minimum variation in order to maintain specified tolerances.

Further Details:
1. Estimation
2. Prediction
3. Calibration
4. Optimization

4.1.3.1. Estimation

More on Estimation: As mentioned on the preceding page, the primary goal of estimation is to determine the value of the regression function that is associated with a specific combination of predictor variable values. The estimated values are computed by plugging the value(s) of the predictor variable(s) into the regression equation, after estimating the unknown parameters from the data. This process is illustrated below using the Pressure/Temperature example from a few pages earlier.

Example: Suppose in this case the predictor variable value of interest is a temperature of 47 degrees. Computing the estimated value of the regression function using the fitted equation yields an estimated average pressure of 192.4655.
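The estimation step can be sketched in a few lines of Python. This is a hedged illustration rather than the handbook's own computation: the temperature and pressure values below are hypothetical stand-ins for the example data, so the fitted coefficients and the estimate at 47 degrees will not reproduce 192.4655 exactly.

# Hedged sketch: estimating the regression function at T = 47 for a
# straight-line pressure/temperature model. The data are assumed values,
# not the handbook's data set.
import numpy as np

temperature = np.array([20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0])          # assumed
pressure    = np.array([86.0, 105.0, 124.7, 144.2, 164.1, 183.6, 203.9, 223.2])   # assumed

# Fit pressure = b0 + b1 * temperature by ordinary least squares.
b1, b0 = np.polyfit(temperature, pressure, deg=1)

# Plug the predictor value of interest into the estimated regression equation.
t_new = 47.0
estimated_pressure = b0 + b1 * t_new
print(f"estimated average pressure at T={t_new}: {estimated_pressure:.4f}")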
Of course, if the pressure/temperature experiment were repeated, the estimates of the parameters of the regression function obtained from the data would differ slightly each time because of the randomness in the data and the need to sample a limited amount of data. Different parameter estimates would, in turn, yield different estimated values. The plot below illustrates the type of slight variation that could occur in a repeated experiment.

[Plot: Estimated Value from a Repeated Experiment]

Uncertainty of the Estimated Value: A critical part of estimation is an assessment of how much an estimated value will fluctuate due to the noise in the data. Without that information there is no basis for comparing an estimated value to a target value or to another estimate. Any method used for estimation should include an assessment of the uncertainty in the estimated value(s). Fortunately it is often the case that the data used to fit the model to a process can also be used to compute the uncertainty of estimated values obtained from the model. In the pressure/temperature example a confidence interval for the value of the regression function at 47 degrees can be computed from the data used to fit the model. The plot below shows a 99% confidence interval produced using the original data. This interval gives the range of plausible values for the average pressure for a temperature of 47 degrees based on the parameter estimates and the noise in the data.

[Plot: 99% Confidence Interval for Pressure at T=47]
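A minimal sketch of such an interval, assuming a straight-line model and the standard simple-linear-regression formula for the standard error of the fitted mean; the data are again hypothetical, so the numerical interval will differ from the one plotted in the handbook.

# Hedged sketch: 99% confidence interval for the mean response (regression
# function value) at T = 47, using the usual simple-linear-regression formula.
import numpy as np
from scipy import stats

temperature = np.array([20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0])          # assumed
pressure    = np.array([86.0, 105.0, 124.7, 144.2, 164.1, 183.6, 203.9, 223.2])   # assumed
n = len(temperature)

b1, b0 = np.polyfit(temperature, pressure, deg=1)
residuals = pressure - (b0 + b1 * temperature)
s = np.sqrt(np.sum(residuals**2) / (n - 2))               # residual standard deviation

t_new = 47.0
x_bar = temperature.mean()
sxx = np.sum((temperature - x_bar)**2)
se_mean = s * np.sqrt(1.0/n + (t_new - x_bar)**2 / sxx)   # std. error of the fitted mean

t_crit = stats.t.ppf(0.995, df=n - 2)                     # two-sided 99% interval
fit = b0 + b1 * t_new
print(f"99% CI for average pressure at T={t_new}: "
      f"{fit - t_crit*se_mean:.3f} to {fit + t_crit*se_mean:.3f}")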
Length of Confidence Intervals: Because the confidence interval is an interval for the value of the regression function, the uncertainty only includes the noise that is inherent in the estimates of the regression parameters. The uncertainty in the estimated value can be less than the uncertainty of a single measurement from the process because the data used to estimate the unknown parameters are essentially averaged (in a way that depends on the statistical method being used) to determine each parameter estimate. This "averaging" of the data tends to cancel out errors inherent in each individual observed data point. The noise in this type of result is generally less than the noise in the prediction of one or more future measurements, which must account for both the uncertainty in the estimated parameters and the uncertainty of the new measurement.

More Info: For more information on the interpretation and computation of confidence intervals, see Section 5.1.

4.1.3.2. Prediction

More on Prediction: As mentioned earlier, the goal of prediction is to determine future value(s) of the response variable that are associated with a specific combination of predictor variable values. As in estimation, the predicted values are computed by plugging the value(s) of the predictor variable(s) into the regression equation, after estimating the unknown parameters from the data. The difference between estimation and prediction arises only in the computation of the uncertainties. These differences are illustrated below using the Pressure/Temperature example in parallel with the example illustrating estimation.

Example: Suppose in this case the predictor variable value of interest is a temperature of 47 degrees. Computing the predicted value using the fitted equation yields a predicted pressure of 192.4655.
Of course, if the pressure/temperature experiment were repeated, the estimates of the parameters of the regression function obtained from the data would differ slightly each time because of the randomness in the data and the need to sample a limited amount of data. Different parameter estimates would, in turn, yield different predicted values. The plot below illustrates the type of slight variation that could occur in a repeated experiment.

[Plot: Predicted Value from a Repeated Experiment]

Prediction Uncertainty: A critical part of prediction is an assessment of how much a predicted value will fluctuate due to the noise in the data. Without that information there is no basis for comparing a predicted value to a target value or to another prediction. As a result, any method used for prediction should include an assessment of the uncertainty in the predicted value(s). Fortunately it is often the case that the data used to fit the model to a process can also be used to compute the uncertainty of predictions from the model. In the pressure/temperature example a prediction interval for a new pressure measurement at 47 degrees can be computed from the data used to fit the model. The plot below shows a 99% prediction interval produced using the original data. This interval gives the range of plausible values for a single future pressure measurement observed at a temperature of 47 degrees based on the parameter estimates and the noise in the data.

[Plot: 99% Prediction Interval for Pressure at T=47]
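A hedged sketch of the corresponding prediction interval calculation, using the usual simple-linear-regression formula; the extra "1 +" term in the standard error accounts for the noise in the new measurement itself. The data are hypothetical.

# Hedged sketch: 99% prediction interval for a single new pressure measurement
# at T = 47 under a straight-line model fit to assumed data.
import numpy as np
from scipy import stats

temperature = np.array([20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 55.0])          # assumed
pressure    = np.array([86.0, 105.0, 124.7, 144.2, 164.1, 183.6, 203.9, 223.2])   # assumed
n = len(temperature)

b1, b0 = np.polyfit(temperature, pressure, deg=1)
s = np.sqrt(np.sum((pressure - (b0 + b1*temperature))**2) / (n - 2))

t_new = 47.0
x_bar = temperature.mean()
sxx = np.sum((temperature - x_bar)**2)
se_pred = s * np.sqrt(1.0 + 1.0/n + (t_new - x_bar)**2 / sxx)   # includes new-measurement noise

t_crit = stats.t.ppf(0.995, df=n - 2)
fit = b0 + b1 * t_new
print(f"99% prediction interval at T={t_new}: "
      f"{fit - t_crit*se_pred:.3f} to {fit + t_crit*se_pred:.3f}")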
Length of Prediction Intervals: Because the prediction interval is an interval for the value of a single new measurement from the process, the uncertainty includes the noise that is inherent in the estimates of the regression parameters and the uncertainty of the new measurement. This means that the interval for a new measurement will be wider than the confidence interval for the value of the regression function. These intervals are called prediction intervals rather than confidence intervals because the latter are for parameters, and a new measurement is a random variable, not a parameter.

Tolerance Intervals: Like a prediction interval, a tolerance interval brackets the plausible values of new measurements from the process being modeled. However, instead of bracketing the value of a single measurement or a fixed number of measurements, a tolerance interval brackets a specified percentage of all future measurements for a given set of predictor variable values. For example, to monitor future pressure measurements at 47 degrees for extreme values, either low or high, a tolerance interval that brackets 98% of all future measurements with high confidence could be used. If a future value then fell outside of the interval, the system would be checked to ensure that everything was working correctly. A 99% tolerance interval that captures 98% of all future pressure measurements at a temperature of 47 degrees is 192.4655 +/- 14.5810. This interval is wider than the prediction interval for a single measurement because it is designed to capture a larger proportion of all future measurements. The explanation of tolerance intervals is potentially confusing because there are two percentages used in the description of the interval. One, in this case 99%, describes how confident we are that the interval will capture the quantity that we want it to capture. The other, 98%, describes what the target quantity is, which in this case is 98% of all future measurements at T=47 degrees.

More Info: For more information on the interpretation and computation of prediction and tolerance intervals, see Section 5.1.
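To make the two percentages concrete, the following hedged sketch computes a normal-theory tolerance interval for a plain sample of repeat measurements using Howe's approximation. It is a simplified stand-in: the interval quoted above comes from the regression model, while this sketch treats repeat measurements at a single temperature as a simple normal sample, and the measurement values are assumed.

# Hedged sketch: the two percentages in a tolerance interval, for a plain
# normal sample. Howe's approximation gives a factor k so that xbar +/- k*s
# contains at least a proportion p of the population with confidence gamma.
import numpy as np
from scipy import stats

measurements = np.array([192.1, 194.0, 190.8, 193.5, 191.7,
                         192.9, 195.2, 189.9, 192.4, 193.1])   # assumed repeats at T = 47
n = len(measurements)
p, gamma = 0.98, 0.99          # coverage proportion and confidence level

z = stats.norm.ppf((1 + p) / 2)
chi2 = stats.chi2.ppf(1 - gamma, df=n - 1)          # lower-tail chi-square quantile
k = z * np.sqrt((n - 1) * (1 + 1.0/n) / chi2)       # Howe's tolerance factor

xbar, s = measurements.mean(), measurements.std(ddof=1)
print(f"{gamma:.0%}/{p:.0%} tolerance interval: {xbar - k*s:.2f} to {xbar + k*s:.2f}")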
4.1.3.3. Calibration

More on Calibration: As mentioned in the page introducing the different uses of process models, the goal of calibration is to quantitatively convert measurements made on one of two measurement scales to the other measurement scale. The two scales are generally not of equal importance, so the conversion occurs in only one direction. The primary measurement scale is usually the scientifically relevant scale, and measurements made directly on this scale are often more precise (relatively) than measurements made on the secondary scale. A process model describing the relationship between the two measurement scales provides the means for conversion. A process model that is constructed primarily for the purpose of calibration is often referred to as a "calibration curve". A graphical depiction of the calibration process is shown in the plot below, using the example described next.

Example: Thermocouples are a common type of temperature measurement device that is often more practical than a thermometer for temperature assessment. Thermocouples measure temperature in terms of voltage, however, rather than directly on a temperature scale. In addition, the response of a particular thermocouple depends on the exact formulation of the metals used to construct it, meaning two thermocouples will respond somewhat differently under identical measurement conditions. As a result, thermocouples need to be calibrated to produce interpretable measurement information. The calibration curve for a thermocouple is often constructed by comparing thermocouple output to relatively precise thermometer data. Then, when a new temperature is measured with the thermocouple, the voltage is converted to temperature terms by plugging the observed voltage into the regression equation and solving for temperature.

The plot below shows a calibration curve for a thermocouple fit with a locally quadratic model using a method called LOESS. Traditionally, complicated, high-degree polynomial models have been used for thermocouple calibration, but locally linear or quadratic models offer better computational stability and more flexibility. With the locally quadratic model the solution of the regression equation for temperature is done numerically rather than analytically, but the concept of calibration is identical regardless of which type of model is used. It is important to note that the thermocouple measurements, made on the secondary measurement scale, are treated as the response variable and the more precise thermometer results, on the primary scale, are treated as the predictor variable because this best satisfies the underlying assumptions of the analysis.

[Plot: Thermocouple Calibration]

Just as in estimation or prediction, if the calibration experiment were repeated, the results would vary slightly due to the randomness in the data and the need to sample a limited amount of data from the process. This means that an uncertainty statement that quantifies how much the results of a particular calibration could vary due to randomness is necessary. The plot below shows what would happen if the thermocouple calibration were repeated under conditions identical to the first experiment.

[Plot: Calibration Result from Repeated Experiment]
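The voltage-to-temperature conversion described in the example can be sketched as follows. This is a hedged illustration: a global quadratic fit stands in for the handbook's LOESS calibration curve, the data are simulated, and a numerical root finder simply solves the fitted equation for temperature at an observed voltage.

# Hedged sketch of the calibration step: fit a curve relating voltage (response,
# secondary scale) to temperature (predictor, primary scale), then solve the
# fitted equation numerically for temperature when a new voltage is observed.
import numpy as np
from scipy.optimize import brentq

temperature = np.linspace(0.0, 500.0, 26)                        # assumed thermometer values
voltage = (0.02 * temperature + 2.0e-5 * temperature**2
           + np.random.default_rng(0).normal(0, 0.05, temperature.size))  # assumed readings

coeffs = np.polyfit(temperature, voltage, deg=2)                  # calibration curve
curve = np.poly1d(coeffs)

v_observed = 7.5                                                  # new thermocouple reading (assumed)
t_calibrated = brentq(lambda t: curve(t) - v_observed, 0.0, 500.0)
print(f"voltage {v_observed} converts to roughly {t_calibrated:.1f} degrees")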
Calibration Uncertainty: Again, as with prediction, the data used to fit the process model can also be used to determine the uncertainty in the calibration. Both the variation in the estimated model parameters and in the new voltage observation need to be accounted for. This is similar to the uncertainty for the prediction of a new measurement. In fact, calibration intervals are computed by solving for the predictor variable value in the formulas for the prediction interval endpoints. The plot below shows a 99% calibration interval for the original calibration data used in the first plot on this page. The area of interest in the plot has been magnified so the endpoints of the interval can be visually differentiated. The calibration interval is 387.3748 +/- 0.307 degrees Celsius.

In almost all calibration applications the ultimate quantity of interest is the true value of the primary-scale measurement method associated with a measurement made on the secondary scale. As a result, there are no analogs of the prediction interval or tolerance interval in calibration.

More Info: More information on the construction and interpretation of calibration intervals can be found in Section 5.2 of this chapter. There is also more information on calibration, especially "one-point" calibrations and other special cases, in Section 3 of Chapter 2: Measurement Process Characterization.
4.1.3.4. Optimization

More on Optimization: As mentioned earlier, the goal of optimization is to determine the necessary process input values to obtain a desired output. Like calibration, optimization involves substitution of an output value for the response variable and solving for the associated predictor variable values. The process model is again the link that ties the inputs and output together. Unlike calibration and prediction, however, successful optimization requires a cause-and-effect relationship between the predictors and the response variable. Designed experiments, run in a randomized order, must be used to ensure that the process model represents a cause-and-effect relationship between the variables. Quadratic models are typically used, along with standard calculus techniques for finding minimums and maximums, to carry out an optimization. Other techniques can also be used, however. The example discussed below includes a graphical depiction of the optimization process.

Example: In a manufacturing process that requires a chemical reaction to take place, the temperature and pressure under which the process is carried out can affect reaction time. To maximize the throughput of this process, an optimization experiment was carried out in the neighborhood of the conditions felt to be best, using a central composite design with 13 runs. Calculus was used to determine the input values associated with local extremes in the regression function. The plot below shows the quadratic surface that was fit to the data and conceptually how the input values associated with the maximum throughput are found.

As with prediction and calibration, randomness in the data and the need to sample data from the process affect the results. If the optimization experiment were carried out again under identical conditions, the optimal input values computed using the model would be slightly different. Thus, it is important to understand how much random variability there is in the results in order to interpret the results correctly.

[Plot: Optimization Result from Repeated Experiment]
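A hedged sketch of the calculus-based optimization step: fit a full quadratic model in two coded inputs by least squares and solve the gradient equations for the stationary point. The 13 design points and responses below are simulated, not the handbook's central composite experiment.

# Hedged sketch: quadratic response surface fit and calculus-based optimum.
import numpy as np

rng = np.random.default_rng(1)
temp = rng.uniform(-1, 1, 13)          # coded temperature settings (assumed)
pres = rng.uniform(-1, 1, 13)          # coded pressure settings (assumed)
throughput = (200 + 5*temp + 3*pres - 8*temp**2 - 6*pres**2 - 2*temp*pres
              + rng.normal(0, 1, 13))  # simulated responses

# Design matrix for f = b0 + b1*t + b2*p + b3*t^2 + b4*p^2 + b5*t*p
X = np.column_stack([np.ones_like(temp), temp, pres, temp**2, pres**2, temp*pres])
b = np.linalg.lstsq(X, throughput, rcond=None)[0]

# Stationary point: set both partial derivatives to zero and solve the
# resulting 2x2 linear system.
A = np.array([[2*b[3], b[5]],
              [b[5],   2*b[4]]])
stationary = np.linalg.solve(A, -b[1:3])
print("estimated optimum (coded units):", stationary)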
Optimization Uncertainty: As with prediction and calibration, the uncertainty in the input values estimated to maximize throughput can also be computed from the data used to fit the model. Unlike prediction or calibration, however, optimization almost always involves simultaneous estimation of several quantities, the values of the process inputs. As a result, we will compute a joint confidence region for all of the input values, rather than separate uncertainty intervals for each input. This confidence region will contain the complete set of true process inputs that will maximize throughput with high probability. The plot below shows the contours of equal throughput on a map of various possible input value combinations. The solid contours show throughput while the dashed contour in the center encloses the plausible combinations of input values that yield optimum results. The "+" marks the estimated optimum value. The dashed region is a 95% joint confidence region for the two process inputs. In this region the throughput of the process will be approximately 217 units/hour.

[Plot: Contour Plot, Estimated Optimum & Confidence Region]

More Info: Computational details for optimization are primarily presented in Chapter 5: Process Improvement, along with material on appropriate experimental designs for optimization. Section 5.5.3 specifically focuses on optimization methods and their associated uncertainties.
4.1.4. What are some of the different statistical methods for model building?
Selecting an Appropriate Stat Method (General Case): For many types of data analysis problems there are no more than a couple of general approaches to be considered on the route to the problem's solution. For example, there is often a dichotomy between highly-efficient methods appropriate for data with noise from a normal distribution and more general methods for data with other types of noise. Within the different approaches for a specific problem type, there are usually at most a few competing statistical tools that can be used to obtain an appropriate solution. The bottom line for most types of data analysis problems is that selection of the best statistical method to solve the problem is largely determined by the goal of the analysis and the nature of the data.
Selecting an Appropriate Stat Method (Modeling): Model building, however, is different from most other areas of statistics with regard to method selection. There are more general approaches and more competing techniques available for model building than for most other types of problems. There is often more than one statistical tool that can be effectively applied to a given modeling application. The large menu of methods applicable to modeling problems means that there is both more opportunity for effective and efficient solutions and more potential to spend time doing different analyses, comparing different solutions and mastering the use of different tools. The remainder of this section will introduce and briefly discuss some of the most popular and well-established statistical techniques that are useful for different model building situations.
Process Modeling Methods:
1. Linear Least Squares Regression
2. Nonlinear Least Squares Regression
3. Weighted Least Squares Regression
4. LOESS (aka LOWESS)
4.1.4.1. Linear Least Squares Regression

Modeling Workhorse: Linear least squares regression is by far the most widely used modeling method. It is what most people mean when they say they have used "regression", "linear regression" or "least squares" to fit a model to their data. Not only is linear least squares regression the most widely used modeling method, but it has been adapted to a broad range of situations that are outside its direct scope. It plays a strong underlying role in many other modeling methods, including the other methods discussed in this section: nonlinear least squares regression, weighted least squares regression and LOESS.

Definition of a Linear Least Squares Model: Used directly, with an appropriate data set, linear least squares regression can be used to fit the data with any function of the form

    f(x;β) = β0 + β1x1 + β2x2 + ...

in which
1. each explanatory variable in the function is multiplied by an unknown parameter,
2. there is at most one unknown parameter with no corresponding explanatory variable, and
3. all of the individual terms are summed to produce the final function value.
In statistical terms, any function that meets these criteria would be called a "linear function". The term "linear" is used, even though the function may not be a straight line, because if the unknown parameters are considered to be variables and the explanatory variables are considered to be known coefficients corresponding to those "variables", then the problem becomes a system (usually overdetermined) of linear equations that can be solved for the values of the unknown parameters. To differentiate the various meanings of the word "linear", the linear models being discussed here are often said to be "linear in the parameters" or "statistically linear".

Why "Least Squares"?: Linear least squares regression also gets its name from the way the estimates of the unknown parameters are computed. The "method of least squares" that is used to obtain parameter estimates was independently developed in the late 1700's and the early 1800's by the mathematicians Karl Friedrich Gauss, Adrien Marie Legendre and (possibly) Robert Adrain [Stigler (1978)] [Harter (1983)] [Stigler (1986)] working in Germany, France and America, respectively. In the least squares method the unknown parameters are estimated by minimizing the sum of the squared deviations between the data and the model. The minimization process reduces the overdetermined system of n equations formed by the data to a sensible system of p (where p is the number of parameters in the functional part of the model) equations in p unknowns. This new system of equations is then solved to obtain the parameter estimates. To learn more about how the method of least squares is used to estimate the parameters, see Section 4.4.3.1.

Examples of Linear Functions: As just mentioned above, linear models are not limited to being straight lines or planes, but include a fairly wide range of shapes. For example, a simple quadratic curve

    f(x;β) = β0 + β1x + β11x^2

is linear in the statistical sense. A straight-line model in log(x),

    f(x;β) = β0 + β1ln(x),

or a polynomial in sin(x),

    f(x;β) = β0 + β1sin(x) + β2sin(2x) + β3sin(3x),

is also linear in the statistical sense because they are linear in the parameters, though not with respect to the observed explanatory variable, x.
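A short, hedged sketch of fitting one of these statistically linear models: the quadratic curve is nonlinear in x but linear in the parameters, so the overdetermined system of equations can be solved directly by linear least squares (here with numpy's lstsq on simulated data).

# Hedged sketch: fitting f(x;b) = b0 + b1*x + b11*x^2 by the method of least
# squares. One design-matrix column per parameter; data are simulated.
import numpy as np

x = np.linspace(0, 10, 20)
rng = np.random.default_rng(2)
y = 1.5 + 2.0*x - 0.15*x**2 + rng.normal(0, 0.3, x.size)   # simulated observations

X = np.column_stack([np.ones_like(x), x, x**2])            # one column per parameter
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)           # least squares estimates
print("estimated b0, b1, b11:", beta_hat)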
Nonlinear Model Example: Just as models that are linear in the statistical sense do not have to be linear with respect to the explanatory variables, nonlinear models can be linear with respect to the explanatory variables, but not with respect to the parameters. For example,

    f(x;β) = β0 + β0β1x

is linear in x, but it cannot be written in the general form of a linear model presented above. This is because the slope of this line is expressed as the product of two parameters. As a result, nonlinear least squares regression could be used to fit this model, but linear least squares cannot be used. For further examples and discussion of nonlinear models see the next section, Section 4.1.4.2.

Advantages of Linear Least Squares: Linear least squares regression has earned its place as the primary tool for process modeling because of its effectiveness and completeness.

Though there are types of data that are better described by functions that are nonlinear in the parameters, many processes in science and engineering are well-described by linear models. This is because either the processes are inherently linear or because, over short ranges, any process can be well-approximated by a linear model.

The estimates of the unknown parameters obtained from linear least squares regression are the optimal estimates from a broad class of possible parameter estimates under the usual assumptions used for process modeling. Practically speaking, linear least squares regression makes very efficient use of the data. Good results can be obtained with relatively small data sets.

Finally, the theory associated with linear regression is well-understood and allows for construction of different types of easily-interpretable statistical intervals for predictions, calibrations, and optimizations. These statistical intervals can then be used to give clear answers to scientific and engineering questions.

Disadvantages of Linear Least Squares: The main disadvantages of linear least squares are limitations in the shapes that linear models can assume over long ranges, possibly poor extrapolation properties, and sensitivity to outliers.

Linear models with nonlinear terms in the predictor variables curve relatively slowly, so for inherently nonlinear processes it becomes increasingly difficult to find a linear model that fits the data well as the range of the data increases. As the explanatory variables become extreme, the output of the linear model will also always be more extreme. This means that linear models may not be effective for extrapolating the results of a process for which data cannot be collected in the region of interest. Of course extrapolation is potentially dangerous regardless of the model type.

Finally, while the method of least squares often gives optimal estimates of the unknown parameters, it is very sensitive to the presence of unusual data points in the data used to fit a model. One or two outliers can sometimes seriously skew the results of a least squares analysis. This makes model validation, especially with respect to outliers, critical to obtaining sound answers to the questions motivating the construction of the model.
4.1.4.2. Nonlinear Least Squares Regression

Extension of Linear Least Squares Regression: Nonlinear least squares regression extends linear least squares regression for use with a much larger and more general class of functions. Almost any function that can be written in closed form can be incorporated in a nonlinear regression model. Unlike linear regression, there are very few limitations on the way parameters can be used in the functional part of a nonlinear regression model. The way in which the unknown parameters in the function are estimated, however, is conceptually the same as it is in linear least squares regression.

Definition of a Nonlinear Regression Model: As the name suggests, a nonlinear model is any model of the basic form

    y = f(x;β) + ε

in which
1. the functional part of the model is not linear with respect to the unknown parameters, and
2. the method of least squares is used to estimate the values of the unknown parameters.

Due to the way in which the unknown parameters of the function are usually estimated, however, it is often much easier to work with models that meet two additional criteria:
3. the function is smooth with respect to the unknown parameters, and
4. the least squares criterion that is used to obtain the parameter estimates has a unique solution.
These last two criteria are not essential parts of the definition of a nonlinear least squares model, but are of practical importance.

Examples of Nonlinear Models: Some examples of nonlinear models include exponential, rational, and other functions in which the parameters enter nonlinearly; one such model is sketched below.
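As a concrete, hedged illustration of such a model, the sketch below fits an exponential function y = b0*exp(-b1*x) + b2 by nonlinear least squares with scipy's curve_fit; the data and starting values are assumed. Note the user-supplied starting values, a point discussed further under the disadvantages below.

# Hedged sketch: nonlinear least squares for an assumed exponential model.
import numpy as np
from scipy.optimize import curve_fit

def model(x, b0, b1, b2):
    return b0 * np.exp(-b1 * x) + b2

rng = np.random.default_rng(3)
x = np.linspace(0, 10, 30)
y = model(x, 5.0, 0.7, 1.0) + rng.normal(0, 0.1, x.size)   # simulated data

start = [4.0, 1.0, 0.5]                   # starting values must be reasonably close
estimates, covariance = curve_fit(model, x, y, p0=start)
print("estimated b0, b1, b2:", estimates)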
Advantages of Nonlinear Least Squares: The biggest advantage of nonlinear least squares regression over many other techniques is the broad range of functions that can be fit. Although many scientific and engineering processes can be described well using linear models, or other relatively simple types of models, there are many other processes that are inherently nonlinear. For example, the strengthening of concrete as it cures is a nonlinear process. Research on concrete strength shows that the strength increases quickly at first and then levels off, or approaches an asymptote in mathematical terms, over time. Linear models do not describe processes that asymptote very well because for all linear functions the function value can't increase or decrease at a declining rate as the explanatory variables go to the extremes. There are many types of nonlinear models, on the other hand, that describe the asymptotic behavior of a process well. Like the asymptotic behavior of some processes, other features of physical processes can often be expressed more easily using nonlinear models than with simpler model types.

Being a "least squares" procedure, nonlinear least squares has some of the same advantages (and disadvantages) that linear least squares regression has over other methods. One common advantage is efficient use of data. Nonlinear regression can produce good estimates of the unknown parameters in the model with relatively small data sets. Another advantage that nonlinear least squares shares with linear least squares is a fairly well-developed theory for computing confidence, prediction and calibration intervals to answer scientific and engineering questions. In most cases the probabilistic interpretation of the intervals produced by nonlinear regression is only approximately correct, but these intervals still work very well in practice.

Disadvantages of Nonlinear Least Squares: The major cost of moving to nonlinear least squares regression from simpler modeling techniques like linear least squares is the need to use iterative optimization procedures to compute the parameter estimates. With functions that are linear in the parameters, the least squares estimates of the parameters can always be obtained analytically, while that is generally not the case with nonlinear models. The use of iterative procedures requires the user to provide starting values for the unknown parameters before the software can begin the optimization. The starting values must be reasonably close to the as yet unknown parameter estimates or the optimization procedure may not converge. Bad starting values can also cause the software to converge to a local minimum rather than the global minimum that defines the least squares estimates.

Disadvantages shared with the linear least squares procedure include a strong sensitivity to outliers. Just as in a linear least squares analysis, the presence of one or two outliers in the data can seriously affect the results of a nonlinear analysis. In addition there are unfortunately fewer model validation tools for the detection of outliers in nonlinear regression than there are for linear regression.
4.1.4.3. Weighted Least Squares Regression

Handles Cases Where Data Quality Varies: One of the common assumptions underlying most process modeling methods, including linear and nonlinear least squares regression, is that each data point provides equally precise information about the deterministic part of the total process variation. In other words, the standard deviation of the error term is constant over all values of the predictor or explanatory variables. This assumption, however, clearly does not hold, even approximately, in every modeling application. For example, in the semiconductor photomask linespacing data shown below, it appears that the precision of the linespacing measurements decreases as the line spacing increases. In situations like this, when it may not be reasonable to assume that every observation should be treated equally, weighted least squares can often be used to maximize the efficiency of parameter estimation. This is done by attempting to give each data point its proper amount of influence over the parameter estimates. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence.

[Plot: Linespacing Measurement Error Data]

Model Types and Weighted Least Squares: Unlike linear and nonlinear least squares regression, weighted least squares regression is not associated with a particular type of function used to describe the relationship between the process variables. Instead, weighted least squares reflects the behavior of the random errors in the model, and it can be used with functions that are either linear or nonlinear in the parameters. It works by incorporating extra nonnegative constants, or weights, associated with each data point, into the fitting criterion. The size of the weight indicates the precision of the information contained in the associated observation. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. It is important to note that the weight for each observation is given relative to the weights of the other observations, so different sets of absolute weights can have identical effects.

Advantages of Weighted Least Squares: Like all of the least squares methods discussed so far, weighted least squares is an efficient method that makes good use of small data sets. It also shares the ability to provide different types of easily interpretable statistical intervals for estimation, prediction, calibration and optimization. In addition, as discussed above, the main advantage that weighted least squares enjoys over other methods is the ability to handle regression situations in which the data points are of varying quality. If the standard deviation of the random errors in the data is not constant across all levels of the explanatory variables, using weighted least squares with weights that are inversely proportional to the variance at each level of the explanatory variables yields the most precise parameter estimates possible.

Disadvantages of Weighted Least Squares: The biggest disadvantage of weighted least squares, which many people are not aware of, is probably the fact that the theory behind this method is based on the assumption that the weights are known exactly. This is almost never the case in real applications, of course, so estimated weights must be used instead. The effect of using estimated weights is difficult to assess, but experience indicates that small variations in the weights due to estimation do not often affect a regression analysis or its interpretation. However, when the weights are estimated from small numbers of replicated observations, the results of an analysis can be very badly and unpredictably affected. This is especially likely to be the case when the weights for extreme values of the predictor or explanatory variables are estimated using only a few observations. It is important to remain aware of this potential problem, and to only use weighted least squares when the weights can be estimated precisely relative to one another [Carroll and Ruppert (1988), Ryan (1997)].

Weighted least squares regression, like the other least squares methods, is also sensitive to the effects of outliers. If potential outliers are not investigated and dealt with appropriately, they will likely have a negative impact on the parameter estimation and other aspects of a weighted least squares analysis. If a weighted least squares regression actually increases the influence of an outlier, the results of the analysis may be far inferior to an unweighted least squares analysis.

Further Information: Further information on the weighted least squares fitting criterion can be found in Section 4.3. Discussion of methods for weight estimation can be found in Section 4.5.
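A hedged sketch of weighted least squares for a straight line when the error standard deviation grows with the predictor: the weights are taken inversely proportional to the variance at each point (assumed known here, estimated in practice), and the fit is computed by rescaling each row by the square root of its weight.

# Hedged sketch: weighted least squares with weights proportional to 1/variance.
import numpy as np

rng = np.random.default_rng(4)
x = np.linspace(1, 10, 40)
sigma = 0.1 * x                                   # noise grows with x (assumed)
y = 2.0 + 0.5 * x + rng.normal(0, sigma)

w = 1.0 / sigma**2                                # weights ~ 1 / variance
X = np.column_stack([np.ones_like(x), x])

sw = np.sqrt(w)
beta_w, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
print("weighted least squares estimates (b0, b1):", beta_w)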
4.1.4.4. LOESS (aka LOWESS)

Useful When the Function Is Unknown & Complicated: LOESS is one of many "modern" modeling methods that build on "classical" methods, such as linear and nonlinear least squares regression. Modern regression methods are designed to address situations in which the classical procedures do not perform well or cannot be effectively applied without undue labor. LOESS combines much of the simplicity of linear least squares regression with the flexibility of nonlinear regression. It does this by fitting simple models to localized subsets of the data to build up a function that describes the deterministic part of the variation in the data, point by point. In fact, one of the chief attractions of this method is that the data analyst is not required to specify a global function of any form to fit a model to the data, only to fit segments of the data.

The trade-off for these features is increased computation. Because it is so computationally intensive, LOESS would have been practically impossible to use in the era when least squares regression was being developed. Most other modern methods for process modeling are similar to LOESS in this respect. These methods have been consciously designed to use our current computational ability to the fullest possible advantage to achieve goals not easily achieved by traditional approaches.

Definition of a LOESS Model: LOESS, originally proposed by Cleveland (1979) and further developed by Cleveland and Devlin (1988), specifically denotes a method that is (somewhat) more descriptively known as locally weighted polynomial regression. At each point in the data set a low-degree polynomial is fit to a subset of the data, with explanatory variable values near the point whose response is being estimated. The polynomial is fit using weighted least squares, giving more weight to points near the point whose response is being estimated and less weight to points further away. The value of the regression function for the point is then obtained by evaluating the local polynomial using the explanatory variable values for that data point. The LOESS fit is complete after regression function values have been computed for each of the n data points. Many of the details of this method, such as the degree of the polynomial model and the weights, are flexible. The range of choices for each part of the method and typical defaults are briefly discussed next.

Localized Subsets of Data: The subsets of data used for each weighted least squares fit in LOESS are determined by a nearest neighbors algorithm. A user-specified input to the procedure called the "bandwidth" or "smoothing parameter" determines how much of the data is used to fit each local polynomial. The smoothing parameter, q, is a number between (d+1)/n and 1, with d denoting the degree of the local polynomial. The value of q is the proportion of data used in each fit. The subset of data used in each weighted least squares fit is comprised of the nq (rounded to the next largest integer) points whose explanatory variable values are closest to the point at which the response is being estimated.

q is called the smoothing parameter because it controls the flexibility of the LOESS regression function. Large values of q produce the smoothest functions that wiggle the least in response to fluctuations in the data. The smaller q is, the closer the regression function will conform to the data. Using too small a value of the smoothing parameter is not desirable, however, since the regression function will eventually start to capture the random error in the data. Useful values of the smoothing parameter typically lie in the range 0.25 to 0.5 for most LOESS applications.
Degree of Local Polynomials: The local polynomials fit to each subset of the data are almost always of first or second degree; that is, either locally linear (in the straight line sense) or locally quadratic. Using a zero degree polynomial turns LOESS into a weighted moving average. Such a simple local model might work well for some situations, but may not always approximate the underlying function well enough. Higher-degree polynomials would work in theory, but yield models that are not really in the spirit of LOESS. LOESS is based on the ideas that any function can be well approximated in a small neighborhood by a low-order polynomial and that simple models can be fit to data easily. High-degree polynomials would tend to overfit the data in each subset and are numerically unstable, making accurate computations difficult.

Weight Function: As mentioned above, the weight function gives the most weight to the data points nearest the point of estimation and the least weight to the data points that are furthest away. The use of the weights is based on the idea that points near each other in the explanatory variable space are more likely to be related to each other in a simple way than points that are further apart. Following this logic, points that are likely to follow the local model best influence the local model parameter estimates the most. Points that are less likely to actually conform to the local model have less influence on the local model parameter estimates.

The traditional weight function used for LOESS is the tri-cube weight function,

    w(x) = (1 - |x|^3)^3  for |x| < 1,  and  w(x) = 0  otherwise.

However, any other weight function that satisfies the properties listed in Cleveland (1979) could also be used. The weight for a specific point in any localized subset of data is obtained by evaluating the weight function at the distance between that point and the point of estimation, after scaling the distance so that the maximum absolute distance over all of the points in the subset of data is exactly one.

Examples: A simple computational example is given here to further illustrate exactly how LOESS works. A more realistic example, showing a LOESS model used for thermocouple calibration, can be found in Section 4.1.3.3.

Advantages of LOESS: As discussed above, the biggest advantage LOESS has over many other methods is the fact that it does not require the specification of a function to fit a model to all of the data in the sample. Instead the analyst only has to provide a smoothing parameter value and the degree of the local polynomial. In addition, LOESS is very flexible, making it ideal for modeling complex processes for which no theoretical models exist. These two advantages, combined with the simplicity of the method, make LOESS one of the most attractive of the modern regression methods for applications that fit the general framework of least squares regression but which have a complex deterministic structure.

Although it is less obvious than for some of the other methods related to linear least squares regression, LOESS also accrues most of the benefits typically shared by those procedures. The most important of those is the theory for computing uncertainties for prediction and calibration. Many other tests and procedures used for validation of least squares models can also be extended to LOESS models.

Disadvantages of LOESS: Although LOESS does share many of the best features of other least squares methods, efficient use of data is one advantage that LOESS doesn't share. LOESS requires fairly large, densely sampled data sets in order to produce good models. This is not really surprising, however, since LOESS needs good empirical information on the local structure of the process in order to perform the local fitting. In fact, given the results it provides, LOESS could arguably be more efficient overall than other methods like nonlinear least squares. It may simply frontload the costs of an experiment in data collection but then reduce analysis costs.

Another disadvantage of LOESS is the fact that it does not produce a regression function that is easily represented by a mathematical formula. This can make it difficult to transfer the results of an analysis to other people. In order to transfer the regression function to another person, they would need the data set and software for LOESS calculations. In nonlinear regression, on the other hand, it is only necessary to write down a functional form in order to provide estimates of the unknown parameters and the estimated uncertainty. Depending on the application, this could be either a major or a minor drawback to using LOESS.
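A hedged sketch of a single LOESS step, using the tri-cube weight function and a locally linear fit to the nearest-neighbor subset; a full LOESS fit repeats this at every data point, and robustness iterations and other refinements are omitted. The data are simulated.

# Hedged sketch: estimate the regression function at one point x0 by fitting a
# locally linear model to the nq nearest neighbors with tri-cube weights.
import numpy as np

def tricube(u):
    u = np.abs(u)
    return np.where(u < 1, (1 - u**3)**3, 0.0)

def loess_at_point(x, y, x0, q=0.5, degree=1):
    n = len(x)
    k = int(np.ceil(q * n))                        # number of points in the local subset
    dist = np.abs(x - x0)
    idx = np.argsort(dist)[:k]                     # nearest neighbors of x0
    w = tricube(dist[idx] / dist[idx].max())       # scale so the largest distance is 1
    # Weighted least squares fit of the local polynomial, then evaluate at x0.
    coeffs = np.polyfit(x[idx], y[idx], deg=degree, w=np.sqrt(w))
    return np.polyval(coeffs, x0)

rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 10, 60))
y = np.sin(x) + rng.normal(0, 0.2, x.size)         # simulated data
print("LOESS estimate at x0=5:", loess_at_point(x, y, 5.0))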
Finally, as discussed above, LOESS is a computationally intensive method. This is not usually a problem in our current computing environment, however, unless the data sets being used are very large. LOESS is also prone to the effects of outliers in the data set, like other least squares methods. There is an iterative, robust version of LOESS [Cleveland (1979)] that can be used to reduce LOESS' sensitivity to outliers, but extreme outliers can still overcome even the robust method.

4.2. Underlying Assumptions for Process Modeling

Implicit Assumptions Underlie Most Actions: Most, if not all, thoughtful actions that people take are based on ideas, or assumptions, about how those actions will affect the goals they want to achieve. The actual assumptions used to decide on a particular course of action are rarely laid out explicitly, however. Instead, they are only implied by the nature of the action itself. Implicit assumptions are inherent to process modeling actions, just as they are to most other types of action. It is important to understand what the implicit assumptions are for any process modeling method because the validity of these assumptions affects whether or not the goals of the analysis will be met.

Checking Assumptions Provides Feedback on Actions: If the implicit assumptions that underlie a particular action are not true, then that action is not likely to meet expectations either. Sometimes it is abundantly clear when a goal has been met, but unfortunately that is not always the case. In particular, it is usually not possible to obtain immediate feedback on the attainment of goals in most process modeling applications. The goals of process modeling, such as answering a scientific or engineering question, depend on the correctness of a process model, which can often only be directly and absolutely determined over time. In lieu of immediate, direct feedback, however, indirect information on the effectiveness of a process modeling analysis can be obtained by checking the validity of the underlying assumptions. Confirming that the underlying assumptions are valid helps ensure that the methods of analysis were appropriate and that the results will be consistent with the goals.

Overview of Section 4.2: This section discusses the specific underlying assumptions associated with most model-fitting methods. In discussing the underlying assumptions, some background is also provided on the consequences of stopping the modeling process short of completion and leaving the results of an analysis at odds with the underlying assumptions. Specific data analysis methods that can be used to check whether or not the assumptions hold in a particular case are discussed in Section 4.4.4.
Contents of Section 4.2:
1. What are the typical underlying assumptions in process modeling?
   1. The process is a statistical process.
   2. The means of the random errors are zero.
   3. The random errors have a constant standard deviation.
   4. The random errors follow a normal distribution.
   5. The data are randomly sampled from the process.
   6. The explanatory variables are observed without error.

4.2.1. What are the typical underlying assumptions in process modeling?

Overview of Section 4.2.1: This section lists the typical assumptions underlying most process modeling methods. On each of the following pages, one of the six major assumptions is described individually; the reasons for its importance are also briefly discussed; and any methods that are not subject to that particular assumption are noted. As discussed on the previous page, these are implicit assumptions based on properties inherent to the process modeling methods themselves. Successful use of these methods in any particular application hinges on the validity of the underlying assumptions, whether their existence is acknowledged or not. Section 4.4.4 discusses methods for checking the validity of these assumptions.

Typical Assumptions for Process Modeling:
1. The process is a statistical process.
2. The means of the random errors are zero.
3. The random errors have a constant standard deviation.
4. The random errors follow a normal distribution.
5. The data are randomly sampled from the process.
6. The explanatory variables are observed without error.
4.2.1.1. The process is a statistical process.

"Statistical" Implies Random Variation: The most basic assumption inherent to all statistical methods for process modeling is that the process to be described is actually a statistical process. This assumption seems so obvious that it is sometimes overlooked by analysts immersed in the details of a process or in a rush to uncover information of interest from an exciting new data set. However, in order to successfully model a process using statistical methods, it must include random variation. Random variation is what makes the process statistical rather than purely deterministic.

Role of Random Variation: The overall goal of all statistical procedures, including those designed for process modeling, is to enable valid conclusions to be drawn from noisy data. As a result, statistical procedures are designed to compare apparent effects found in a data set to the noise in the data in order to determine whether the effects are more likely to be caused by a repeatable underlying phenomenon of some sort or by fluctuations in the data that happened by chance. Thus the random variation in the process serves as a baseline for drawing conclusions about the nature of the deterministic part of the process. If there were no random noise in the process, then conclusions based on statistical methods would no longer make sense or be appropriate.

This Assumption Usually Valid: Fortunately this assumption is valid for most physical processes. There will be random error in the measurements almost any time things need to be measured. In fact, there are often other sources of random error, over and above measurement error, in complex, real-life processes. However, examples of non-statistical processes include
1. physical processes in which the random error is negligible compared to the systematic errors,
2. processes based on deterministic computer simulations,
3. processes based on theoretical calculations.
If models of these types of processes are needed, use of mathematical rather than statistical process modeling tools would be more appropriate.

Distinguishing Process Types: One sure indicator that a process is statistical is if repeated observations of the process response under a particular fixed condition yield different results. The converse, repeated observations of the process response always yielding the same value, is not a sure indication of a non-statistical process, however. For example, in some types of computations in which complex numerical methods are used to approximate the solutions of theoretical equations, the results of a computation might deviate from the true solution in an essentially random way because of the interactions of round-off errors, multiple levels of approximation, stopping rules, and other sources of error. Even so, the result of the computation might be the same each time it is repeated because all of the initial conditions of the calculation are reset to the same values each time the calculation is made. As a result, scientific or engineering knowledge of the process must also always be used to determine whether or not a given process is statistical.
Other processes may be less easily dealt with, being subject to


measurement drift or other systematic errors. For these processes it
may be possible to eliminate or at least reduce the effects of the
4. Process Modeling systematic errors by using good experimental design techniques, such
4.2. Underlying Assumptions for Process Modeling as randomization of the measurement order. Randomization can
4.2.1. What are the typical underlying assumptions in process modeling? effectively convert systematic measurement errors into additional
random process error. While adding to the random error of the process
is undesirable, this will provide the best possible information from the
4.2.1.2. The means of the random errors are data about the regression function, which is the current goal.
zero. In the most difficult processes even good experimental design may not
be able to salvage a set of data that includes a high level of systematic
Parameter To be able to estimate the unknown parameters in the regression error. In these situations the best that can be hoped for is recognition of
Estimation function, it is necessary to know how the data at each point in the the fact that the true regression function has not been identified by the
Requires explanatory variable space relate to the corresponding value of the analysis. Then effort can be put into finding a better way to solve the
Known regression function. For example, if the measurement system used to problem by correcting for the systematic error using additional
Relationship observe the values of the response variable drifts over time, then the information, redesigning the measurement system to eliminate the
Between deterministic variation in the data would be the sum of the drift systematic errors, or reformulating the problem to obtain the needed
Data and function and the true regression function. As a result, either the data information another way.
Regression would need to be adjusted prior to fitting the model or the fitted model
Function would need to be adjusted after the fact to obtain the regression Assumption Another more subtle violation of this assumption occurs when the
function. In either case, information about the form of the drift function Violated by explanatory variables are observed with random error. Although it
would be needed. Since it would be difficult to generalize an activity Errors in intuitively seems like random errors in the explanatory variables should
like drift correction to a generic process, and since it would also be Observation cancel out on average, just as random errors in the observation of the
unnecessary for many processes, most process modeling methods rely of response variable do, that is unfortunately not the case. The direct
on having data in which the observed responses are directly equal, on linkage between the unknown parameters and the explanatory variables
average, to the regression function values. Another way of expressing in the functional part of the model makes this situation much more
this idea is to say the mean of the random errors at each combination of complicated than it is for the random errors in the response variable .
explanatory variable values is zero. More information on why this occurs can be found in Section 4.2.1.6.

Validity of The validity of this assumption is determined by both the nature of the
Assumption process and, to some extent, by the data collection methods used. The
Improved by process may be one in which the data are easily measured and it will be
Experimental clear that the data have a direct relationship to the regression function.
Design When this is the case, use of optimal methods of data collection is not
critical to the success of the modeling effort. Of course, it is rarely
known that this will be the case for sure, so it is usually worth the effort
to collect the data in the best way possible.
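The randomization idea discussed in this subsection can be sketched in a few lines of code. The drift rate, noise level, and ten-point design below are illustrative assumptions; the point is that a randomized run order attaches the drift to the time of measurement rather than to the value of the explanatory variable.

    # Sketch: randomizing the measurement order so that drift is not
    # confounded with the explanatory variable (values are illustrative).
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0.0, 1.0, 10)        # planned settings, in natural order
    run_order = rng.permutation(len(x))  # run_order[t] = setting run at time t
    drift = 0.05 * np.arange(len(x))     # drift that grows with run number

    time_of_run = np.empty(len(x), dtype=int)
    time_of_run[run_order] = np.arange(len(x))   # when each setting was run

    y_true = 1.0 + 2.0 * x
    y_obs = y_true + drift[time_of_run] + rng.normal(scale=0.02, size=len(x))
    # With the randomized order, the drift contribution is spread over all
    # values of x instead of increasing systematically with x.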


Assumption The assumption that the random errors have constant standard deviation
Not Needed is not implicit to weighted least squares regression. Instead, it is
for Weighted assumed that the weights provided in the analysis correctly indicate the
4. Process Modeling Least differing levels of variability present in the response variables. The
4.2. Underlying Assumptions for Process Modeling Squares weights are then used to adjust the amount of influence each data point
4.2.1. What are the typical underlying assumptions in process modeling? has on the estimates of the model parameters to an appropriate level.
They are also used to adjust prediction and calibration uncertainties to
the correct levels for different regions of the data set.
4.2.1.3. The random errors have a constant
Assumption Even though it uses weighted least squares to estimate the model
standard deviation. Does Apply parameters, LOESS still relies on the assumption of a constant standard
to LOESS deviation. The weights used in LOESS actually reflect the relative level
All Data Due to the presence of random variation, it can be difficult to determine of similarity between mean response values at neighboring points in the
Treated whether or not all of the data in a data set are of equal quality. As a explanatory variable space rather than the level of response precision at
Equally by result, most process modeling procedures treat all of the data equally each set of explanatory variable values. Actually, because LOESS uses
Most when using it to estimate the unknown parameters in the model. Most separate parameter estimates in each localized subset of data, it does not
Process methods also use a single estimate of the amount of random variability require the assumption of a constant standard deviation of the data for
Modeling in the data for computing prediction and calibration uncertainties. parameter estimation. The subsets of data used in LOESS are usually
Methods Treating all of the data in the same way also yields simpler, small enough that the precision of the data is roughly constant within
easier-to-use models. Not surprisingly, however, the decision to treat the each subset. LOESS normally makes no provisions for adjusting
data like this can have a negative effect on the quality of the resulting uncertainty computations for differing levels of precision across a data
model too, if it turns out the data are not all of equal quality. set, however.

Data Of course data quality can't be measured point-by-point since it is clear


Quality from direct observation of the data that the amount of error in each point
Measured by varies. Instead, points that have the same underlying average squared
Standard error, or variance, are considered to be of equal quality. Even though
Deviation the specific process response values observed at points that meet this
criterion will have different errors, the data collected at such points will
be of equal quality over repeated data collections. Since the standard
deviation of the data at each set of explanatory variable values is simply
the square root of its variance, the standard deviation of the data for
each different combination of explanatory variables can also be used to
measure data quality. The standard deviation is preferred, in fact,
because it has the advantage of being measured in the same units as the
response variable, making it easier to relate to this statistic.
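As a rough sketch of how the weights enter a weighted least squares fit (the data and standard deviations below are invented for illustration), the weights are commonly taken as the reciprocals of the point-by-point variances:

    # Sketch: weighted least squares for a straight line with weights 1/variance.
    import numpy as np

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([1.1, 1.9, 3.2, 3.9, 5.3])
    sd = np.array([0.1, 0.1, 0.2, 0.4, 0.8])     # differing response precision

    w = 1.0 / sd**2                              # weights ~ 1 / variance
    X = np.column_stack([np.ones_like(x), x])    # design matrix for b0 + b1*x
    W = np.diag(w)

    # Solve the weighted normal equations (X'WX) b = X'Wy.
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    print(beta)                                  # [b0_hat, b1_hat]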


Non-Normal Of course, if it turns out that the random errors in the process are not
Random normally distributed, then any inferences made about the process may
Errors May be incorrect. If the true distribution of the random errors is such that
4. Process Modeling Result in the scatter in the data is less than it would be under a normal
4.2. Underlying Assumptions for Process Modeling Incorrect distribution, it is possible that the intervals used to capture the values
4.2.1. What are the typical underlying assumptions in process modeling? Inferences of the process parameters will simply be a little longer than necessary.
The intervals will then contain the true process parameters more often
than expected. It is more likely, however, that the intervals will be too
4.2.1.4. The random errors follow a normal short or will be shifted away from the true mean value of the process
parameter being estimated. This will result in intervals that contain the
distribution. true process parameters less often than expected. When this is the case,
the intervals produced under the normal distribution assumption will
Primary Need After fitting a model to the data and validating it, scientific or likely lead to incorrect conclusions being drawn about the process.
for engineering questions about the process are usually answered by
Distribution computing statistical intervals for relevant process quantities using the Parameter The methods used for parameter estimation can also imply the
Information is model. These intervals give the range of plausible values for the Estimation assumption of normally distributed random errors. Some methods, like
Inference process parameters based on the data and the underlying assumptions Methods Can maximum likelihood, use the distribution of the random errors directly
about the process. Because of the statistical nature of the process, Require to obtain parameter estimates. Even methods that do not use
however, the intervals cannot always be guaranteed to include the true Gaussian distributional methods for parameter estimation directly, like least
process parameters and still be narrow enough to be useful. Instead the Errors squares, often work best for data that are free from extreme random
intervals have a probabilistic interpretation that guarantees coverage of fluctuations. The normal distribution is one of the probability
the true process parameters a specified proportion of the time. In order distributions in which extreme random errors are rare. If some other
for these intervals to truly have their specified probabilistic distribution actually describes the random errors better than the normal
interpretations, the form of the distribution of the random errors must distribution does, then different parameter estimation methods might
be known. Although the form of the probability distribution must be need to be used in order to obtain good estimates of the values of the
known, the parameters of the distribution can be estimated from the unknown parameters in the model.
data.

Of course the random errors from different types of processes could be


described by any one of a wide range of different probability
distributions in general, including the uniform, triangular, double
exponential, binomial and Poisson distributions. With most process
modeling methods, however, inferences about the process are based on
the idea that the random errors are drawn from a normal distribution.
One reason this is done is because the normal distribution often
describes the actual distribution of the random errors in real-world
processes reasonably well. The normal distribution is also used
because the mathematical theory behind it is well-developed and
supports a broad array of inferences on functions of the data relevant
to different types of questions about the process.
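A quick numerical check of the normality assumption can be sketched as follows. The simulated residuals and the choice of the Anderson-Darling test are illustrative; in practice the residuals would come from the fitted model itself, and a normal probability plot serves the same purpose graphically.

    # Sketch: testing residuals for normality (simulated residuals for illustration).
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    residuals = rng.normal(size=50)          # replace with residuals from your fit

    result = stats.anderson(residuals, dist='norm')
    print(result.statistic)
    print(result.critical_values, result.significance_level)

    # A normal probability plot gives the same check graphically, e.g.
    # stats.probplot(residuals, dist='norm', plot=ax) with a matplotlib axis ax.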


This Obtaining data is of course something that is actually done by the


Assumption analyst rather than being a feature of the process itself. This gives the
Relatively analyst some ability to ensure that this assumption will be valid. Paying
4. Process Modeling Controllable careful attention to data collection procedures and employing
4.2. Underlying Assumptions for Process Modeling experimental design principles like randomization of the run order will
4.2.1. What are the typical underlying assumptions in process modeling? yield a sample of data that is as close as possible to being perfectly
randomly sampled from the process. Section 4.3.3 has additional
discussion of some of the principles of good experimental design.
4.2.1.5. The data are randomly sampled
from the process.
Data Must Since the random variation inherent in the process is critical to
Reflect the obtaining satisfactory results from most modeling methods, it is
Process important that the data reflect that random variation in a representative
way. Because of the nearly infinite number of ways non-representative
sampling might be done, however, few, if any, statistical methods
would ever be able to correct for the effects that such sampling would have on the data.
Instead, these methods rely on the assumption that the data will be
representative of the process. This means that if the variation in the data
is not representative of the process, the nature of the deterministic part
of the model, described by the function, , will be incorrect.
This, in turn, is likely to lead to incorrect conclusions being drawn
when the model is used to answer scientific or engineering questions
about the process.

Data Best Given that we can never determine what the actual random errors in a
Reflects the particular data set are, representative samples of data are best obtained
Process Via by randomly sampling data from the process. In a simple random
Unbiased sample, every response from the population(s) being sampled has an
Sampling equal chance of being observed. As a result, while it cannot guarantee
that each sample will be representative of the process, random sampling
does ensure that the act of data collection does not leave behind any
biases in the data, on average. This means that most of the time, over
repeated samples, the data will be representative of the process. In
addition, under random sampling, probability theory can be used to
quantify how often particular modeling procedures will be affected by
relatively extreme variations in the data, allowing us to control the error
rates experienced when answering questions about the process.
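A minimal sketch of the mechanics, with an invented population of one hundred candidate units: a simple random sample gives every unit the same chance of selection, and randomizing the run order of the selected units removes any ordering left over from the sampling step.

    # Sketch: simple random sampling and run-order randomization (illustrative sizes).
    import numpy as np

    rng = np.random.default_rng(3)
    population = np.arange(100)                        # labels of candidate units
    sample = rng.choice(population, size=10, replace=False)
    run_order = rng.permutation(sample)                # order in which to run them
    print(sample)
    print(run_order)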


are [Seber (1989)]

4. Process Modeling
4.2. Underlying Assumptions for Process Modeling
4.2.1. What are the typical underlying assumptions in process modeling?

4.2.1.6. The explanatory variables are


observed without error.
Assumption As discussed earlier in this section, the random errors (the 's) in the
Needed for basic model, with denoting the random error associated with the basic form of
Parameter
Estimation the model,
,
,
must have a mean of zero at each combination of explanatory variable
values to obtain valid estimates of the parameters in the functional part
under all of the usual assumptions (denoted here more carefully than is
of the process model (the 's). Some of the more obvious sources of
random errors with non-zero means include usually necessary), and is a value between and . This
1. drift in the process, extra term in the expression of the random error, ,
2. drift in the measurement system used to obtain the process data, complicates matters because is typically not a constant.
and
For most functions, will depend on the explanatory
3. use of a miscalibrated measurement system.
However, the presence of random errors in the measured values of the variable values and, more importantly, on . This is the source of the
explanatory variables is another, more subtle, source of 's with problem with observing the explanatory variable values with random
non-zero means. error.

Explanatory The values of explanatory variables observed with independent, Correlated Because each of the components of , denoted by , are functions
Variables with
normally distributed random errors, , can be differentiated from their of the components of , similarly denoted by , whenever any of the
Observed
true values using the definition components of simplify to expressions that are not
with Random
Error Add constant, the random variables and will be correlated.
Terms to . This correlation will then usually induce a non-zero mean in the
product .
Then applying the mean value theorem from multivariable calculus
shows that the random errors in a model based on ,


For example, a positive correlation between and means Berkson There is one type of model in which errors in the measurement of the
Model Does explanatory variables do not bias the parameter estimates. The Berkson
that when is large, will also tend to be large. Similarly, Not Depend model [Berkson (1950)] is a model in which the observed values of the
when is small, will also tend to be small. This could on this explanatory variables are directly controlled by the experimenter while
Assumption their true values vary for each observation. The differences between
cause and to always have the same sign, which would the observed and true values for each explanatory variable are assumed
preclude their product having a mean of zero since all of the values of to be independent random variables from a normal distribution with a
would be greater than or equal to zero. A negative mean of zero. In addition, the errors associated with each explanatory
correlation, on the other hand, could mean that these two random variable must be independent of the errors associated with all of the
variables would always have opposite signs, resulting in a negative other explanatory variables, and also independent of the observed
values of each explanatory variable. Finally, the Berkson model
mean for . These examples are extreme, but illustrate requires the functional part of the model to be a straight line, a plane,
how correlation can cause trouble even if both and have or a higher-dimension first-order model in the explanatory variables.
zero means individually. What will happen in any particular modeling When these conditions are all met, the errors in the explanatory
variables can be ignored.
situation will depend on the variability of the 's, the form of the
function, the true values of the 's, and the values of the explanatory Applications for which the Berkson model correctly describes the data
variables. are most often situations in which the experimenter can adjust
equipment settings so that the observed values of the explanatory
Biases Can Even if the 's have zero means, observation of the explanatory variables will be known ahead of time. For example, in a study of the
Affect variables with random error can still bias the parameter estimates. relationship between the temperature used to dry a sample for chemical
Parameter Depending on the method used to estimate the parameters, the analysis and the resulting concentration of a volatile constituent, an
Estimates explanatory variables can be used in the computation of the parameter oven might be used to prepare samples at temperatures of 300 to 500
When Means degrees in 50 degree increments. In reality, however, the true
estimates in ways that keep the 's from canceling out. One
of 's are 0 temperature inside the oven will probably not exactly equal 450
unfortunate example of this phenomenon is the use of least squares to
estimate the parameters of a straight line. In this case, because of the degrees each time that setting is used (or 300 when that setting is used,
simplicity of the model, etc). The Berkson model would apply, though, as long as the errors in
measuring the temperature randomly differed from one another each
time an observed value of 450 degrees was used and the mean of the
, true temperatures over many repeated runs at an oven setting of 450
degrees really was 450 degrees. Then, as long as the model was also a
the term simplifies to . Because this term does not straight line relating the concentration to the observed values of
temperature, the errors in the measurement of temperature would not
involve , it does not induce non-zero means in the 's. With the way bias the estimates of the parameters.
the explanatory variables enter into the formulas for the estimates of
the 's, the random errors in the explanatory variables do not cancel
out on average. This results in parameter estimators that are biased and
will not approach the true parameter values no matter how much data
are collected.

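To make the bias described above concrete, here is a small simulation sketch. The straight-line model, noise levels, and sample sizes are illustrative assumptions; the observed explanatory variable is the true value plus independent random error, and the least squares slope stays biased toward zero no matter how many observations are collected.

    # Sketch: errors in the explanatory variable bias the least squares slope,
    # and the bias does not shrink as the sample size grows (values illustrative).
    import numpy as np

    rng = np.random.default_rng(4)
    beta0, beta1 = 1.0, 2.0

    for n in (50, 5000, 500000):
        x_true = rng.uniform(0.0, 1.0, size=n)
        y = beta0 + beta1 * x_true + rng.normal(scale=0.05, size=n)
        x_obs = x_true + rng.normal(scale=0.2, size=n)   # random error in x

        slope = np.polyfit(x_obs, y, 1)[0]
        print(n, slope)   # remains well below the true slope of 2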

Assumption The validity of this assumption requires careful consideration in


Validity scientific and engineering applications. In these types of applications it
Requires is most often the case that the response variable and the explanatory
Careful variables will all be measured with some random error. Fortunately, 4. Process Modeling
Consideration however, there is also usually some knowledge of the relative amount
of information in the observed values of each variable. This allows a
rough assessment of how much bias there will be in the estimated 4.3. Data Collection for Process Modeling
values of the parameters. As long as the biases in the parameter
estimators have a negligible effect on the intended use of the model,
Collecting This section lays out some general principles for collecting data for
then this assumption can be considered valid from a practical
viewpoint. Section 4.4.4, which covers model validation, points to a Good Data construction of process models. Using well-planned data collection
procedures is often the difference between successful and unsuccessful
discussion of a practical method for checking the validity of this
experiments. In addition, well-designed experiments are often less
assumption.
expensive than those that are less well thought-out, regardless of overall
success or failure.
Specifically, this section will answer the question:
What can the analyst do even prior to collecting the data (that is,
at the experimental design stage) that would allow the analyst to
do an optimal job of modeling the process?

Contents: This section deals with the following five questions:


Section 3 1. What is design of experiments (aka DEX or DOE)?
2. Why is experimental design important for process modeling?
3. What are some general design principles for process modeling?
4. I've heard some people refer to "optimal" designs, shouldn't I use
those?
5. How can I tell if a particular experimental design is good for my
application?


Optimizing In the fourth case, the engineer is interested in determining optimal


settings of the process factors; that is, to determine for each factor
the level of the factor that optimizes the process response.
4. Process Modeling In this section, we focus on case 3: modeling.
4.3. Data Collection for Process Modeling

4.3.1. What is design of experiments (aka


DEX or DOE)?
Systematic Design of experiments (DEX or DOE) is a systematic, rigorous
Approach to approach to engineering problem-solving that applies principles and
Data Collection techniques at the data collection stage so as to ensure the generation
of valid, defensible, and supportable engineering conclusions. In
addition, all of this is carried out under the constraint of a minimal
expenditure of engineering runs, time, and money.

DEX Problem There are 4 general engineering problem areas in which DEX may
Areas be applied:
1. Comparative
2. Screening/Characterizing
3. Modeling
4. Optimizing

Comparative In the first case, the engineer is interested in assessing whether a


change in a single factor has in fact resulted in a
change/improvement to the process as a whole.

Screening In the second case, the engineer is interested in "understanding" the


Characterization process as a whole in the sense that he/she wishes (after design and
analysis) to have in hand a ranked list of important through
unimportant factors (most important to least important) that affect
the process.

Modeling In the third case, the engineer is interested in functionally modeling


the process with the output being a good-fitting (= high predictive
power) mathematical function, and to have good (= maximal
accuracy) estimates of the coefficients in that function.


Least For a given data set (e.g., 10 ( , ) pairs), the most common procedure
Squares for obtaining the coefficients for is the least squares
Criterion
estimation criterion. This criterion yields coefficients with predicted
4. Process Modeling
values that are closest to the raw data in the sense that the sum of the
4.3. Data Collection for Process Modeling
squared differences between the raw data and the predicted values is as
small as possible.
4.3.2. Why is experimental design The overwhelming majority of regression programs today use the least
important for process modeling? squares criterion for estimating the model coefficients. Least squares
estimates are popular because
Output from The output from process modeling is a fitted mathematical function 1. the estimators are statistically optimal (BLUEs: Best Linear
Process with estimated coefficients. For example, in modeling resistivity, , as Unbiased Estimators);
Model is a function of dopant density, , an analyst may suggest the function 2. the estimation algorithm is mathematically tractable, in closed
Fitted form, and therefore easily programmable.
Mathematical How then can this be improved? For a given set of values it cannot
Function be; but frequently the choice of the values is under our control. If we
in which the coefficients to be estimated are , , and . Even for
can select the values, the coefficients will have less variability than if
a given functional form, there is an infinite number of potential the are not controlled.
coefficient values that may be used. Each of these
coefficient values will in turn yield predicted values. Design of As to what values should be used for the 's, we look to established
Experiment experimental design principles for guidance.
What are Poor values of the coefficients are those for which the resulting Principles
Good predicted values are considerably different from the observed raw data
Coefficient . Good values of the coefficients are those for which the resulting Principle 1: The first principle of experimental design is to control the values
Values? predicted values are close to the observed raw data . The best values Minimize within the vector such that after the data are collected, the
of the coefficients are those for which the resulting predicted values are Coefficient subsequent model coefficients are as good, in the sense of having the
close to the observed raw data , and the statistical uncertainty Estimation smallest variation, as possible.
connected with each coefficient is small. Variation
The key underlying point with respect to design of experiments and
process modeling is that even though (for simple ( , ) fitting, for
There are two considerations that are useful for the generation of "best"
coefficients: example) the least squares criterion may yield optimal (minimal
variation) estimators for a given distribution of values, some
1. Least squares criterion distributions of data in the vector may yield better (smaller variation)
2. Design of experiment principles coefficient estimates than other vectors. If the analyst can specify the
values in the vector, then he or she may be able to drastically change
and reduce the noisiness of the subsequent least squares coefficient
estimates.
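The displayed formulas in this subsection (the suggested function for the resistivity example and the least squares criterion) did not survive extraction. As a sketch in the chapter's general notation, with the observed responses y_i and a chosen function f with coefficient vector beta, the least squares criterion selects the coefficient values

    \hat{\vec{\beta}} \;=\; \arg\min_{\vec{\beta}} \sum_{i=1}^{n}
        \left[\, y_i - f(\vec{x}_i;\vec{\beta}) \,\right]^{2} .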


Five Designs To see the effect of experimental design on process modeling, consider Therefore to obtain minimum variance estimators, one maximizes the
the following simplest case of fitting a line: denominator on the right. To maximize the denominator, it is (for an
arbitrarily fixed ), best to position the 's as far away from as
possible. This is done by positioning half of the 's at the lower
Suppose the analyst can afford 10 observations (that is, 10 ( , ) pairs) extreme and the other half at the upper extreme. This is design #3
for the purpose of determining optimal (that is, minimal variation) above, and this "dumbbell" design (half low and half high) is in fact the
estimators of and . What 10 values should be used for the best possible design for fitting a line. Upon reflection, this is intuitively
arrived at by the adage that "2 points define a line", and so it makes the
purpose of collecting the corresponding 10 values? Colloquially, most sense to determine those 2 points as far apart as possible (at the
where should the 10 values be sprinkled along the horizontal axis so extremes) and as well as possible (having half the data at each
as to minimize the variation of the least squares estimated coefficients extreme). Hence the design of experiment solution to process modeling
for and ? Should the 10 values be: when the model is a line is the "dumbbell" design--half the X's at each
1. ten equi-spaced values across the range of interest? extreme.
2. five replicated equi-spaced values across the range of interest?
What is the What is the worst design in the above case? Of the five designs, the
3. five values at the minimum of the range and five values at the Worst worst design is the one that has maximum variation. In the
maximum of the range? Design? mathematical expression above, it is the one that minimizes the
4. one value at the minimum, eight values at the mid-range, and denominator, and so this is design #4 above, for which almost all of the
one value at the maximum? data are located at the mid-range. Clearly the estimated line in this case
5. four values at the minimum, two values at mid-range, and four is going to chase the solitary point at each end and so the resulting
values at the maximum? linear fit is intuitively inferior.
or (in terms of "quality" of the resulting estimates for and )
Designs 1, 2, What about the other 3 designs? Designs 1, 2, and 5 are useful only for
perhaps it doesn't make any difference?
and 5 the case when we think the model may be linear, but we are not sure,
For each of the above five experimental designs, there will of course be and so we allow additional points that permit fitting a line if
data collected, followed by the generation of least squares estimates appropriate, but build into the design the "capacity" to fit beyond a line
for and , and so each design will in turn yield a fitted line. (e.g., quadratic, cubic, etc.) if necessary. In this regard, the ordering of
the designs would be
● design 5 (if our worst-case model is quadratic),
Are the Fitted But are the fitted lines, i.e., the fitted process models, better for some
Lines Better designs than for others? Are the coefficient estimator variances smaller ● design 2 (if our worst-case model is quartic)

for Some for some designs than for others? For given estimates, are the resulting ● design 1 (if our worst-case model is quintic and beyond)
Designs? predicted values better (that is, closer to the observed values) than for
other designs? The answer to all of the above is YES. It DOES make a
difference.
The most popular answer to the above question about which design to
use for linear modeling is design #1 with ten equi-spaced points. It can
be shown, however, that the variance of the estimated slope parameter
depends on the design according to the relationship
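The displayed relationship did not survive extraction. For the straight-line model fit by least squares, the standard result (given here as a reconstruction, not a quotation of the original display) is

    \mathrm{Var}\!\left(\hat{\beta}_{1}\right) \;=\;
        \frac{\sigma^{2}}{\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^{2}} ,

so that, as the discussion on the next page notes, minimizing the variance of the estimated slope amounts to maximizing the denominator, the spread of the x values about their mean.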


Minimum For a given model, make sure the design has the property of
Variance of minimizing the variation of the least squares estimated coefficients.
Coefficient This is a general principle that is always in effect but which in
4. Process Modeling Estimators practice is hard to implement for many models beyond the simpler
4.3. Data Collection for Process Modeling 1-factor models. For more complicated 1-factor
models, and for most multi-factor models, the
4.3.3. What are some general design expressions for the variance of the least squares estimators, although
available, are complicated and assume more than the analyst typically
principles for process modeling? knows. The net result is that this principle, though important, is harder
to apply beyond the simple cases.
Experimental There are six principles of experimental design as applied to process
Design modeling: Sample Where Regardless of the simplicity or complexity of the model, there are
Principles 1. Capacity for Primary Model the Variation situations in which certain regions of the curve are noisier than others.
Applied to Is (Non A simple case is when there is a linear relationship between and
2. Capacity for Alternative Model
Process Constant but the recording device is proportional rather than absolute and so
Modeling 3. Minimum Variance of Coefficient Estimators Variance larger values of are intrinsically noisier than smaller values of . In
4. Sample where the Variation Is Case) such cases, sampling where the variation is means to have more
5. Replication replicated points in those regions that are noisier. The practical
6. Randomization answer to how many such replicated points there should be is
We discuss each in detail below.

Capacity for For your best-guess model, make sure that the design has the capacity
Primary for estimating the coefficients of that model. For a simple example of with denoting the theoretical standard deviation for that given
Model this, if you are fitting a quadratic model, then make sure you have at region of the curve. Usually is estimated by a-priori guesses for
least three distinct horizontal axis points.
what the local standard deviations are.
Capacity for If your best-guess model happens to be inadequate, make sure that the
Sample Where
Alternative design has the capacity to estimate the coefficients of your best-guess A common occurrence for non-linear models is for some regions of the
the Variation
Model back-up alternative model (which means implicitly that you should curve to be steeper than others. For example, in fitting an exponential
Is (Steep
have already identified such a model). For a simple example, if you model (small corresponding to large , and large corresponding
Curve Case)
suspect (but are not positive) that a linear model is appropriate, then it to small ) it is often the case that the data in the steep region are
is best to employ a globally robust design (say, four points at each intrinsically noisier than the data in the relatively flat regions. The
extreme and three points in the middle, for a ten-point design) as reason for this is that commonly the values themselves have a bit of
opposed to the locally optimal design (such as five points at each noise and this -noise gets translated into larger -noise in the steep
extreme). The locally optimal design will provide a best fit to the line,
sections than in the shallow sections. In such cases, when we know
but have no capacity to fit a quadratic. The globally robust design will
the shape of the response curve well enough to identify
provide a good (though not optimal) fit to the line and additionally
steep-versus-shallow regions, it is often a good idea to sample more
provide a good (though not optimal) fit to the quadratic.
heavily in the steep regions than in the shallow regions. A practical
rule-of-thumb for where to position the values in such situations is
to
1. sketch out your best guess for what the resulting curve will be;


2. partition the vertical (that is the ) axis into equi-spaced


points (with denoting the total number of data points that you
can afford);
3. draw horizontal lines from each vertical axis point to where it
hits the sketched-in curve. 4. Process Modeling
4. drop a vertical projection line from the curve intersection point 4.3. Data Collection for Process Modeling
to the horizontal axis.
These will be the recommended values to use in the design.
4.3.4. I've heard some people refer to
The above rough procedure for an exponentially decreasing curve
would thus yield a logarithmic preponderance of points in the steep
"optimal" designs, shouldn't I use
region of the curve and relatively few points in the flatter part of the those?
curve.
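The rule-of-thumb above can be sketched in code. The guessed exponential curve, its rate constant, the x range, and the ten-point budget below are illustrative assumptions; the essential step is spacing the points equally on the vertical axis of the guessed curve and projecting back to the horizontal axis.

    # Sketch of the equi-spaced-in-y rule for an exponentially decreasing curve
    # (the guessed curve and ranges are assumptions, not handbook values).
    import numpy as np

    def guessed_curve(x):
        return np.exp(-3.0 * x)          # best guess for the response curve

    def inverse_curve(y):
        return -np.log(y) / 3.0          # closed-form inverse of the guess

    n = 10                               # number of points that can be afforded
    y_lo, y_hi = guessed_curve(1.0), guessed_curve(0.0)    # curve over x in [0, 1]
    y_levels = np.linspace(y_lo, y_hi, n)                  # equi-spaced on the y axis
    x_design = np.sort(inverse_curve(y_levels))            # projected x values
    print(np.round(x_design, 3))         # points crowd into the steep region near x = 0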
Classical The most heavily used designs in industry are the "classical designs"
Replication If affordable, replication should be part of every design. Replication
Designs Heavily (full factorial designs, fractional factorial designs, Latin square
allows us to compute a model-independent estimate of the process
Used in Industry designs, Box-Behnken designs, etc.). They are so heavily used
standard deviation. Such an estimate may then be used as a criterion
because they are optimal in their own right and have served superbly
in an objective lack-of-fit test to assess whether a given model is
well in providing efficient insight into the underlying structure of
adequate. Such an objective lack-of-fit F-test can be employed only if industrial processes.
the design has built-in replication. Some replication is essential;
replication at every point is ideal.
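For reference, the replication-based lack-of-fit F-test mentioned above has the standard form sketched below; the notation (n observations, c distinct design points, p model parameters) is supplied here for illustration rather than taken from this page.

    F \;=\; \frac{SS_{\mathrm{lack\text{-}of\text{-}fit}} \,/\, (c - p)}
                 {SS_{\mathrm{pure\ error}} \,/\, (n - c)} ,

where the pure-error sum of squares is computed from the replicated points only, which is why replication is required for the test.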
Reasons Cases do arise, however, for which the tabulated classical designs do
Classical not cover a particular practical situation. That is, user constraints
Randomization Just because the 's have some natural ordering does not mean that Designs May preclude the use of tabulated classical designs because such classical
the data should be collected in the same order as the 's. Some aspect Not Work designs do not accommodate user constraints. Such constraints
of randomization should enter into every experiment, and experiments
include:
for process modeling are no exception. Thus if you are sampling ten
points on a curve, the ten values should not be collected by 1. Limited maximum number of runs:
sequentially stepping through the values from the smallest to the User constraints in budget and time may dictate a maximum
largest. If you do so, and if some extraneous drifting or wear occurs in allowable number of runs that is too small or too "irregular"
the machine, the operator, the environment, the measuring device, (e.g., "13") to be accommodated by classical designs--even
etc., then that drift will unwittingly contaminate the values and in fractional factorial designs.
turn contaminate the final fit. To minimize the effect of such potential 2. Impossible factor combinations:
drift, it is best to randomize (use random number tables) the sequence
of the values. This will not make the drift go away, but it will The user may have some factor combinations that are
spread the drift effect evenly over the entire curve, realistically impossible to run. Such combinations may at times be
inflating the variation of the fitted values, and providing some specified (to maintain balance and orthogonality) as part of a
mechanism after the fact (at the residual analysis model validation recommended classical design. If the user simply omits this
stage) for uncovering or discovering such a drift. If you do not impossible run from the design, the net effect may be a
randomize the run sequence, you give up your ability to detect such a reduction in the quality and optimality of the classical design.
drift if it occurs. 3. Too many levels:
The number of factors and/or the number of levels of some
factors intended for use may not be included in tabulations of
classical designs.


4. Complicated underlying model: engineering conclusions will be flawed and invalid. Hence one price
for obtaining an in-hand generated design is the designation of a
The user may be assuming an underlying model that is too
model. All optimal designs need a model; without a model, the
complicated (or too non-linear), so that classical designs
optimal design-generation methodology cannot be used, and general
would be inappropriate.
design principles must be reverted to.
What to Do If If user constraints are such that classical designs do not exist to
Need 2: a The other price for using optimal design methodology is a
Classical accommodate such constraints, then what is the user to do?
Candidate Set of user-specified set of candidate points. Optimal designs will not
Designs Do Not
The previous section's list of design criteria (capability for the Points generate the best design points from some continuous region--that is
Exist?
primary model, capability for the alternate model, minimum too much to ask of the mathematics. Optimal designs will generate
variation of estimated coefficients, etc.) is a good passive target to the best subset of points from a larger superset of candidate
aim for in terms of desirable design properties, but provides little points. The user must specify this candidate set of points. Most
help in terms of an active formal construction methodology for commonly, the superset of candidate points is the full factorial
generating a design. design over a fine-enough grid of the factor space with which the
analyst is comfortable. If the grid is too fine, and the resulting
Common To satisfy this need, an "optimal design" methodology has been superset overly large, then the optimal design methodology may
Optimality developed to generate a design when user constraints preclude the prove computationally challenging.
Criteria use of tabulated classical designs. Optimal designs may be optimal
in many different ways, and what may be an optimal design Optimal The optimal design-generation methodology is computationally
according to one criterion may be suboptimal for other criteria. Designs are intensive. Some of the designs (e.g., D-optimal) are better than other
Competing criteria have led to a literal alphabet-soup collection of Computationally designs (such as A-optimal and G-optimal) in regard to efficiency of
optimal design methodologies. The four most popular ingredients in Intensive the underlying search algorithm. Like most mathematical
that "soup" are: optimization techniques, there is no iron-clad guarantee that the
result from the optimal design methodology is in fact the true
D-optimal designs: minimize the generalized variance of the optimum. However, the results are usually satisfactory from a
parameter estimators. practical point of view, and are far superior to any ad hoc designs.
A-optimal designs: minimize the average variance of the parameter
estimators. For further details about optimal designs, the analyst is referred to
G-optimal designs: minimize the maximum variance of the Montgomery (2001).
predicted values.
V-optimal designs: minimize the average variance of the predicted
values.
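For the linear-model case these four criteria can be sketched as follows, with X the design matrix built from the candidate design and C the set of candidate points; the matrix notation is an illustrative assumption, since the page above states the criteria only in words.

    \text{D-optimal:}\;\; \min_{X}\ \det\!\big[(X^{T}X)^{-1}\big] \qquad
    \text{A-optimal:}\;\; \min_{X}\ \mathrm{tr}\!\big[(X^{T}X)^{-1}\big]
    \text{G-optimal:}\;\; \min_{X}\ \max_{\vec{x}\in\mathcal{C}}\ \vec{x}^{T}(X^{T}X)^{-1}\vec{x} \qquad
    \text{V-optimal:}\;\; \min_{X}\ \operatorname*{ave}_{\vec{x}\in\mathcal{C}}\ \vec{x}^{T}(X^{T}X)^{-1}\vec{x}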

Need 1: a Model The motivation for optimal designs is the practical constraints that
the user has. The advantage of optimal designs is that they do
provide a reasonable design-generating methodology when no other
mechanism exists. The disadvantage of optimal designs is that they
require a model from the user. The user may not have this model.
All optimal designs are model-dependent, and so the quality of the
final engineering conclusions that result from the ensuing design,
data, and analysis is dependent on the correctness of the analyst's
assumed model. For example, if the responses from a particular
process are actually being drawn from a cubic model and the analyst
assumes a linear model and uses the corresponding optimal design
to generate data and perform the data analysis, then the final


Graphically If you have a design that is purported to be globally good in k factors,


Check for then generally that design should be locally good in all pairs of the
Bivariate individual k factors. Graphically check for such 2-way balance by
4. Process Modeling Balance generating plots for all pairs of factors, where the horizontal axis of a
4.3. Data Collection for Process Modeling given plot is and the vertical axis is . The response variable does
NOT come into play in these plots. We are only interested in
characteristics of the design, and so only the variables are involved.
4.3.5. How can I tell if a particular The 2-way plots of most good designs have a certain symmetric and
balanced look about them--all combination points should be covered
experimental design is good for my and each combination point should have about the same number of
application? points.

Check for For optimal designs, metrics exist (D-efficiency, A-efficiency, etc.) that
Assess If you have a design, generated by whatever method, in hand, how can
Minimal can be computed and that reflect the quality of the design. Further,
Relative to you assess its after-the-fact goodness? Such checks can potentially
Variation relative ratios of standard deviations of the coefficient estimators and
the Six parallel the list of the six general design principles. The design can be
relative ratios of predicted values can be computed and compared for
Design assessed relative to each of these six principles. For example, does it such designs. Such calculations are commonly performed in computer
Principles have capacity for the primary model, does it have capacity for an packages which specialize in the generation of optimal designs.
alternative model, etc.

Some of these checks are quantitative and complicated; other checks


are simpler and graphical. The graphical checks are the most easily
done and yet are among the most informative. We include two such
graphical checks and one quantitative check.

Graphically If you have a design that claims to be globally good in k factors, then
Check for generally that design should be locally good in each of the individual k
Univariate factors. Checking high-dimensional global goodness is difficult, but
Balance checking low-dimensional local goodness is easy. Generate k counts
plots, with the levels of factors plotted on the horizontal axis of each
plot and the number of design points for each level in factor on the
vertical axis. For most good designs, these counts should be about the
same (= balance) for all levels of a factor. Exceptions exist, but such
balance is a low-level characteristic of most good designs.
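The counts behind such plots are easy to tabulate; the small two-factor design below is an invented illustration, not a recommended design.

    # Sketch: univariate and 2-way balance counts for a design matrix
    # (the design below is an illustrative 2-factor example).
    import numpy as np

    design = np.array([
        [-1, -1], [-1, +1], [+1, -1], [+1, +1],
        [-1, -1], [-1, +1], [+1, -1], [+1, +1],
    ])

    for j in range(design.shape[1]):
        levels, counts = np.unique(design[:, j], return_counts=True)
        print(f"factor {j}:", dict(zip(levels.tolist(), counts.tolist())))

    combos, combo_counts = np.unique(design, axis=0, return_counts=True)
    print("2-way combinations:", list(zip(combos.tolist(), combo_counts.tolist())))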


5. If my current model does not fit the data well, how can I improve
it?
1. Updating the Function Based on Residual Plots
2. Accounting for Non-Constant Variation Across the Data
4. Process Modeling
3. Accounting for Errors with a Non-Normal Distribution

4.4. Data Analysis for Process Modeling


Building a This section contains detailed discussions of the necessary steps for
Good Model developing a good process model after data have been collected. A
general model-building framework, applicable to multiple statistical
methods, is described with method-specific points included when
necessary.

Contents: 1. What are the basic steps for developing an effective process
Section 4 model?
2. How do I select a function to describe my process?
1. Incorporating Scientific Knowledge into Function Selection
2. Using the Data to Select an Appropriate Function
3. Using Methods that Do Not Require Function Specification
3. How are estimates of the unknown parameters obtained?
1. Least Squares
2. Weighted Least Squares
4. How can I tell if a model fits my data?
1. How can I assess the sufficiency of the functional part of
the model?
2. How can I detect non-constant variation across the data?
3. How can I tell if there was drift in the measurement
process?
4. How can I assess whether the random errors are
independent from one to the next?
5. How can I test whether or not the random errors are
normally distributed?
6. How can I test whether any significant terms are missing or
misspecified in the functional part of the model?
7. How can I test whether all of the terms in the functional
part of the model are necessary?


4. Process Modeling
4.4. Data Analysis for Process Modeling

4.4.1. What are the basic steps for developing an


effective process model?
Basic Steps The basic steps used for model-building are the same across all modeling methods. The
Provide details vary somewhat from method to method, but an understanding of the common steps,
Universal combined with the typical underlying assumptions needed for the analysis, provides a
Framework framework in which the results from almost any method can be interpreted and understood.

Basic Steps The basic steps of the model-building process are:


of Model 1. model selection
Building
2. model fitting, and
3. model validation.
These three basic steps are used iteratively until an appropriate model for the data has been
developed. In the model selection step, plots of the data, process knowledge and
assumptions about the process are used to determine the form of the model to be fit to the
data. Then, using the selected model and possibly information about the data, an
appropriate model-fitting method is used to estimate the unknown parameters in the model.
When the parameter estimates have been made, the model is then carefully assessed to see
if the underlying assumptions of the analysis appear plausible. If the assumptions seem
valid, the model can be used to answer the scientific or engineering questions that prompted
the modeling effort. If the model validation identifies problems with the current model,
however, then the modeling process is repeated using information from the model Model
validation step to select and/or fit an improved model. Building
Sequence
A The three basic steps of process modeling described in the paragraph above assume that the
Variation data have already been collected and that the same data set can be used to fit all of the
on the candidate models. Although this is often the case in model-building situations, one variation
Basic Steps on the basic model-building sequence comes up when additional data are needed to fit a
newly hypothesized model based on a model fit to the initial data. In this case two
additional steps, experimental design and data collection, can be added to the basic
sequence between model selection and model-fitting. The flow chart below shows the basic
model-fitting sequence with the integration of the related data collection steps into the
model-building process.


4. Process Modeling
4.4. Data Analysis for Process Modeling

4.4.2. How do I select a function to describe


my process?
Synthesis of Selecting a model of the right form to fit a set of data usually requires
Process the use of empirical evidence in the data, knowledge of the process and
Information some trial-and-error experimentation. As mentioned on the previous
Examples illustrating the model-building sequence in real applications can be found in the
Necessary page, model building is always an iterative process. Much of the need to
case studies in Section 4.6. The specific tools and techniques used in the basic
iterate stems from the difficulty in initially selecting a function that
model-building steps are described in the remainder of this section.
describes the data well. Details about the data are often not easily visible
in the data as originally observed. The fine structure in the data can
Design of Of course, considering the model selection and fitting before collecting the initial data is
Initial also a good idea. Without data in hand, a hypothesis about what the data will look like is
usually only be elicited by use of model-building tools such as residual
Experiment needed in order to guess what the initial model should be. Hypothesizing the outcome of an plots and repeated refinement of the model form. As a result, it is
experiment is not always possible, of course, but efforts made in the earliest stages of a important not to overlook any of the sources of information that indicate
project often maximize the efficiency of the whole model-building process and result in the what the form of the model should be.
best possible models for the process. More details about experimental design can be found
in Section 4.3 and in Chapter 5: Process Improvement. Answer Not Sometimes the different sources of information that need to be
Provided by integrated to find an effective model will be contradictory. An open
Statistics mind and a willingness to think about what the data are saying is
Alone important. Maintaining balance and looking for alternate sources for
unusual effects found in the data are also important. For example, in the
load cell calibration case study the statistical analysis pointed out that
the model initially thought to be appropriate did not account for all of
the structure in the data. A refined model was developed, but the
appearance of an unexpected result brings up the question of whether
the original understanding of the problem was inaccurate, or whether the
need for an alternate model was due to experimental artifacts. In the
load cell problem it was easy to accept that the refined model was closer
to the truth, but in a more complicated case additional experiments
might have been needed to resolve the issue.


Knowing Function Types Helps: Another helpful ingredient in model selection is a wide knowledge of the shapes that different mathematical functions can assume. Knowing something about the models that have been found to work well in the past for different application types also helps. A menu of different functions on the next page, Section 4.4.2.1. (links provided below), provides one way to learn about the function shapes and flexibility. Section 4.4.2.2. discusses how general function features and qualitative scientific information can be combined to help with model selection. Finally, Section 4.4.2.3. points to methods that don't require specification of a particular function to be fit to the data, and how models of those types can be refined.

1. Incorporating Scientific Knowledge into Function Selection
2. Using the Data to Select an Appropriate Function
3. Using Methods that Do Not Require Function Specification

4.4.2.1. Incorporating Scientific Knowledge into Function Selection

Choose Functions Whose Properties Match the Process: Incorporating scientific knowledge into selection of the function used in a process model is clearly critical to the success of the model. When a scientific theory describing the mechanics of a physical system can provide a complete functional form for the process, then that type of function makes an ideal starting point for model development. There are many cases, however, for which there is incomplete scientific information available. In these cases it is considerably less clear how to specify a functional form to initiate the modeling process. A practical approach is to choose the simplest possible functions that have properties ascribed to the process.

Example: Concrete Strength Versus Curing Time: For example, if you are modeling concrete strength as a function of curing time, scientific knowledge of the process indicates that the strength will increase rapidly at first, but then level off as the hydration reaction progresses and the reactants are converted to their new physical form. The leveling off of the strength occurs because the speed of the reaction slows down as the reactants are converted and unreacted materials are less likely to be in proximity all of the time. In theory, the reaction will actually stop altogether when the reactants are fully hydrated and are completely consumed. However, a full stop of the reaction is unlikely in reality because there is always some unreacted material remaining that reacts increasingly slowly. As a result, the process will approach an asymptote at its final strength.


Polynomial Models for Concrete Strength Deficient: Considering this general scientific information, modeling this process using a straight line would not reflect the physical aspects of this process very well. For example, using the straight-line model, the concrete strength would be predicted to continue increasing at the same rate over its entire lifetime, though we know that is not how it behaves. The fact that the response variable in a straight-line model is unbounded as the predictor variable becomes extreme is another indication that the straight-line model is not realistic for concrete strength. In fact, this relationship between the response and predictor as the predictor becomes extreme is common to all polynomial models, so even a higher-degree polynomial would probably not make a good model for describing concrete strength. A higher-degree polynomial might be able to curve toward the data as the strength leveled off, but it would eventually have to diverge from the data because of its mathematical properties.

Rational Function Accommodates Scientific Knowledge about Concrete Strength: A more reasonable function for modeling this process might be a rational function. A rational function, which is a ratio of two polynomials of the same predictor variable, approaches an asymptote if the degrees of the polynomials in the numerator and denominator are the same. It is still a very simple model, although it is nonlinear in the unknown parameters. Even if a rational function does not ultimately prove to fit the data well, it makes a good starting point for the modeling process because it incorporates the general scientific knowledge we have of the process, without being overly complicated. Within the family of rational functions, the simplest model is the "linear over linear" rational function, $f(x;\vec{\beta}) = (\beta_0 + \beta_1 x)/(1 + \beta_2 x)$, so this would probably be the best model with which to start. If the linear-over-linear model is not adequate, then the initial fit can be followed up using a higher-degree rational function, or some other type of model that also has a horizontal asymptote.

Focus on the Region of Interest: Although the concrete strength example makes a good case for incorporating scientific knowledge into the model, it is not necessarily a good idea to force a process model to follow all of the physical properties that the process must follow. At first glance it seems like incorporating physical properties into a process model could only improve it; however, incorporating properties that occur outside the region of interest for a particular application can actually sacrifice the accuracy of the model "where it counts" for increased accuracy where it isn't important. As a result, physical properties should only be incorporated into process models when they directly affect the process in the range of the data used to fit the model or in the region in which the model will be used.

Information on Function Shapes: In order to translate general process properties into mathematical functions whose forms may be useful for model development, it is necessary to know the different shapes that various mathematical functions can assume. Unfortunately there is no easy, systematic way to obtain this information. Families of mathematical functions, like polynomials or rational functions, can assume quite different shapes that depend on the parameter values that distinguish one member of the family from another. Because of the wide range of potential shapes these functions may have, even determining and listing the general properties of relatively simple families of functions can be complicated. Section 8 of this chapter gives some of the properties of a short list of simple functions that are often useful for process modeling. Another reference that may be useful is the Handbook of Mathematical Functions by Abramowitz and Stegun [1964]. The Digital Library of Mathematical Functions, an electronic successor to the Handbook of Mathematical Functions that is under development at NIST, may also be helpful.
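To make the linear-over-linear idea concrete, here is a minimal Python sketch that fits this rational function with scipy.optimize.curve_fit; the curing-time and strength values and the starting guesses are hypothetical, chosen only to illustrate the shape of such a fit.

    import numpy as np
    from scipy.optimize import curve_fit

    # Linear-over-linear rational function: approaches the horizontal
    # asymptote b1/b2 as x (curing time) grows large.
    def lin_over_lin(x, b0, b1, b2):
        return (b0 + b1 * x) / (1.0 + b2 * x)

    # Hypothetical curing times (days) and strength measurements.
    time = np.array([1.0, 3.0, 7.0, 14.0, 28.0, 56.0, 90.0])
    strength = np.array([8.0, 15.5, 22.1, 27.3, 31.0, 33.2, 34.1])

    # Starting values matter for nonlinear fits; crude guesses often suffice.
    popt, pcov = curve_fit(lin_over_lin, time, strength, p0=[5.0, 5.0, 0.1])
    print("estimated parameters:", popt)
    print("estimated asymptote :", popt[1] / popt[2])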


4.4.2.2. Using the Data to Select an Appropriate Function

Plot the Data: The best way to select an initial model is to plot the data. Even if you have a good idea of what the form of the regression function will be, plotting allows a preliminary check of the underlying assumptions required for the model fitting to succeed. Looking at the data also often provides other insights about the process or the methods of data collection that cannot easily be obtained from numerical summaries of the data alone.

Example: The data from the Pressure/Temperature example is plotted below. From the plot it looks like a straight-line model will fit the data well. This is as expected based on Charles' Law. In this case there are no signs of any problems with the process or data collection.

Straight-Line Model Looks Appropriate

Start with Least Complex Functions First: A key point when selecting a model is to start with the simplest function that looks as though it will describe the structure in the data. Complex models are fine if required, but they should not be used unnecessarily. Fitting models that are more complex than necessary means that random noise in the data will be modeled as deterministic structure. This will unnecessarily reduce the amount of data available for estimation of the residual standard deviation, potentially increasing the uncertainties of the results obtained when the model is used to answer engineering or scientific questions. Fortunately, many physical systems can be modeled well with straight-line, polynomial, or simple nonlinear functions.

Quadratic Polynomial a Good Starting Point

Developing Models in Higher Dimensions: When the function describing the deterministic variability in the response variable depends on several predictor (input) variables, it can be difficult to see how the different variables relate to one another. One way to tackle this problem that often proves useful is to plot cross-sections of the data and build up a function one dimension at a time. This approach will often shed more light on the relationships between the different predictor variables and the response than plots that lump different levels of one or more predictor variables together on plots of the response variable versus another predictor variable.


Polymer Relaxation Example: For example, materials scientists are interested in how cylindrical polymer samples that have been twisted by a fixed amount relax over time. They are also interested in finding out how temperature may affect this process. As a result, both time and temperature are thought to be important factors for describing the systematic variation in the relaxation data plotted below. When the torque is plotted against time, however, the nature of the relationship is not clearly shown. Similarly, when torque is plotted versus the temperature the effect of temperature is also unclear. The difficulty in interpreting these plots arises because the plot of torque versus time includes data for several different temperatures and the plot of torque versus temperature includes data observed at different times. If both temperature and time are necessary parts of the function that describes the data, these plots are collapsing what really should be displayed as a three-dimensional surface onto a two-dimensional plot, muddying the picture of the data.

Polymer Relaxation Data

Multiplots Reveal Structure: If cross-sections of the data are plotted in multiple plots instead of lumping different explanatory variable values together, the relationships between the variables can become much clearer. Each cross-sectional plot below shows the relationship between torque and time for a particular temperature. Now the relationship between torque and time for each temperature is clear. It is also easy to see that the relationship differs for different temperatures. At a temperature of 25 degrees there is a sharp drop in torque between 0 and 20 minutes and then the relaxation slows. At a temperature of 75 degrees, however, the relaxation drops at a rate that is nearly constant over the whole experimental time period. The fact that the profiles of torque versus time vary with temperature confirms that any functional description of the polymer relaxation process will need to include temperature.

Cross-Sections of the Data
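The kind of cross-sectional display described above can be produced with a few lines of code. The sketch below assumes a hypothetical file polymer_relaxation.csv with columns named torque, time, and temperature, and draws one torque-versus-time panel per temperature.

    import pandas as pd
    import matplotlib.pyplot as plt

    # Hypothetical data file with columns "torque", "time", and "temperature".
    df = pd.read_csv("polymer_relaxation.csv")

    # One panel per temperature: each cross-section shows torque versus time
    # for a single temperature, so the time effect is not hidden by pooling
    # several temperatures on one plot.
    temps = sorted(df["temperature"].unique())
    fig, axes = plt.subplots(2, 3, figsize=(9, 6), sharex=True, sharey=True)
    for ax, temp in zip(axes.ravel(), temps):
        subset = df[df["temperature"] == temp]
        ax.plot(subset["time"], subset["torque"], "o")
        ax.set_title(f"{temp} degrees")
        ax.set_xlabel("time")
        ax.set_ylabel("torque")
    plt.tight_layout()
    plt.show()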


Cross-Sectional Models Provide Further Insight: Further insight into the appropriate function to use can be obtained by separately modeling each cross-section of the data and then relating the individual models to one another. Fitting the accepted stretched exponential relationship between torque and time to each cross-section of the polymer data and then examining plots of the estimated parameters versus temperature roughly indicates how temperature should be incorporated into a model of the polymer relaxation data. The individual stretched exponentials fit to each cross-section of the data are shown in the plot above as solid curves through the data. Plots of the estimated values of each of the four parameters in the stretched exponential versus temperature are shown below.

Cross-Section Parameters vs. Temperature

The solid line near the center of each plot of the cross-sectional parameters from the stretched exponential is the mean of the estimated parameter values across all six levels of temperature. The dashed lines above and below the solid reference line provide approximate bounds on how much the parameter estimates could vary due to random variation in the data. These bounds are based on the typical value of the standard deviations of the estimates from each individual stretched exponential fit. From these plots it is clear that the values of only one of the parameters significantly differ from one another across the temperature range. In addition, there is a clear increasing trend in the estimates of that parameter. For each of the other parameters, the estimate at each temperature falls within the uncertainty bounds and no clear structure is visible.


Based on the plot of estimated values above, augmenting the corresponding term in the standard stretched exponential so that the new denominator is quadratic in temperature should provide a good starting model for the polymer relaxation process. The choice of a quadratic in temperature is suggested by the slight curvature in the plot of the individually estimated parameter values. The resulting model is the stretched exponential in time with that parameter replaced by a quadratic function of temperature.

4.4.2.3. Using Methods that Do Not Require Function Specification

Functional Form Not Needed, but Some Input Required: Although many modern regression methods, like LOESS, do not require the user to specify a single type of function to fit the entire data set, some initial information still usually needs to be provided by the user. Because most of these types of regression methods fit a series of simple local models to the data, one quantity that usually must be specified is the size of the neighborhood each simple function will describe. This type of parameter is usually called the bandwidth or smoothing parameter for the method. For some methods the form of the simple functions must also be specified, while for others the functional form is a fixed property of the method.

Input Parameters Control Function Shape: The smoothing parameter controls how flexible the functional part of the model will be. This, in turn, controls how closely the function will fit the data, just as the choice of a straight line or a polynomial of higher degree determines how closely a traditional regression model will track the deterministic structure in a set of data. The exact information that must be specified in order to fit the regression function to the data will vary from method to method. Some methods may require other user-specified parameters, in addition to a smoothing parameter, to fit the regression function. However, the purpose of the user-supplied information is similar for all methods.

Starting Simple is Still Best: As for more traditional methods of regression, simple regression functions are better than complicated ones in local regression. The complexity of a regression function can be gauged by its potential to track the data. With traditional modeling methods, in which a global function that describes the data is given explicitly, it is relatively easy to differentiate between simple and complicated models. With local regression methods, on the other hand, it can sometimes be difficult to tell how simple a particular regression function actually is based on the inputs to the procedure. This is because of the different ways of specifying local functions, the effects of changes in the smoothing parameter, and the relationships between the different inputs. Generally, however, any local functions should be as simple as possible and the smoothing parameter should be set so that each local function is fit to a large subset of the data. For example, if the method offers a choice of local functions, a straight line would typically be a better starting point than a higher-order polynomial or a statistically nonlinear function.

Function Specification for LOESS: To use LOESS, the user must specify the degree, d, of the local polynomial to be fit to the data, and the fraction of the data, q, to be used in each fit. In this case, the simplest possible initial function specification is d=1 and q=1. While it is relatively easy to understand how the degree of the local polynomial affects the simplicity of the initial model, it is not as easy to determine how the smoothing parameter affects the function. However, plots of the data from the computational example of LOESS in Section 1 with four potential choices of the initial regression function show that the simplest LOESS function, with d=1 and q=1, is too simple to capture much of the structure in the data.
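As a rough illustration of how the smoothing fraction q changes this kind of fit, the sketch below uses the lowess smoother from statsmodels, which fits local straight lines (so it corresponds to d=1) with the fraction of the data used in each local fit playing the role of q; the simulated data are only a stand-in for the handbook's computational example.

    import numpy as np
    from statsmodels.nonparametric.smoothers_lowess import lowess

    # Hypothetical predictor x and noisy response y.
    x = np.linspace(0.0, 10.0, 100)
    y = np.sin(x) + np.random.default_rng(0).normal(scale=0.2, size=x.size)

    # Larger fractions give smoother, less flexible fits; smaller fractions
    # let the local lines follow finer structure in the data.
    for q in (1.0, 0.5, 0.3, 0.1):
        smoothed = lowess(y, x, frac=q, return_sorted=True)
        print(f"q = {q}: fitted value at smallest x = {smoothed[0, 1]:.3f}")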


LOESS Regression Functions with Different Initial Parameter Specifications

Experience Suggests Good Values to Use: Although the simplest possible LOESS function is not flexible enough to describe the data well, any of the other functions shown in the figure would be reasonable choices. All of the latter functions track the data well enough to allow assessment of the different assumptions that need to be checked before deciding that the model really describes the data well. None of these functions is probably exactly right, but they all provide a good enough fit to serve as a starting point for model refinement. The fact that there are several LOESS functions that are similar indicates that additional information is needed to determine the best of these functions. Although it is debatable, experience indicates that it is probably best to keep the initial function simple and set the smoothing parameter so each local function is fit to a relatively small subset of the data. Accepting this principle, the best of these initial models is the one in the upper right corner of the figure with d=1 and q=0.5.

4.4.3. How are estimates of the unknown parameters obtained?

Parameter Estimation in General: After selecting the basic form of the functional part of the model, the next step in the model-building process is estimation of the unknown parameters in the function. In general, this is accomplished by solving an optimization problem in which the objective function (the function being minimized or maximized) relates the response variable and the functional part of the model containing the unknown parameters in a way that will produce parameter estimates that will be close to the true, unknown parameter values. The unknown parameters are, loosely speaking, treated as variables to be solved for in the optimization, and the data serve as known coefficients of the objective function in this stage of the modeling process.

In theory, there are as many different ways of estimating parameters as there are objective functions to be minimized or maximized. However, a few principles have dominated because they result in parameter estimators that have good statistical properties. The two major methods of parameter estimation for process models are maximum likelihood and least squares. Both of these methods provide parameter estimators that have many good properties. Both maximum likelihood and least squares are sensitive to the presence of outliers, however. There are also many newer methods of parameter estimation, called robust methods, that try to balance the efficiency and desirable properties of least squares and maximum likelihood with a lower sensitivity to outliers.
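The following sketch illustrates the general idea of treating the parameters as the variables of an optimization while the observed data act as fixed coefficients: scipy.optimize.least_squares adjusts the two parameters of a hypothetical exponential-decay model so that the sum of squared deviations is minimized. The model and data are illustrative only, not from the handbook.

    import numpy as np
    from scipy.optimize import least_squares

    # Hypothetical (x, y) observations.
    x = np.array([0.0, 1.0, 2.0, 4.0, 8.0])
    y = np.array([10.1, 7.3, 5.6, 3.0, 0.9])

    # Deviations between the observed responses and a two-parameter
    # exponential-decay model; beta is the vector of unknown parameters.
    def residuals(beta):
        return y - beta[0] * np.exp(-beta[1] * x)

    # least_squares minimizes the sum of squares of the values returned above.
    fit = least_squares(residuals, x0=[10.0, 0.5])
    print("parameter estimates:", fit.x)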


Overview of Section 4.3: Although robust techniques are valuable, they are not as well developed as the more traditional methods and often require specialized software that is not readily available. Maximum likelihood also requires specialized algorithms in general, although there are important special cases that do not have such a requirement. For example, for data with normally distributed random errors, the least squares and maximum likelihood parameter estimators are identical. As a result of these software and developmental issues, and the coincidence of maximum likelihood and least squares in many applications, this section currently focuses on parameter estimation only by least squares methods. The remainder of this section offers some intuition into how least squares works and illustrates the effectiveness of this method.

Contents of Section 4.3:
1. Least Squares
2. Weighted Least Squares

4.4.3.1. Least Squares

General LS Criterion: In least squares (LS) estimation, the unknown values of the parameters, $\beta_0, \beta_1, \ldots$, in the regression function, $f(\vec{x};\vec{\beta})$, are estimated by finding numerical values for the parameters that minimize the sum of the squared deviations between the observed responses and the functional portion of the model. Mathematically, the least (sum of) squares criterion that is minimized to obtain the parameter estimates is

$$ Q = \sum_{i=1}^{n} \left[ y_i - f(\vec{x}_i;\vec{\beta}) \right]^2 . $$

As previously noted, $\beta_0, \beta_1, \ldots$ are treated as the variables in the optimization and the predictor variable values, $x_1, x_2, \ldots$, are treated as coefficients. To emphasize the fact that the estimates of the parameter values are not the same as the true values of the parameters, the estimates are denoted by $\hat{\beta}_0, \hat{\beta}_1, \ldots$. For linear models, the least squares minimization is usually done analytically using calculus. For nonlinear models, on the other hand, the minimization must almost always be done using iterative numerical algorithms.

LS for Straight Line: To illustrate, consider the straight-line model,

$$ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i . $$

For this model the least squares estimates of the parameters would be computed by minimizing

$$ Q = \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2 . $$

Doing this by
1. taking partial derivatives of $Q$ with respect to $\beta_0$ and $\beta_1$,
2. setting each partial derivative equal to zero, and
3. solving the resulting system of two equations with two unknowns
yields the following estimators for the parameters:

$$ \hat{\beta}_1 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sum_{i=1}^{n}(x_i-\bar{x})^2} , \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} . $$


These formulas are instructive because they show that the parameter estimators are functions of both the predictor and response variables and that the estimators are not independent of each other unless $\bar{x} = 0$. This is clear because the formula for the estimator of the intercept depends directly on the value of the estimator of the slope, except when the second term in the formula for $\hat{\beta}_0$ drops out due to multiplication by zero. This means that if the estimate of the slope deviates a lot from the true slope, then the estimate of the intercept will tend to deviate a lot from its true value too. This lack of independence of the parameter estimators, or more specifically the correlation of the parameter estimators, becomes important when computing the uncertainties of predicted values from the model. Although the formulas discussed in this paragraph only apply to the straight-line model, the relationship between the parameter estimators is analogous for more complicated models, including both statistically linear and statistically nonlinear models.
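For the straight-line case, the closed-form estimators can be evaluated directly. The sketch below uses hypothetical (x, y) values and checks the hand-coded formulas against numpy.polyfit.

    import numpy as np

    # Hypothetical predictor x and response y.
    x = np.array([21.0, 35.5, 48.2, 55.1, 68.7])
    y = np.array([92.0, 151.5, 199.0, 224.5, 277.8])

    # Closed-form least squares estimates for the straight-line model.
    b1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    b0_hat = y.mean() - b1_hat * x.mean()
    print("intercept, slope:", b0_hat, b1_hat)

    # The same estimates from a general-purpose routine, as a cross-check.
    print(np.polyfit(x, y, deg=1))   # returns [slope, intercept]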

Quality of Least Squares Estimates: From the preceding discussion, which focused on how the least squares estimates of the model parameters are computed and on the relationship between the parameter estimates, it is difficult to picture exactly how good the parameter estimates are. They are, in fact, often quite good. The plot below shows the data from the Pressure/Temperature example with the fitted regression line and the true regression line, which is known in this case because the data were simulated. It is clear from the plot that the two lines, the solid one estimated by least squares and the dashed being the true line obtained from the inputs to the simulation, are almost identical over the range of the data. Because the least squares line approximates the true line so well in this case, the least squares line will serve as a useful description of the deterministic portion of the variation in the data, even though it is not a perfect description. While this plot is just one example, the relationship between the estimated and true regression functions shown here is fairly typical.

Comparison of LS Line and True Line

Quantifying the Quality of the Fit for Real Data: From the plot above it is easy to see that the line based on the least squares estimates of $\beta_0$ and $\beta_1$ is a good estimate of the true line for these simulated data. For real data, of course, this type of direct comparison is not possible. Plots comparing the model to the data can, however, provide valuable information on the adequacy and usefulness of the model. In addition, another measure of the average quality of the fit of a regression function to a set of data by least squares can be quantified using the remaining parameter in the model, $\sigma$, the standard deviation of the error term in the model.

Like the parameters in the functional part of the model, $\sigma$ is generally not known, but it can also be estimated from the least squares equations. The formula for the estimate is

$$ \hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n}\left[y_i - f(\vec{x}_i;\hat{\vec{\beta}})\right]^2}{n - p}} \, , $$

with $n$ denoting the number of observations in the sample and $p$ the number of parameters in the functional part of the model. $\hat{\sigma}$ is often referred to as the "residual standard deviation" of the process.
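Given a set of observed responses and the corresponding fitted values, the residual standard deviation estimate can be computed in a couple of lines; the numbers below are hypothetical.

    import numpy as np

    # Hypothetical observed responses, fitted values, and parameter count.
    y = np.array([92.0, 151.5, 199.0, 224.5, 277.8])
    fitted = np.array([93.1, 150.2, 200.4, 223.0, 278.6])
    p = 2                          # two parameters in a straight-line model
    n = y.size

    sigma_hat = np.sqrt(np.sum((y - fitted) ** 2) / (n - p))
    print("residual standard deviation:", sigma_hat)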


Because $\sigma$ measures how the individual values of the response variable vary with respect to their true values under $f(\vec{x};\vec{\beta})$, it also contains information about how far from the truth quantities derived from the data, such as the estimated values of the parameters, could be. Knowledge of the approximate value of $\sigma$ plus the values of the predictor variables can be combined to provide estimates of the average deviation between the different aspects of the model and the corresponding true values, quantities that can be related to properties of the process generating the data that we would like to know.

More information on the correlation of the parameter estimators and computing uncertainties for different functions of the estimated regression parameters can be found in Section 5.

4.4.3.2. Weighted Least Squares

As mentioned in Section 4.1, weighted least squares (WLS) regression is useful for estimating the values of model parameters when the response values have differing degrees of variability over the combinations of the predictor values. As suggested by the name, parameter estimation by the method of weighted least squares is closely related to parameter estimation by "ordinary", "regular", "unweighted" or "equally-weighted" least squares.

General WLS Criterion: In weighted least squares parameter estimation, as in regular least squares, the unknown values of the parameters, $\beta_0, \beta_1, \ldots$, in the regression function are estimated by finding the numerical values for the parameter estimates that minimize the sum of the squared deviations between the observed responses and the functional portion of the model. Unlike least squares, however, each term in the weighted least squares criterion includes an additional weight, $w_i$, that determines how much each observation in the data set influences the final parameter estimates. The weighted least squares criterion that is minimized to obtain the parameter estimates is

$$ Q = \sum_{i=1}^{n} w_i \left[ y_i - f(\vec{x}_i;\vec{\beta}) \right]^2 . $$
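As a sketch of how weighted least squares might be carried out in practice, the example below uses the WLS class from statsmodels with weights taken as reciprocal variances under an assumed variance model; the data values and the variance model are illustrative, not from the handbook.

    import numpy as np
    import statsmodels.api as sm

    # Hypothetical data in which the response variability grows with x,
    # so each observation is weighted by an estimate of 1 / variance.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    y = np.array([2.1, 3.9, 6.3, 7.8, 10.5, 11.4])
    weights = 1.0 / x ** 2          # assumed variance model: Var(y) ~ x^2

    X = sm.add_constant(x)          # design matrix for a straight-line model
    wls_fit = sm.WLS(y, X, weights=weights).fit()
    print(wls_fit.params)           # weighted least squares estimates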


Some Points Mostly in Common with Regular LS (But Not Always!!!): Like regular least squares estimators:

1. The weighted least squares estimators are denoted by $\hat{\beta}_0, \hat{\beta}_1, \ldots$ to emphasize the fact that the estimators are not the same as the true values of the parameters.

2. $\beta_0, \beta_1, \ldots$ are treated as the "variables" in the optimization, while values of the response and predictor variables and the weights are treated as constants.

3. The parameter estimators will be functions of both the predictor and response variables and will generally be correlated with one another. (WLS estimators are also functions of the weights, $w_i$.)

4. Weighted least squares minimization is usually done analytically for linear models and numerically for nonlinear models.

4.4.4. How can I tell if a model fits my data?

$R^2$ Is Not Enough!: Model validation is possibly the most important step in the model building sequence. It is also one of the most overlooked. Often the validation of a model seems to consist of nothing more than quoting the $R^2$ statistic from the fit (which measures the fraction of the total variability in the response that is accounted for by the model). Unfortunately, a high $R^2$ value does not guarantee that the model fits the data well. Use of a model that does not fit the data well cannot provide good answers to the underlying engineering or scientific questions under investigation.

Main Tool: Graphical Residual Analysis: There are many statistical tools for model validation, but the primary tool for most process modeling applications is graphical residual analysis. Different types of plots of the residuals (see definition below) from a fitted model provide information on the adequacy of different aspects of the model. Numerical methods for model validation, such as the $R^2$ statistic, are also useful, but usually to a lesser degree than graphical methods. Graphical methods have an advantage over numerical methods for model validation because they readily illustrate a broad range of complex aspects of the relationship between the model and the data. Numerical methods for model validation tend to be narrowly focused on a particular aspect of the relationship between the model and the data and often try to compress that information into a single descriptive number or test result.

Numerical Methods' Forte: Numerical methods do play an important role as confirmatory methods for graphical techniques, however. For example, the lack-of-fit test for assessing the correctness of the functional part of the model can aid in interpreting a borderline residual plot. There are also a few modeling situations in which graphical methods cannot easily be used. In these cases, numerical methods provide a fallback position for model validation. One common situation when numerical validation methods take precedence over graphical methods is when the number of parameters being estimated is relatively close to the size of the data set. In this situation residual plots are often difficult to interpret due to constraints on the residuals imposed by the estimation of the unknown parameters. One area in which this typically happens is in optimization applications using designed experiments. Logistic regression with binary data is another area in which graphical residual analysis can be difficult.

Residuals: The residuals from a fitted model are the differences between the responses observed at each combination of values of the explanatory variables and the corresponding prediction of the response computed using the regression function. Mathematically, the definition of the residual for the ith observation in the data set is written

$$ e_i = y_i - f(\vec{x}_i;\hat{\vec{\beta}}) \, , $$

with $y_i$ denoting the ith response in the data set and $\vec{x}_i$ representing the list of explanatory variables, each set at the corresponding values found in the ith observation in the data set.


Example: The data listed below are from the Pressure/Temperature example introduced in Section 4.1.1. The first column shows the order in which the observations were made, the second column indicates the day on which each observation was made, and the third column gives the ambient temperature recorded when each measurement was made. The fourth column lists the temperature of the gas itself (the explanatory variable) and the fifth column contains the observed pressure of the gas (the response variable). Finally, the sixth column gives the corresponding values from the fitted straight-line regression function and the last column lists the residuals, the difference between columns five and six.

Data, Model Fitted Values & Residuals

Run Order   Day   Ambient Temperature   Temperature   Pressure   Fitted Value   Residual
1 1 23.820 54.749 225.066 222.920 2.146
2 1 24.120 23.323 100.331 99.411 0.920
3 1 23.434 58.775 230.863 238.744 -7.881
4 1 23.993 25.854 106.160 109.359 -3.199
5 1 23.375 68.297 277.502 276.165 1.336
6 1 23.233 37.481 148.314 155.056 -6.741
7 1 24.162 49.542 197.562 202.456 -4.895
8 1 23.667 34.101 138.537 141.770 -3.232
9 1 24.056 33.901 137.969 140.983 -3.014
10 1 22.786 29.242 117.410 122.674 -5.263
11 2 23.785 39.506 164.442 163.013 1.429
12 2 22.987 43.004 181.044 176.759 4.285
13 2 23.799 53.226 222.179 216.933 5.246
14 2 23.661 54.467 227.010 221.813 5.198
15 2 23.852 57.549 232.496 233.925 -1.429
16 2 23.379 61.204 253.557 248.288 5.269
17 2 24.146 31.489 139.894 131.506 8.388
18 2 24.187 68.476 273.931 276.871 -2.940
19 2 24.159 51.144 207.969 208.753 -0.784
20 2 23.803 68.774 280.205 278.040 2.165
21 3 24.381 55.350 227.060 225.282 1.779
22 3 24.027 44.692 180.605 183.396 -2.791
23 3 24.342 50.995 206.229 208.167 -1.938
24 3 23.670 21.602 91.464 92.649 -1.186
25 3 24.246 54.673 223.869 222.622 1.247
26 3 25.082 41.449 172.910 170.651 2.259
27 3 24.575 35.451 152.073 147.075 4.998
28 3 23.803 42.989 169.427 176.703 -7.276
29 3 24.660 48.599 192.561 198.748 -6.188
30 3 24.097 21.448 94.448 92.042 2.406
31 4 22.816 56.982 222.794 231.697 -8.902
32 4 24.167 47.901 199.003 196.008 2.996
33 4 22.712 40.285 168.668 166.077 2.592
34 4 23.611 25.609 109.387 108.397 0.990
35 4 23.354 22.971 98.445 98.029 0.416
36 4 23.669 25.838 110.987 109.295 1.692
37 4 23.965 49.127 202.662 200.826 1.835
38 4 22.917 54.936 224.773 223.653 1.120
39 4 23.546 50.917 216.058 207.859 8.199
40 4 24.450 41.976 171.469 172.720 -1.251

Why Use Residuals?: If the model fit to the data were correct, the residuals would approximate the random errors that make the relationship between the explanatory variables and the response variable a statistical relationship. Therefore, if the residuals appear to behave randomly, it suggests that the model fits the data well. On the other hand, if non-random structure is evident in the residuals, it is a clear sign that the model fits the data poorly. The subsections listed below detail the types of plots to use to test different aspects of a model and give guidance on the correct interpretations of different results that could be observed for each type of plot.

Validation Specifics:
1. How can I assess the sufficiency of the functional part of the model?
2. How can I detect non-constant variation across the data?
3. How can I tell if there was drift in the process?
4. How can I assess whether the random errors are independent from one to the next?
5. How can I test whether or not the random errors are distributed normally?
6. How can I test whether any significant terms are missing or misspecified in the functional part of the model?
7. How can I test whether all of the terms in the functional part of the model are necessary?


4.4.4.1. How can I assess the sufficiency of the functional part of the model?

Main Tool: Scatter Plots: Scatter plots of the residuals versus the predictor variables in the model and versus potential predictors that are not included in the model are the primary plots used to assess sufficiency of the functional part of the model. Plots in which the residuals do not exhibit any systematic structure indicate that the model fits the data well. Plots of the residuals versus other predictor variables, or potential predictors, that exhibit systematic structure indicate that the form of the function can be improved in some way.

Pressure / Temperature Example: The residual scatter plot below, of the residuals from a straight line fit to the Pressure/Temperature data introduced in Section 4.1.1. and also discussed in the previous section, does not indicate any problems with the model. The reference line at 0 emphasizes that the residuals are split about 50-50 between positive and negative. There are no systematic patterns apparent in this plot. Of course, just as the $R^2$ statistic cannot justify a particular model on its own, no single residual plot can completely justify the adoption of a particular model either. If a plot of these residuals versus another variable did show systematic structure, the form of the model with respect to that variable would need to be changed or that variable, if not in the model, would need to be added to the model. It is important to plot the residuals versus every available variable to ensure that a candidate model is the best model possible.

Importance of Environmental Variables: One important class of potential predictor variables that is often overlooked is environmental variables. Environmental variables include things like ambient temperature in the area where measurements are being made and ambient humidity. In most cases environmental variables are not expected to have any noticeable effect on the process, but it is always good practice to check for unanticipated problems caused by environmental conditions. Sometimes the catch-all environmental variables can also be used to assess the validity of a model. For example, if an experiment is run over several days, a plot of the residuals versus day can be used to check for differences in the experimental conditions at different times. Any differences observed will not necessarily be attributable to a specific cause, but could justify further experiments to try to identify factors missing from the model, or other model misspecifications. The two residual plots below show the pressure/temperature residuals versus ambient lab temperature and day. In both cases the plots provide further evidence that the straight line model gives an adequate description of the data. The plot of the residuals versus day does look a little suspicious with a slight cyclic pattern between days, but doesn't indicate any overwhelming problems. It is likely that this apparent difference between days is just due to the random variation in the data.
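A minimal sketch of the residual scatter plots described above follows; it assumes the residuals, the predictor, and an environmental variable such as day have already been saved to hypothetical text files.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical arrays: predictor values, residuals from a fitted model,
    # and an environmental variable (the day each observation was collected).
    temperature = np.loadtxt("temperature.txt")
    residuals = np.loadtxt("residuals.txt")
    day = np.loadtxt("day.txt")

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3.5))
    for ax, xvar, label in ((ax1, temperature, "temperature"),
                            (ax2, day, "day")):
        ax.plot(xvar, residuals, "o")
        ax.axhline(0.0)          # residuals should scatter evenly about zero
        ax.set_xlabel(label)
        ax.set_ylabel("residual")
    plt.tight_layout()
    plt.show()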


Pressure / Temperature Residuals vs Environmental Variables

Residual Scatter Plots Work Well for All Methods: The examples of residual plots given above are for the simplest possible case, straight line regression via least squares, but the residual plots are used in exactly the same way for almost all of the other statistical methods used for model building. For example, the residual plot below is for the LOESS model fit to the thermocouple calibration data introduced in Section 4.1.3.2. Like the plots above, this plot does not signal any problems with the fit of the LOESS model to the data. The residuals are scattered both above and below the reference line at all temperatures. Residuals adjacent to one another in the plot do not tend to have similar signs. There are no obvious systematic patterns of any type in this plot.

Validation of LOESS Model for Thermocouple Calibration


An Alternative to the LOESS Model: Based on the plot of voltage (response) versus the temperature (predictor) for the thermocouple calibration data, a quadratic model would have been a reasonable initial model for these data. The quadratic model is the simplest possible model that could account for the curvature in the data. The scatter plot of the residuals versus temperature for a quadratic model fit to the data clearly indicates that it is a poor fit, however. This residual plot shows strong cyclic structure in the residuals. If the quadratic model did fit the data, then this structure would not be left behind in the residuals. One thing to note in comparing the residual plots for the quadratic and LOESS models, besides the amount of structure remaining in the data in each case, is the difference in the scales of the two plots. The residuals from the quadratic model have a range that is approximately fifty times the range of the LOESS residuals.

Validation of the Quadratic Model


4.4.4.2. How can I detect non-constant variation across the data?

Scatter Plots Allow Comparison of Random Variation Across Data: Similar to their use in checking the sufficiency of the functional form of the model, scatter plots of the residuals are also used to check the assumption of constant standard deviation of random errors. Scatter plots of the residuals versus the explanatory variables and versus the predicted values from the model allow comparison of the amount of random variation in different parts of the data. For example, the plot below shows residuals from a straight-line fit to the Pressure/Temperature data. In this plot the range of the residuals looks essentially constant across the levels of the predictor variable, temperature. The scatter in the residuals at temperatures between 20 and 30 degrees is similar to the scatter in the residuals between 40 and 50 degrees and between 55 and 70 degrees. This suggests that the standard deviation of the random errors is the same for the responses observed at each temperature.

Residuals from Pressure / Temperature Example

Modification of Example: To illustrate how the residuals from the Pressure/Temperature data would look if the standard deviation was not constant across the different temperature levels, a modified version of the data was simulated. In the modified version, the standard deviation increases with increasing values of pressure. Situations like this, in which the standard deviation increases with increasing values of the response, are among the most common ways that non-constant random variation occurs in physical science and engineering applications. A plot of the data is shown below. Comparison of these two versions of the data is interesting because in the original units of the data they don't look strikingly different.

Pressure Data with Non-Constant Residual Standard Deviation

Residuals Indicate Non-Constant Standard Deviation: The residual plot from a straight-line fit to the modified data, however, highlights the non-constant standard deviation in the data. The horn-shaped residual plot, starting with residuals close together around 20 degrees and spreading out more widely as the temperature (and the pressure) increases, is a typical plot indicating that the assumptions of the analysis are not satisfied with this model. Other residual plot shapes besides the horn shape could indicate non-constant standard deviation as well. For example, if the response variable for a data set peaked in the middle of the range of the predictors and was small for extreme values of the predictors, the residuals plotted versus the predictors would look like two horns with the bells facing one another. In a case like this, a plot of the residuals versus the predicted values would exhibit the single horn shape, however.
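A residuals-versus-predicted-values plot of the kind described above can be produced with a short script; the file names below are hypothetical placeholders for the fitted values and residuals from whatever model is being checked.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical fitted (predicted) values and residuals from a model.
    predicted = np.loadtxt("predicted.txt")
    residuals = np.loadtxt("residuals.txt")

    # A horn shape (spread growing with the predicted value) suggests
    # non-constant standard deviation; an even band suggests it is constant.
    plt.plot(predicted, residuals, "o")
    plt.axhline(0.0)
    plt.xlabel("predicted value")
    plt.ylabel("residual")
    plt.show()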


Residuals from Modified Pressure Data

Residual Plots Comparing Variability Apply to Most Methods: The use of residual plots to check the assumption of constant standard deviation works in the same way for most modeling methods. It is not limited to least squares regression even though that is almost always the context in which it is explained. The plot below shows the residuals from a LOESS fit to the data from the Thermocouple Calibration example. The even spread of the residuals across the range of the data does not indicate any changes in the standard deviation, leading us to the conclusion that this assumption is not unreasonable for these data.

Residuals from LOESS Fit to Thermocouple Calibration Data

Correct Function Needed to Check for Constant Standard Deviation: One potential pitfall in using residual plots to check for constant standard deviation across the data is that the functional part of the model must adequately describe the systematic variation in the data. If that is not the case, then the typical horn shape observed in the residuals could be due to an artifact of the function fit to the data rather than to non-constant variation. For example, in the Polymer Relaxation example it was hypothesized that both time and temperature are related to the response variable, torque. However, if a single stretched exponential model in time was the initial model used for the process, the residual plots could be misinterpreted fairly easily, leading to the false conclusion that the standard deviation is not constant across the data. When the functional part of the model does not fit the data well, the residuals do not reflect purely random variations in the process. Instead, they reflect the remaining structure in the data not accounted for by the function. Because the residuals are not random, they cannot be used to answer questions about the random part of the model. This also emphasizes the importance of plotting the data before fitting the initial model, even if a theoretical model for the data is available. Looking at the data before fitting the initial model, at least in this case, would likely forestall this potential problem.


Polymer Relaxation Data Modeled as a Single Stretched Exponential

Residuals from Single Stretched Exponential Model

Getting Back on Course After a Bad Start: Fortunately, even if the initial model were incorrect, and the residual plot above was made, there are clues in this plot that indicate that the horn shape (pointing left this time) is not caused by non-constant standard deviation. The cluster of residuals at time zero that have a residual torque near one indicates that the functional part of the model does not fit the data. In addition, even when the residuals occur with equal frequency above and below zero, the spacing of the residuals at each time does not really look random. The spacing is too regular to represent random measurement errors. At measurement times near the low end of the scale, the spacing of the points increases as the residuals decrease and at the upper end of the scale the spacing decreases as the residuals decrease. The patterns in the spacing of the residuals also point to the fact that the functional form of the model is not correct and needs to be corrected before drawing conclusions about the distribution of the residuals.


4.4.4.3. How can I tell if there was drift in the measurement process?

Run Order Plots Reveal Drift in the Process: "Run order" or "run sequence" plots of the residuals are used to check for drift in the process. The run order residual plot is a special type of scatter plot in which each residual is plotted versus an index that indicates the order (in time) in which the data were collected. This plot is useful, however, only if data have been collected in a randomized run order, or some other order that is not increasing or decreasing in any of the predictor variables used in the model. If the data have been collected in a time order that is increasing or decreasing with the predictor variables, then any drift in the process may not be able to be separated from the functional relationship between the predictors and the response. This is why randomization is emphasized in experiment design.

Pressure / Temperature Example: To show in a more concrete way how run order plots work, the plot below shows the residuals from a straight-line fit to the Pressure/Temperature data plotted in run order. Comparing the run order plot to a listing of the data with the residuals shows how the residual for the first data point collected is plotted versus the run order index value 1, the second residual is plotted versus an index value of 2, and so forth.

Run Sequence Plot for the Pressure / Temperature Data

No Drift Indicated: Taken as a whole, this plot essentially shows that there is only random scatter in the relationship between the observed pressures and order in which the data were collected, rather than any systematic relationship. Although there appears to be a slight trend in the residuals when plotted in run order, the trend is small when measured against short-term random variation in the data, indicating that it is probably not a real effect. The presence of this apparent trend does emphasize, however, that practice and judgment are needed to correctly interpret these plots. Although residual plots are a very useful tool, if critical judgment is not used in their interpretation, you can see things that aren't there or miss things that are. One hint that the slight slope visible in the data is not worrisome in this case is the fact that the residuals overlap zero across all runs. If the process was drifting significantly, it is likely that there would be some parts of the run sequence in which the residuals would not overlap zero. If there is still some doubt about the slight trend visible in the data after using this graphical procedure, a term describing the drift can be added to the model and tested numerically to see if it has a significant impact on the results.

Modification of Example: To illustrate how the residuals from the Pressure/Temperature data would look if there were drift in the process, a modified version of the data was simulated. A small drift of 0.3 units/measurement was added to the process. A plot of the data is shown below. In this run sequence plot a clear, strong trend is visible and there are portions of the run order where the residuals do not overlap zero. Because the structure is so evident in this case, it is easy to conclude that some sort of drift is present. Then, of course, its cause needs to be determined so that appropriate steps can be taken to eliminate the drift from the process or to account for it in the model.
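A run order (run sequence) plot of the residuals only requires the residuals stored in collection order; the sketch below assumes they have been saved to a hypothetical text file.

    import numpy as np
    import matplotlib.pyplot as plt

    # Hypothetical residuals stored in the order the data were collected.
    residuals = np.loadtxt("residuals_in_run_order.txt")
    run_order = np.arange(1, residuals.size + 1)

    plt.plot(run_order, residuals, "o")
    plt.axhline(0.0)              # drift shows up as a trend away from zero
    plt.xlabel("run order")
    plt.ylabel("residual")
    plt.show()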


Run Sequence Plot for Pressure / Temperature Data with Drift

As in the case when the standard deviation was not constant across the data set, comparison of these two versions of the data is interesting because the drift is not apparent in either data set when viewed in the scale of the data. This highlights the need for graphical residual analysis when developing process models.

Applicable to Most Regression Methods: The run sequence plot, like most types of residual plots, can be used to check for drift in many regression methods. It is not limited to least squares fitting or one particular type of model. The run sequence plot below shows the residuals from the fit of the nonlinear model to the data from the Polymer Relaxation example. The even spread of the residuals across the range of the data indicates that there is no apparent drift in this process.

Run Sequence Plot for Polymer Relaxation Data


4. Process Modeling
4.4. Data Analysis for Process Modeling
4.4.4. How can I tell if a model fits my data?

4.4.4.4. How can I assess whether the random errors are independent from one to the next?
Lag Plot Shows Dependence Between Residuals

The lag plot of the residuals, another special type of scatter plot, suggests whether or not the errors are independent. If the errors are not independent, then the estimate of the error standard deviation will be biased, potentially leading to improper inferences about the process. The lag plot works by plotting each residual value versus the value of the successive residual (in chronological order of observation). The first residual is plotted versus the second, the second versus the third, etc. Because of the way the residuals are paired, there will be one less point on this plot than on most other types of residual plots.
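A lag plot of residuals can be produced with a few lines of code. The sketch below (illustrative Python, not part of the handbook) assumes that resid holds the residuals in the chronological order of observation.

    # Sketch of a lag plot: residual i versus residual i+1.
    import matplotlib.pyplot as plt

    def lag_plot(resid):
        # one fewer point than the number of residuals, as noted above
        plt.plot(resid[:-1], resid[1:], "o")
        plt.xlabel("Residual i")
        plt.ylabel("Residual i+1")
        plt.show()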

Interpretation

If the errors are independent, there should be no pattern or structure in the lag plot. In this case the points will appear to be randomly scattered across the plot in a scattershot fashion. If there is significant dependence between errors, however, some sort of deterministic pattern will likely be evident.

Examples

Lag plots for the Pressure/Temperature example, the Thermocouple Calibration example, and the Polymer Relaxation example are shown below. The lag plots for these three examples suggest that the errors from each fit are independent. In each case, the residuals are randomly scattered about the origin with no apparent structure. The last plot, for the Polymer Relaxation data, shows an apparent slight correlation between the residuals and the lagged residuals, but experience suggests that this could easily be due to random error and is not likely to be a real issue. In fact, the lag plot can also emphasize outlying observations, and a few of the larger residuals (in absolute terms) may be pulling our eyes unduly. The normal probability plot, which is also good at identifying outliers, will be discussed next, and will shed further light on any unusual points in the data set.

Lag Plot: Temperature / Pressure Example

Lag Plot: Thermocouple Calibration Example


Lag Plot: Polymer Relaxation Example

Next Steps

Some of the different patterns that might be found in the residuals when the errors are not independent are illustrated in the general discussion of the lag plot. If the residuals are not random, then time series methods might be required to fully model the data. Some time series basics are given in Section 4 of the chapter on Process Monitoring. Before jumping to conclusions about the need for time series methods, however, be sure that a run order plot does not show any trends, or other structure, in the data. If there is a trend in the run order plot, whether caused by drift or by the use of the wrong functional form, the source of the structure shown in the run order plot will also induce structure in the lag plot. Structure induced in the lag plot in this way does not necessarily indicate dependence in successive random errors. The lag plot can only be interpreted clearly after accounting for any structure in the run order plot.


4.4.4.5. How can I test whether or not the random errors are distributed normally?
Histogram and Normal Probability Plot Used for Normality Checks

The histogram and the normal probability plot are used to check whether or not it is reasonable to assume that the random errors inherent in the process have been drawn from a normal distribution. The normality assumption is needed for the error rates we are willing to accept when making decisions about the process. If the random errors are not from a normal distribution, incorrect decisions will be made more or less frequently than the stated confidence levels for our inferences indicate.

Normal Probability Plot

The normal probability plot is constructed by plotting the sorted values of the residuals versus the associated theoretical values from the standard normal distribution. Unlike most residual scatter plots, however, a random scatter of points does not indicate that the assumption being checked is met in this case. Instead, if the random errors are normally distributed, the plotted points will lie close to a straight line. Distinct curvature or other significant deviations from a straight line indicate that the random errors are probably not normally distributed. A few points that are far off the line suggest that the data has some outliers in it.
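The sketch below (illustrative Python; the stand-in residuals are hypothetical) shows one way to produce a normal probability plot of residuals using scipy.stats.probplot, which plots the sorted residuals against the corresponding standard normal quantiles.

    # Sketch of a normal probability plot of residuals.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    resid = np.random.default_rng(1).normal(0.0, 1.0, 40)   # stand-in residuals (hypothetical)
    stats.probplot(resid, dist="norm", plot=plt)             # sorted residuals vs normal quantiles
    plt.show()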
Examples

Normal probability plots for the Pressure/Temperature example, the Thermocouple Calibration example, and the Polymer Relaxation example are shown below. The normal probability plots for these three examples indicate that it is reasonable to assume that the random errors for these processes are drawn from approximately normal distributions. In each case there is a strong linear relationship between the residuals and the theoretical values from the standard normal distribution. Of course the plots do show that the relationship is not perfectly deterministic (and it never will be), but the linear relationship is still clear. Since none of the points in these plots deviate much from the linear relationship defined by the residuals, it is also reasonable to conclude that there are no outliers in any of these data sets.

Normal Probability Plot: Temperature / Pressure Example

Normal Probability Plot: Thermocouple Calibration Example


Normal Probability Plot: Polymer Relaxation Example

Further Discussion and Examples

If the random errors from one of these processes were not normally distributed, then significant curvature may have been visible in the relationship between the residuals and the quantiles from the standard normal distribution, or there would be residuals at the upper and/or lower ends of the line that clearly did not fit the linear relationship followed by the bulk of the data. Examples of some typical cases obtained with non-normal random errors are illustrated in the general discussion of the normal probability plot.

Histogram

The normal probability plot helps us determine whether or not it is reasonable to assume that the random errors in a statistical process are drawn from a normal distribution. An advantage of the normal probability plot is that the human eye is very sensitive to deviations from a straight line that might indicate that the errors come from a non-normal distribution. However, when the normal probability plot suggests that the normality assumption may not be reasonable, it does not give us a very good idea what the distribution does look like. A histogram of the residuals from the fit, on the other hand, can provide a clearer picture of the shape of the distribution. The fact that the histogram provides more general distributional information than does the normal probability plot suggests that it will be harder to discern deviations from normality than with the more specifically-oriented normal probability plot.

Examples

Histograms for the three examples used to illustrate the normal probability plot are shown below. The histograms are all more-or-less bell-shaped, confirming the conclusions from the normal probability plots. Additional examples can be found in the gallery of graphical techniques.


Histogram: Temperature / Pressure Example

Histogram: Thermocouple Calibration Example

Histogram: Polymer Relaxation Example


Important Note

One important detail to note about the normal probability plot and the histogram is that they provide information on the distribution of the random errors from the process only if
1. the functional part of the model is correctly specified,
2. the standard deviation is constant across the data,
3. there is no drift in the process, and
4. the random errors are independent from one run to the next.
If the other residual plots indicate problems with the model, the normal probability plot and histogram will not be easily interpretable.

4.4.4.6. How can I test whether any significant terms are missing or misspecified in the functional part of the model?

Statistical Tests Can Augment Ambiguous Residual Plots

Although the residual plots discussed on pages 4.4.4.1 and 4.4.4.3 will often indicate whether any important variables are missing or misspecified in the functional part of the model, a statistical test of the hypothesis that the model is sufficient may be helpful if the plots leave any doubt. Although it may seem tempting to use this type of statistical test in place of residual plots since it apparently assesses the fit of the model objectively, no single test can provide the rich feedback to the user that a graphical analysis of the residuals can provide. Furthermore, while model completeness is one of the most important aspects of model adequacy, this type of test does not address other important aspects of model quality. In statistical jargon, this type of test for model adequacy is usually called a "lack-of-fit" test.

General Strategy

The most common strategy used to test for model adequacy is to compare the amount of random variation in the residuals from the data used to fit the model with an estimate of the random variation in the process using data that are independent of the model. If these two estimates of the random variation are similar, that indicates that no significant terms are likely to be missing from the model. If the model-dependent estimate of the random variation is larger than the model-independent estimate, then significant terms probably are missing or misspecified in the functional part of the model.
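Before the formal definitions given below, the following Python sketch (illustrative only; the data and model are hypothetical) shows this strategy numerically: the variation of the replicate means about the fitted function is compared with the pooled variation within the replicate groups.

    # Illustrative sketch of the lack-of-fit strategy on hypothetical replicated data.
    import numpy as np

    x = np.repeat([1.0, 2.0, 3.0, 4.0, 5.0], 3)          # 5 settings, 3 replicates each
    rng = np.random.default_rng(2)
    y = 2.0 + 1.5 * x + rng.normal(0, 0.2, x.size)        # straight-line process

    b1, b0 = np.polyfit(x, y, 1)                          # fitted model (p = 2 parameters)
    levels = np.unique(x)
    ybar = np.array([y[x == u].mean() for u in levels])   # replicate means
    ni = np.array([(x == u).sum() for u in levels])

    ss_pure = sum(((y[x == u] - y[x == u].mean())**2).sum() for u in levels)
    ss_lof = (ni * (ybar - (b0 + b1 * levels))**2).sum()

    sigma_r = np.sqrt(ss_pure / (x.size - levels.size))   # model-independent estimate
    sigma_m = np.sqrt(ss_lof / (levels.size - 2))         # model-dependent estimate
    print(sigma_r, sigma_m)                               # similar values -> no lack of fit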


Testing Model Adequacy Requires Replicate Measurements

The need for a model-independent estimate of the random variation means that replicate measurements made under identical experimental conditions are required to carry out a lack-of-fit test. If no replicate measurements are available, then there will not be any baseline estimate of the random process variation to compare with the results from the model. This is the main reason that the use of replication is emphasized in experimental design.

Data Used to Fit Model Can Be Partitioned to Compute Lack-of-Fit Statistic

Although it might seem like two sets of data would be needed to carry out the lack-of-fit test using the strategy described above, one set of data to fit the model and compute the residual standard deviation and the other to compute the model-independent estimate of the random variation, that is usually not necessary. In most regression applications, the same data used to fit the model can also be used to carry out the lack-of-fit test, as long as the necessary replicate measurements are available. In these cases, the lack-of-fit statistic is computed by partitioning the residual standard deviation into two independent estimators of the random variation in the process. One estimator depends on the model and the sample means of the replicated sets of data ($\hat{\sigma}_m$), while the other estimator is a pooled standard deviation based on the variation observed in each set of replicated measurements ($\hat{\sigma}_r$). The squares of these two estimators of the random variation are often called the "mean square for lack-of-fit" and the "mean square for pure error," respectively, in statistics texts. The notation $\hat{\sigma}_m$ and $\hat{\sigma}_r$ is used here instead to emphasize the fact that, if the model fits the data, these quantities should both be good estimators of $\sigma$.

Estimating $\sigma$ Using Replicate Measurements

The model-independent estimator of $\sigma$ is computed using the formula

    \hat{\sigma}_r = \sqrt{ \frac{1}{n - n_u} \sum_{i=1}^{n_u} \sum_{j=1}^{n_i} ( y_{ij} - \bar{y}_i )^2 }

with $n$ denoting the sample size of the data set used to fit the model, $n_u$ the number of unique combinations of predictor variable levels, $n_i$ the number of replicated observations at the ith combination of predictor variable levels, the $y_{ij}$ the regression responses indexed by their predictor variable levels and number of replicate measurements, and $\bar{y}_i$ the mean of the responses at the ith combination of predictor variable levels. Notice that the formula for $\hat{\sigma}_r$ depends only on the data and not on the functional part of the model. This shows that $\hat{\sigma}_r$ will be a good estimator of $\sigma$, regardless of whether the model is a complete description of the process or not.

Estimating $\sigma$ Using the Model

Unlike the formula for $\hat{\sigma}_r$, the formula for $\hat{\sigma}_m$

    \hat{\sigma}_m = \sqrt{ \frac{1}{n_u - p} \sum_{i=1}^{n_u} n_i \left( \bar{y}_i - f(\vec{x}_i; \hat{\vec{\beta}}) \right)^2 }

(with $p$ denoting the number of unknown parameters in the model) does depend on the functional part of the model. If the model were correct, the value of the function would be a good estimate of the mean value of the response for every combination of predictor variable values. When the function provides good estimates of the mean response at the ith combination, then $f(\vec{x}_i; \hat{\vec{\beta}})$ should be close in value to $\bar{y}_i$ and $\hat{\sigma}_m$ should also be a good estimate of $\sigma$. If, on the other hand, the function is missing any important terms (within the range of the data), or if any terms are misspecified, then the function will provide a poor estimate of the mean response for some combinations of the predictors and $\hat{\sigma}_m$ will tend to be greater than $\hat{\sigma}_r$.

Carrying Out the Test for Lack-of-Fit

Combining the ideas presented in the previous two paragraphs, following the general strategy outlined above, the adequacy of the functional part of the model can be assessed by comparing the values of $\hat{\sigma}_m$ and $\hat{\sigma}_r$. If $\hat{\sigma}_m > \hat{\sigma}_r$, then one or more important terms may be missing or misspecified in the functional part of the model. Because of the random error in the data, however, we know that $\hat{\sigma}_m$ will sometimes be larger than $\hat{\sigma}_r$ even when the model is adequate. To make sure that the hypothesis that the model is adequate is not rejected by chance, it is necessary to understand how much greater than $\hat{\sigma}_r$ the value of $\hat{\sigma}_m$ might typically be when the model does fit the data. Then the hypothesis can be rejected only when $\hat{\sigma}_m$ is significantly greater than $\hat{\sigma}_r$.

When the model does fit the data, it turns out that the ratio

    \frac{\hat{\sigma}_m^2}{\hat{\sigma}_r^2}

follows an F distribution. Knowing the probability distribution that describes the behavior of this statistic, we can control the probability of rejecting the hypothesis that the model is adequate in cases when the model actually is adequate. Rejecting the hypothesis that the model is adequate only when the ratio is greater than an upper-tail cut-off value from the F distribution with a user-specified probability of wrongly rejecting the hypothesis gives us a precise, objective, probabilistic definition of when $\hat{\sigma}_m$ is significantly greater than $\hat{\sigma}_r$. The user-specified probability used to obtain the cut-off value from the F distribution is called the "significance level" of the test. The significance level for most statistical tests is denoted by $\alpha$. The most commonly used value for the significance level is $\alpha = 0.05$, which means that the hypothesis of an adequate model will only be rejected in 5% of tests for which the model really is adequate. Cut-off values can be computed using most statistical software or from tables of the F distribution. In addition to needing the significance level to obtain the cut-off value, the F distribution is indexed by the degrees of freedom associated with each of the two estimators of $\sigma$. $\hat{\sigma}_m^2$, which appears in the numerator of the ratio, has $n_u - p$ degrees of freedom. $\hat{\sigma}_r^2$, which appears in the denominator of the ratio, has $n - n_u$ degrees of freedom.

Alternative Formula for $\hat{\sigma}_m$

Although the formula given above more clearly shows the nature of $\hat{\sigma}_m$, the numerically equivalent formula below is easier to use in computations

    \hat{\sigma}_m = \sqrt{ \frac{(n - p)\,\hat{\sigma}^2 - (n - n_u)\,\hat{\sigma}_r^2}{n_u - p} }

where $\hat{\sigma}$ is the residual standard deviation from the fit.

4.4.4.7. How can I test whether all of the terms in the functional part of the model are necessary?

Unnecessary Terms in the Model Affect Inferences

Models that are generally correct in form, but that include extra, unnecessary terms are said to "over-fit" the data. The term over-fitting is used to describe this problem because the extra terms in the model make it more flexible than it should be, allowing it to fit some of the random variation in the data as if it were deterministic structure. Because the parameters for any unnecessary terms in the model usually have estimated values near zero, it may seem as though leaving them in the model would not hurt anything. It is true, actually, that having one or two extra terms in the model does not usually have much negative impact. However, if enough extra terms are left in the model, the consequences can be serious. Among other things, including unnecessary terms in the model can cause the uncertainties estimated from the data to be larger than necessary, potentially impacting scientific or engineering conclusions to be drawn from the analysis of the data.

Empirical and Local Models Most Prone to Over-fitting the Data

Over-fitting is especially likely to occur when developing purely empirical models for processes when there is no external understanding of how much of the total variation in the data might be systematic and how much is random. It also happens more frequently when using regression methods that fit the data locally instead of using an explicitly specified function to describe the structure in the data. Explicit functions are usually relatively simple and have few terms. It is usually difficult to know how to specify an explicit function that fits the noise in the data, since noise will not typically display much structure. This is why over-fitting is not usually a problem with these types of models. Local models, on the other hand, can easily be made to fit very complex patterns, allowing them to find apparent structure in process noise if care is not exercised.

Statistical Tests for Over-fitting

Just as statistical tests can be used to check for significant missing or misspecified terms in the functional part of a model, they can also be used to determine if any unnecessary terms have been included. In fact, checking for over-fitting of the data is one area in which statistical tests are more effective than residual plots. To test for over-fitting, however, individual tests of the importance of each parameter in the model are used rather than a single test, as is done when testing for terms that are missing or misspecified in the model.

Tests of Individual Parameters

Most output from regression software also includes individual statistical tests that compare the hypothesis that each parameter is equal to zero with the alternative that it is not zero. These tests are convenient because they are automatically included in most computer output, do not require replicate measurements, and give specific information about each parameter in the model. However, if the different predictor variables included in the model have values that are correlated, these tests can also be quite difficult to interpret. This is because these tests are actually testing whether or not each parameter is zero given that all of the other predictors are included in the model.

Test Statistics Based on Student's t Distribution

The test statistics for testing whether or not each parameter is zero are typically based on Student's t distribution. Each parameter estimate in the model is measured in terms of how many standard deviations it is from its hypothesized value of zero. If the parameter's estimated value is close enough to the hypothesized value that any deviation can be attributed to random error, the hypothesis that the parameter's true value is zero is not rejected. If, on the other hand, the parameter's estimated value is so far away from the hypothesized value that the deviation cannot be plausibly explained by random error, the hypothesis that the true value of the parameter is zero is rejected.

Because the hypothesized value of each parameter is zero, the test statistic for each of these tests is simply the estimated parameter value divided by its estimated standard deviation,

    T = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}}

which provides a measure of the distance between the estimated and hypothesized values of the parameter in standard deviations. Based on the assumptions that the random errors are normally distributed and the true value of the parameter is zero (as we have hypothesized), the test statistic has a Student's t distribution with $n - p$ degrees of freedom. Therefore, cut-off values for the t distribution can be used to determine how extreme the test statistic must be in order for each parameter estimate to be too far away from its hypothesized value for the deviation to be attributed to random error. Because these tests are generally used to simultaneously test whether or not a parameter value is greater than or less than zero, the tests should each be used with cut-off values with a significance level of $\alpha/2$. This will guarantee that the hypothesis that each parameter equals zero will be rejected by chance with probability $\alpha$. Because of the symmetry of the t distribution, only one cut-off value, the upper or the lower one, needs to be determined, and the other will be its negative. Equivalently, many people simply compare the absolute value of the test statistic to the upper cut-off value.

Parameter Tests for the Pressure / Temperature Example

To illustrate the use of the individual tests of the significance of each parameter in a model, the Dataplot output for the Pressure/Temperature example is shown below. In this case a straight-line model was fit to the data, so the output includes tests of the significance of the intercept and slope. The estimates of the intercept and the slope are 7.75 and 3.93, respectively. Their estimated standard deviations are listed in the next column, followed by the test statistics to determine whether or not each parameter is zero. At the bottom of the output the estimate of the residual standard deviation, $\hat{\sigma}$, and its degrees of freedom are also listed.

Dataplot Output: Pressure / Temperature Example

        LEAST SQUARES POLYNOMIAL FIT
        SAMPLE SIZE N = 40
        DEGREE = 1
        NO REPLICATION CASE

               PARAMETER ESTIMATES    (APPROX. ST. DEV.)    T VALUE
        1  A0        7.74899             ( 2.354     )        3.292
        2  A1        3.93014             (0.5070E-01 )        77.51

        RESIDUAL STANDARD DEVIATION = 4.299098
        RESIDUAL DEGREES OF FREEDOM = 38

Looking up the cut-off value from the tables of the t distribution using a significance level of $\alpha = 0.05$ and 38 degrees of freedom yields a cut-off value of 2.024 (the cut-off is obtained from the column labeled "0.025" since this is a two-sided test and 0.05/2 = 0.025). Since both of the test statistics are larger in absolute value than the cut-off value of 2.024, the appropriate conclusion is that both the slope and intercept are significantly different from zero at the 95% confidence level.
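The same comparison can be reproduced with a few lines of code. The sketch below (illustrative Python, not the handbook's Dataplot) uses the estimates and standard deviations from the output above and the t distribution with 38 degrees of freedom.

    # Sketch of the individual parameter tests using the values shown above.
    from scipy import stats

    t_intercept = 7.74899 / 2.354          # about 3.29
    t_slope = 3.93014 / 0.05070            # about 77.5
    cutoff = stats.t.ppf(1 - 0.05 / 2, 38) # upper cut-off, about 2.024
    print(abs(t_intercept) > cutoff, abs(t_slope) > cutoff)   # both True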


4.4.5. If my current model does not fit the data well, how can I improve it?

What Next?

Validating a model using residual plots, formal hypothesis tests and descriptive statistics would be quite frustrating if discovery of a problem meant restarting the modeling process back at square one. Fortunately, however, there are also techniques and tools to remedy many of the problems uncovered using residual analysis. In some cases the model validation methods themselves suggest appropriate changes to a model at the same time problems are uncovered. This is especially true of the graphical tools for model validation, though tests on the parameters in the regression function also offer insight into model refinement. Treatments for the various model deficiencies that were diagnosed in Section 4.4.4 are demonstrated and discussed in the subsections listed below.

Methods for Model Improvement

1. Updating the Function Based on Residual Plots
2. Accounting for Non-Constant Variation Across the Data
3. Accounting for Errors with a Non-Normal Distribution

4.4.5.1. Updating the Function Based on Residual Plots

Residual Plots Guide Model Refinement

If the plots of the residuals used to check the adequacy of the functional part of the model indicate problems, the structure exhibited in the plots can often be used to determine how to improve the functional part of the model. For example, suppose the initial model fit to the thermocouple calibration data was a quadratic polynomial. The scatter plot of the residuals versus temperature showed that there was structure left in the data when this model was used.

Residuals vs Temperature: Quadratic Model

The shape of the residual plot, which looks like a cubic polynomial, suggests that adding another term to the polynomial might account for the structure left in the data by the quadratic model. After fitting the cubic polynomial, the magnitude of the residuals is reduced by a factor of about 30, indicating a big improvement in the model.


Residuals vs Temperature: Cubic Model

Increasing Residual Complexity Suggests LOESS Model

Although the model is improved, there is still structure in the residuals. Based on this structure, a higher-degree polynomial looks like it would fit the data. Polynomial models become numerically unstable as their degree increases, however. Therefore, after a few iterations like this, leading to polynomials of ever-increasing degree, the structure in the residuals indicates that a polynomial does not actually describe the data very well. As a result, a different type of model, such as a nonlinear model or a LOESS model, is probably more appropriate for these data. The type of model needed to describe the data, however, can be arrived at systematically using the structure in the residuals at each step.

4.4.5.2. Accounting for Non-Constant Variation Across the Data

Two Basic Approaches: Transformation and Weighting

There are two basic approaches to obtaining improved parameter estimators for data in which the standard deviation of the error is not constant across all combinations of predictor variable values:
1. transforming the data so it meets the standard assumptions, and
2. using weights in the parameter estimation to account for the unequal standard deviations.
Both methods work well in a wide range of situations. The choice of which to use often hinges on personal preference because in many engineering and industrial applications the two methods often provide practically the same results. In fact, in most experiments there is usually not enough data to determine which of the two models works better. Sometimes, however, when there is scientific information about the nature of the model, one method or the other may be preferred because it is more consistent with an existing theory. In other cases, the data may make one of the methods more convenient to use than the other.

Using Transformations

The basic steps for using transformations to handle data with unequal subpopulation standard deviations are:
1. Transform the response variable to equalize the variation across the levels of the predictor variables.
2. Transform the predictor variables, if necessary, to attain or restore a simple functional form for the regression function.
3. Fit and validate the model in the transformed variables.
4. Transform the predicted values back into the original units using the inverse of the transformation applied to the response variable.

Typical Transformations for Stabilization of Variation

Appropriate transformations to stabilize the variability may be suggested by scientific knowledge or selected using the data. Three transformations that are often effective for equalizing the standard deviations across the values of the predictor variables are:
1. $\sqrt{y}$,
2. $\ln(y)$ (note: the base of the logarithm does not really matter), and
3. $\frac{1}{y}$.

Other transformations can be considered, of course, but in a surprisingly wide range of problems
one of these three transformations will work well. As a result, these are good transformations to
start with, before moving on to more specialized transformations.
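As a concrete illustration of the four steps listed above, the following Python sketch (hypothetical data; a log transformation of the response is assumed) fits a straight line in the transformed units and then transforms the predictions back to the original units.

    # Sketch of the transform / fit / back-transform steps.
    import numpy as np

    rng = np.random.default_rng(3)
    x = np.linspace(1.0, 10.0, 50)
    y = 5.0 * np.exp(0.3 * x) * rng.lognormal(0.0, 0.05, x.size)  # variation grows with level

    z = np.log(y)                    # 1. transform the response
    b1, b0 = np.polyfit(x, z, 1)     # 2./3. fit (and validate) in transformed units
    z_pred = b0 + b1 * x
    y_pred = np.exp(z_pred)          # 4. back-transform predictions to original units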


Modified Pressure / Temperature Example

To illustrate how to use transformations to stabilize the variation in the data, we will return to the modified version of the Pressure/Temperature example. The residuals from a straight-line fit to that data clearly showed that the standard deviation of the measurements was not constant across the range of temperatures.

Residuals from Modified Pressure Data

Stabilizing the Variation

The first step in the process is to compare different transformations of the response variable, pressure, to see which one, if any, stabilizes the variation across the range of temperatures. The straight-line relationship will not hold for all of the transformations, but at this stage of the process that is not a concern. The functional relationship can usually be corrected after stabilizing the variation. The key for this step is to find a transformation that makes the uncertainty in the data approximately the same at the lowest and highest temperatures (and in between). The plot below shows the modified Pressure/Temperature data in its original units, and with the response variable transformed using each of the three typical transformations.

Transformations of the Pressure

Inverse Pressure Has Constant Variation

After comparing the effects of the different transformations, it looks like using the inverse of the pressure will make the standard deviation approximately constant across all temperatures. However, it is somewhat difficult to tell how the standard deviations really compare on a plot of this size and scale. To better see the variation, a full-sized plot of temperature versus the inverse of the pressure is shown below. In that plot it is easier to compare the variation across temperatures. For example, comparing the variation in the pressure values at a temperature of about 25 with the variation in the pressure values at temperatures near 45 and 70, this plot shows about the same level of variation at all three temperatures. It will still be critical to look at residual plots after fitting the model to the transformed variables, however, to really see whether or not the transformation we've chosen is effective. The residual scale is really the only scale that can reveal that level of detail.

Enlarged View of Temperature Versus 1/Pressure

http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd452.htm (2 of 14) [11/14/2003 5:50:41 PM] http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd452.htm (3 of 14) [11/14/2003 5:50:41 PM]
4.4.5.2. Accounting for Non-Constant Variation Across the Data 4.4.5.2. Accounting for Non-Constant Variation Across the Data

Transforming Temperature to Linearity

Having found a transformation that appears to stabilize the standard deviations of the measurements, the next step in the process is to find a transformation of the temperature that will restore the straight-line relationship, or some other simple relationship, between the temperature and pressure. The same three basic transformations that can often be used to stabilize the variation are also usually able to transform the predictor to restore the original relationship between the variables. Plots of the temperature and the three transformations of the temperature versus the inverse of the pressure are shown below.

Transformations of the Temperature

Comparing the plots of the various transformations of the temperature versus the inverse of the pressure, it appears that the straight-line relationship between the variables is restored when the inverse of the temperature is used. This makes intuitive sense because if the temperature and pressure are related by a straight line, then the same transformation applied to both variables should change them both similarly, retaining their original relationship. Now, after fitting a straight line to the transformed data, the residuals plotted versus both the transformed and original values of temperature indicate that the straight-line model fits the data and that the random variation no longer increases with increasing temperature. Additional diagnostic plots of the residuals confirm that the model fits the data well.

Residuals From the Fit to the Transformed Data


Using Weighted Least Squares

As discussed in the overview of different methods for building process models, the goal when using weighted least squares regression is to ensure that each data point has an appropriate level of influence on the final parameter estimates. Using the weighted least squares fitting criterion, the parameter estimates are obtained by minimizing

    Q = \sum_{i=1}^{n} w_i \left[ y_i - f(\vec{x}_i; \vec{\beta}) \right]^2 .

Optimal results, which minimize the uncertainty in the parameter estimators, are obtained when the weights, $w_i$, used to estimate the values of the unknown parameters are inversely proportional to the variances at each combination of predictor variable values:

    w_i \propto \frac{1}{\sigma_i^2} .

Unfortunately, however, these optimal weights, which are based on the true variances of each data point, are never known. Estimated weights have to be used instead. When estimated weights are used, the optimality properties associated with known weights no longer strictly apply. However, if the weights can be estimated with high enough precision, their use can significantly improve the parameter estimates compared to the results that would be obtained if all of the data points were equally weighted.
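A minimal Python sketch of this criterion is shown below (illustrative only): scaling each observation by the square root of its weight lets an ordinary least squares solver minimize the weighted sum of squares. The straight-line design matrix and the variable names are assumptions.

    # Sketch of weighted least squares for a straight line via sqrt-weight scaling.
    import numpy as np

    def wls_line(x, y, w):
        X = np.column_stack([np.ones_like(x), x])   # design matrix for b0 + b1*x
        sw = np.sqrt(w)                             # w roughly proportional to 1/variance
        coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        return coef                                 # [b0, b1]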

Direct Estimation of Weights

If there are replicates in the data, the most obvious way to estimate the weights is to set the weight for each data point equal to the reciprocal of the sample variance obtained from the set of replicate measurements to which the data point belongs. Mathematically, this would be

    w_{ij} = \frac{1}{\hat{\sigma}_i^2} = \frac{n_i - 1}{\sum_{j=1}^{n_i} ( y_{ij} - \bar{y}_i )^2}

where
● $w_{ij}$ are the weights indexed by their predictor variable levels and replicate measurements,
● $i$ indexes the unique combinations of predictor variable values,
● $j$ indexes the replicates within each combination of predictor variable values,
● $\hat{\sigma}_i$ is the sample standard deviation of the response variable at the ith combination of predictor variable values,
● $n_i$ is the number of replicate observations at the ith combination of predictor variable values,
● $y_{ij}$ are the individual data points indexed by their predictor variable levels and replicate measurements,
● $\bar{y}_i$ is the mean of the responses at the ith combination of predictor variable levels.


Unfortunately, although this method is attractive, it rarely works well. This is because when the weights are estimated this way, they are usually extremely variable. As a result, the estimated weights do not correctly control how much each data point should influence the parameter estimates. This method can work, but it requires a very large number of replicates at each combination of predictor variables. In fact, if this method is used with too few replicate measurements, the parameter estimates can actually be more variable than they would have been if the unequal variation were ignored.

A Better Strategy for Estimating the Weights

A better strategy for estimating the weights is to find a function that relates the standard deviation of the response at each combination of predictor variable values to the predictor variables themselves. This means that if

    \sigma_i \approx g(\vec{x}_i; \vec{\gamma})

(denoting the unknown parameters in the function by $\vec{\gamma}$), then the weights can be set to

    w_i = \frac{1}{\hat{g}(\vec{x}_i; \hat{\vec{\gamma}})^2} .

This approach to estimating the weights usually provides more precise estimates than direct estimation because fewer quantities have to be estimated and there is more data to estimate each one.

Estimating Weights Without Replicates

If there are only very few or no replicate measurements for each combination of predictor variable values, then approximate replicate groups can be formed so that weights can be estimated. There are several possible approaches to forming the replicate groups.
1. One method is to manually form the groups based on plots of the response against the predictor variables. Although this allows a lot of flexibility to account for the features of a specific data set, it is often impractical. However, this approach may be useful for relatively small data sets in which the spacing of the predictor variable values is very uneven.
2. Another approach is to divide the data into equal-sized groups of observations after sorting by the values of the response variable. It is important when using this approach not to make the size of the replicate groups too large. If the groups are too large, the standard deviations of the response in each group will be inflated because the approximate replicates will differ from each other too much because of the deterministic variation in the data. Again, plots of the response variable versus the predictor variables can be used as a check to confirm that the approximate sets of replicate measurements look reasonable.
3. A third approach is to choose the replicate groups based on ranges of predictor variable values. That is, instead of picking groups of a fixed size, the ranges of the predictor variables are divided into equal-size increments or bins and the responses in each bin are treated as replicates. Because the sizes of the groups may vary, there is a tradeoff in this case between defining the intervals for approximate replicates to be too narrow or too wide. As always, plots of the response variable against the predictor variables can serve as a guide.
Although the exact estimates of the weights will be somewhat dependent on the approach used to define the replicate groups, the resulting weighted fit is typically not particularly sensitive to small changes in the definition of the weights when the weights are based on a simple, smooth function.

Power Function Model for the Weights

One particular function that often works well for modeling the variances is a power of the mean at each combination of predictor variable values,

    \sigma_i^2 \approx \gamma_1 \, \mu_i^{\gamma_2}

where $\mu_i$ denotes the mean response at the ith combination of predictor variable values. Iterative procedures for simultaneously fitting a weighted least squares model to the original data and a power function model for the weights are discussed in Carroll and Ruppert (1988), and Ryan (1997).

Fitting the Model for Estimation of the Weights

When fitting the model for the estimation of the weights, it is important to note that the usual regression assumptions do not hold. In particular, the variation of the random errors is not constant across the different sets of replicates and their distribution is not normal. However, this can often be accounted for by using transformations (the ln transformation often stabilizes the variation), as described above.

Validating the Model for Estimation of the Weights

Of course, it is always a good idea to check the assumptions of the analysis, as in any model-building effort, to make sure the model of the weights seems to fit the weight data reasonably well. The fit of the weights model often does not need to meet all of the usual standards to be effective, however.

Using Weighted Residuals to Validate WLS Models

Once the weights have been estimated and the model has been fit to the original data using weighted least squares, the validation of the model follows as usual, with one exception. In a weighted analysis, the distribution of the residuals can vary substantially with the different values of the predictor variables. This necessitates the use of weighted residuals [Graybill and Iyer (1994)] when carrying out a graphical residual analysis so that the plots can be interpreted as usual. The weighted residuals are given by the formula

    e_{w,i} = \sqrt{w_i} \, ( y_i - \hat{y}_i ) .

It is important to note that most statistical software packages do not compute and return weighted residuals when a weighted fit is done, so the residuals will usually have to be weighted manually in an additional step. If, after computing a weighted least squares fit using carefully estimated weights, the residual plots still show the same funnel-shaped pattern as they did for the initial equally-weighted fit, it is likely that you may have forgotten to compute or plot the weighted residuals.


Example of WLS Using the Power Function Model

The power function model for the weights, mentioned above, is often especially convenient when there is only one predictor variable. In this situation the general model given above can usually be simplified to the power function

    \hat{\sigma}_i^2 \approx \gamma_1 \, x_i^{\gamma_2} ,

which does not require the use of iterative fitting methods. This model will be used with the modified version of the Pressure/Temperature data, plotted below, to illustrate the steps needed to carry out a weighted least squares fit.

Modified Pressure/Temperature Data

Defining Sets of Approximate Replicate Measurements

From the data, plotted above, it is clear that there are not many true replicates in this data set. As a result, sets of approximate replicate measurements need to be defined in order to use the power function model to estimate the weights. In this case, this was done by rounding a multiple of the temperature to the nearest degree and then converting the rounded data back to the original scale. This is an easy way to identify sets of measurements that have temperatures that are relatively close together. If this process had produced too few sets of replicates, a smaller factor than three could have been used to spread the data out further before rounding. If fewer replicate sets were needed, then a larger factor could have been used. The appropriate value to use is a matter of judgment. An ideal value is one that doesn't combine values that are too different and that yields sets of replicates that aren't too different in size. A table showing the original data, the rounded temperatures that define the approximate replicates, and the replicate standard deviations is listed below.

Data with Approximate Replicates

                   Rounded                   Standard
    Temperature  Temperature    Pressure    Deviation
    ---------------------------------------------
      21.602         21          91.423      0.192333
      21.448         21          91.695      0.192333
      23.323         24          98.883      1.102380
      22.971         24          97.324      1.102380
      25.854         27         107.620      0.852080
      25.609         27         108.112      0.852080
      25.838         27         109.279      0.852080
      29.242         30         119.933     11.046422
      31.489         30         135.555     11.046422
      34.101         33         139.684      0.454670
      33.901         33         139.041      0.454670
      37.481         36         150.165      0.031820
      35.451         36         150.210      0.031820
      39.506         39         164.155      2.884289
      40.285         39         168.234      2.884289
      43.004         42         180.802      4.845772
      41.449         42         172.646      4.845772
      42.989         42         169.884      4.845772
      41.976         42         171.617      4.845772
      44.692         45         180.564            NA
      48.599         48         191.243      5.985219
      47.901         48         199.386      5.985219
      49.127         48         202.913      5.985219
      49.542         51         196.225      9.074554
      51.144         51         207.458      9.074554
      50.995         51         205.375      9.074554
      50.917         51         218.322      9.074554
      54.749         54         225.607      2.040637
      53.226         54         223.994      2.040637
      54.467         54         229.040      2.040637
      55.350         54         227.416      2.040637
      54.673         54         223.958      2.040637
      54.936         54         224.790      2.040637
      57.549         57         230.715     10.098899
      56.982         57         216.433     10.098899
      58.775         60         224.124     23.120270
      61.204         60         256.821     23.120270
      68.297         69         276.594      6.721043
      68.476         69         267.296      6.721043
      68.774         69         280.352      6.721043
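The sketch below (illustrative Python, not part of the handbook) shows how the replicate groups in the table above could be formed and their standard deviations computed. Here temp and pressure are assumed to hold the data listed in the table, and the rounding reproduces the "Rounded Temperature" column (21, 24, 27, ...).

    # Sketch of approximate replicate grouping by rounding to a multiple of 3.
    import numpy as np

    rounded = 3 * np.round(np.asarray(temp) / 3)
    for g in np.unique(rounded):
        p = np.asarray(pressure)[rounded == g]
        sd = p.std(ddof=1) if p.size > 1 else float("nan")   # NA for single observations
        print(g, p.size, sd)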

Transformation of the Weight Data

With the replicate groups defined, a plot of the ln of the replicate variances versus the ln of the temperature shows the transformed data for estimating the weights does appear to follow the power function model. This is because the ln-ln transformation linearizes the power function, as well as stabilizing the variation of the random errors and making their distribution approximately normal.

Transformed Data for Weight Estimation with Fitted Model

Specification of Weight Function

The Splus output from the fit of the weight estimation model is shown below. Based on the output and the associated residual plots, the model of the weights seems reasonable, and

    w = \frac{1}{T^{6}}

should be an appropriate weight function for the modified Pressure/Temperature data. The weight function is based only on the slope from the fit to the transformed weight data because the weights only need to be proportional to the replicate variances. As a result, we can ignore the estimate of $\gamma_1$ in the power function since it is only a proportionality constant (in original units of the model). The exponent on the temperature in the weight function is usually rounded to the nearest digit or single decimal place for convenience, since that small change in the weight function will not affect the results of the final fit significantly.

Output from Weight Estimation Fit

        Residual Standard Error = 3.0245
        Multiple R-Square = 0.3642
        N = 14,
        F-statistic = 6.8744 on 1 and 12 df, p-value = 0.0223

                           coef     std.err    t.stat    p.value
        Intercept       -20.5896     8.4994    -2.4225    0.0322
        ln(Temperature)   6.0230     2.2972     2.6219    0.0223

Fit of the WLS Model to the Pressure / Temperature Data

With the weight function estimated, the fit of the model with weighted least squares produces the residual plot below. This plot, which shows the weighted residuals from the fit versus temperature, indicates that use of the estimated weight function has stabilized the increasing variation in pressure observed with increasing temperature. The plot of the data with the estimated regression function and additional residual plots using the weighted residuals confirm that the model fits the data well.

Weighted Residuals from WLS Fit of Pressure / Temperature Data
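The weight-estimation step and the weighted residuals can be sketched as follows (illustrative Python; group_temp, group_var, temp, and resid are assumed to hold the replicate-group temperatures, the replicate variances, the full set of temperatures, and the unweighted residuals, respectively).

    # Sketch of the ln-ln weight fit and the weighted residuals.
    import numpy as np

    slope, intercept = np.polyfit(np.log(group_temp), np.log(group_var), 1)
    exponent = round(slope)            # about 6 for these data
    w = 1.0 / temp**exponent           # weights for the weighted least squares fit
    # after the weighted fit, validate with weighted residuals:
    weighted_resid = np.sqrt(w) * resid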


Comparison of Transformed and Weighted Results

Having modeled the data using both transformed variables and weighted least squares to account for the non-constant standard deviations observed in pressure, it is interesting to compare the two resulting models. Logically, at least one of these two models cannot be correct (actually, probably neither one is exactly correct). With the random error inherent in the data, however, there is no way to tell which of the two models actually describes the relationship between pressure and temperature better. The fact that the two models lie right on top of one another over almost the entire range of the data tells us that. Even at the highest temperatures, where the models diverge slightly, both models match the small amount of data that is available reasonably well. The only way to differentiate between these models is to use additional scientific knowledge or collect a lot more data. The good news, though, is that the models should work equally well for predictions or calibrations based on these data, or for basic understanding of the relationship between temperature and pressure.

4.4.5.3. Accounting for Errors with a Non-Normal Distribution
Basic Approach: Transformation

Unlike when correcting for non-constant variation in the random errors, there is really only one basic approach to handling data with non-normal random errors for most regression methods. This is because most methods rely on the assumption of normality and the use of linear estimation methods (like least squares) to make probabilistic inferences to answer scientific or engineering questions. For methods that rely on normality of the data, direct manipulation of the data to make the random errors approximately normal is usually the best way to try to bring the data in line with this assumption. The main alternative to transformation is to use a fitting criterion that directly takes the distribution of the random errors into account when estimating the unknown parameters. Using these types of fitting criteria, such as maximum likelihood, can provide very good results. However, they are often much harder to use than the general fitting criteria used in most process modeling methods.

Using Transformations

The basic steps for using transformations to handle data with non-normally distributed random errors are essentially the same as those used to handle non-constant variation of the random errors.
1. Transform the response variable to make the distribution of the random errors
approximately normal.
2. Transform the predictor variables, if necessary, to attain or restore a simple functional form
for the regression function.
3. Fit and validate the model in the transformed variables.
4. Transform the predicted values back into the original units using the inverse of the
transformation applied to the response variable.
The main difference between using transformations to account for non-constant variation and
non-normality of the random errors is that it is harder to directly see the effect of a transformation
on the distribution of the random errors. It is very often the case, however, that non-normality and
non-constant standard deviation of the random errors go together, and that the same
transformation will correct both problems at once. In practice, therefore, if you choose a
transformation to fix any non-constant variation in the data, you will often also improve the
normality of the random errors. If the data appear to have non-normally distributed random
errors, but do have a constant standard deviation, you can always fit models to several sets of
transformed data and then check to see which transformation appears to produce the most
normally distributed residuals.
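One way to carry out that comparison is sketched below (illustrative Python; x and y are assumed to hold the predictor and response). Each candidate transformation of the response is fit with a straight line and the normal probability plot correlation of its residuals is compared.

    # Sketch: compare residual normality under several response transformations.
    import numpy as np
    from scipy import stats

    for name, z in [("sqrt", np.sqrt(y)), ("ln", np.log(y)), ("inverse", 1.0 / y)]:
        b1, b0 = np.polyfit(x, z, 1)
        resid = z - (b0 + b1 * x)
        (osm, osr), (slope_, inter_, r) = stats.probplot(resid, dist="norm")
        print(name, r)   # r closest to 1 suggests the most nearly normal residuals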


Typical Transformations for Meeting Distributional Assumptions

Not surprisingly, three transformations that are often effective for making the distribution of the random errors approximately normal are:
1. $\sqrt{y}$,
2. $\ln(y)$ (note: the base of the logarithm does not really matter), and
3. $\frac{1}{y}$.
These are the same transformations often used for stabilizing the variation in the data. Other appropriate transformations to improve the distributional properties of the random errors may be suggested by scientific knowledge or selected using the data. However, these three transformations are good ones to start with since they work well in so many situations.

Example

To illustrate how to use transformations to change the distribution of the random errors, we will look at a modified version of the Pressure/Temperature example in which the errors are uniformly distributed. Comparing the results obtained from fitting the data in their original units and under different transformations will directly illustrate the effects of the transformations on the distribution of the random errors.

Modified Pressure/Temperature Data with Uniform Random Errors

Selection of Appropriate Transformations

Going through a set of steps similar to those used to find transformations to stabilize the random variation, different pairs of transformations of the response and predictor which have a simple functional form and will potentially have more normally distributed residuals are chosen. In the multiplots below, all of the possible combinations of basic transformations are applied to the temperature and pressure to find the pairs which have simple functional forms. In this case, which is typical, the data with square root-square root, ln-ln, and inverse-inverse transformations all appear to follow a straight-line model. The next step will be to fit lines to each of these sets of data and then to compare the residual plots to see whether any have random errors which appear to be normally distributed.

Fit of Model to the Untransformed Data

A four-plot of the residuals obtained after fitting a straight-line model to the Pressure/Temperature data with uniformly distributed random errors is shown below. The histogram and normal probability plot on the bottom row of the four-plot are the most useful plots for assessing the distribution of the residuals. In this case the histogram suggests that the distribution is more rectangular than bell-shaped, indicating the random errors are not likely to be normally distributed. The curvature in the normal probability plot also suggests that the random errors are not normally distributed. If the random errors were normally distributed, the normal probability plot should be a fairly straight line. Of course it wouldn't be perfectly straight, but smooth curvature or several points lying far from the line are fairly strong indicators of non-normality.

Residuals from Straight-Line Model of Untransformed Data with Uniform Random Errors

http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (2 of 7) [11/14/2003 5:50:42 PM] http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (3 of 7) [11/14/2003 5:50:42 PM]


4.4.5.3. Accounting for Errors with a Non-Normal Distribution 4.4.5.3. Accounting for Errors with a Non-Normal Distribution

sqrt(Pressure) vs 1/Pressure vs
Different Different
Tranformations of Tranformations of
Temperature Temperature

log(Pressure) vs Fit of Model to The normal probability plots and histograms below show the results of fitting straight-line models
Different Transformed to the three sets of transformed data. The results from the fit of the model to the data in its
Tranformations of Variables original units are also shown for comparison. From the four normal probability plots it looks like
Temperature the model fit using the ln-ln transformations produces the most normally distributed random
errors. Because the normal probability plot for the ln-ln data is so straight, it seems safe to
conclude that taking the ln of the pressure makes the distribution of the random errors
approximately normal. The histograms seem to confirm this since the histogram of the ln-ln data
looks reasonably bell-shaped while the other histograms are not particularly bell-shaped.
Therefore, assuming the other residual plots also indicated that a straight line model fit this
transformed data, the use of ln-ln tranformations appears to be appropriate for analysis of this
data.

Residuals from the Fit


to the Transformed
Variables

http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (4 of 7) [11/14/2003 5:50:42 PM] http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (5 of 7) [11/14/2003 5:50:42 PM]


4.4.5.3. Accounting for Errors with a Non-Normal Distribution 4.4.5.3. Accounting for Errors with a Non-Normal Distribution

Residuals from the Fit


to the Transformed
Variables

http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (6 of 7) [11/14/2003 5:50:42 PM] http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd453.htm (7 of 7) [11/14/2003 5:50:42 PM]


4.5. Use and Interpretation of Process Models 4.5.1. What types of predictions can I make using the model?

4. Process Modeling 4. Process Modeling


4.5. Use and Interpretation of Process Models

4.5. Use and Interpretation of Process


4.5.1. What types of predictions can I make
Models
using the model?
Overview of This section covers the interpretation and use of the models developed
Section 4.5 from the collection and analysis of data using the procedures discussed Detailed This section details some of the different types of predictions that can be
in Section 4.3 and Section 4.4. Three of the main uses of such models, Information made using the various process models whose development is discussed
estimation, prediction and calibration, are discussed in detail. on in Section 4.1 through Section 4.4. Computational formulas or
Optimization, another important use of this type of model, is primarily Prediction algorithms are given for each different type of estimation or prediction,
discussed in Chapter 5: Process Improvement. along with simulation examples showing its probabilisitic interpretation.
An introduction to the different types of estimation and prediction can
Contents of 1. What types of predictions can I make using the model? be found in Section 4.1.3.1. A brief description of estimation and
Section 4.5 prediction versus the other uses of process models is given in Section
1. How do I estimate the average response for a particular set
4.1.3.
of predictor variable values?
2. How can I predict the value and and estimate the Different 1. How do I estimate the average response for a particular set of
uncertainty of a single response? Types of predictor variable values?
2. How can I use my process model for calibration? Predictions
2. How can I predict the value and and estimate the uncertainty of a
1. Single-Use Calibration Intervals single response?
3. How can I optimize my process using the process model?

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd5.htm [11/14/2003 5:50:48 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd51.htm [11/14/2003 5:50:48 PM]


4.5.1.1. How do I estimate the average response for a particular set of predictor variable values? 4.5.1.1. How do I estimate the average response for a particular set of predictor variable values?

Uncertainty Knowing that the estimated average pressure is 263.21 at a temperature of 65, or that the
Needed estimated average torque on a polymer sample under particular conditions is 5.26, however, is not
enough information to make scientific or engineering decisions about the process. This is because
4. Process Modeling the pressure value of 263.21 is only an estimate of the average pressure at a temperature of 65.
4.5. Use and Interpretation of Process Models Because of the random error in the data, there is also random error in the estimated regression
4.5.1. What types of predictions can I make using the model? parameters, and in the values predicted using the model. To use the model correctly, therefore, the
uncertainty in the prediction must also be quantified. For example, if the safe operational pressure
of a particular type of gas tank that will be used at a temperature of 65 is 300, different
4.5.1.1. How do I estimate the average response for a engineering conclusions would be drawn from knowing the average actual pressure in the tank is
likely to lie somewhere in the range versus lying in the range .
particular set of predictor variable values?
Confidence In order to provide the necessary information with which to make engineering or scientific
Step 1: Plug Once a model that gives a good description of the process has been developed, it can be used for Intervals decisions, predictions from process models are usually given as intervals of plausible values that
Predictors estimation or prediction. To estimate the average response of the process, or, equivalently, the have a probabilistic interpretation. In particular, intervals that specify a range of values that will
Into value of the regression function, for any particular combination of predictor variable values, the contain the value of the regression function with a pre-specified probability are often used. These
Estimated values of the predictor variables are simply substituted in the estimated regression function itself. intervals are called confidence intervals. The probability with which the interval will capture the
Function These estimated function values are often called "predicted values" or "fitted values". true value of the regression function is called the confidence level, and is most often set by the
user to be 0.95, or 95% in percentage terms. Any value between 0% and 100% could be specified,
Pressure / For example, in the Pressure/Temperature process, which is well described by a straight-line though it would almost never make sense to consider values outside a range of about 80% to 99%.
Temperature model relating pressure ( ) to temperature ( ), the estimated regression function is found to be The higher the confidence level is set, the more likely the true value of the regression function is
Example to be contained in the interval. The trade-off for high confidence, however, is wide intervals. As
the sample size is increased, however, the average width of the intervals typically decreases for
any fixed confidence level. The confidence level of an interval is usually denoted symbolically
using the notation , with denoting a user-specified probability, called the significance
by substituting the estimated parameter values into the functional part of the model. Then to level, that the interval will not capture the true value of the regression function. The significance
estimate the average pressure at a temperature of 65, the predictor value of interest is subsituted in level is most often set to be 5% so that the associated confidence level will be 95%.
the estimated regression function, yielding an estimated pressure of 263.21.
Computing Confidence intervals are computed using the estimated standard deviations of the estimated
Confidence regression function values and a coverage factor that controls the confidence level of the interval
Intervals and accounts for the variation in the estimate of the residual standard deviation.

The standard deviations of the predicted values of the estimated regression function depend on the
standard deviation of the random errors in the data, the experimental design used to collect the
This estimation process works analogously for nonlinear models, LOESS models, and all other data and fit the model, and the values of the predictor variables used to obtain the predicted
types of functional process models. values. These standard deviations are not simple quantities that can be read off of the output
summarizing the fit of the model, but they can often be obtained from the software used to fit the
Polymer Based on the output from fitting the stretched exponential model in time ( ) and temperature ( model. This is the best option, if available, because there are a variety of numerical issues that can
Relaxation ), the estimated regression function for the polymer relaxation data is arise when the standard deviations are calculated directly using typical theoretical formulas.
Example Carefully written software should minimize the numerical problems encountered. If necessary,
however, matrix formulas that can be used to directly compute these values are given in texts such
as Neter, Wasserman, and Kutner.
.

Therefore, the estimated torque ( ) on a polymer sample after 60 minutes at a temperature of 40 is


5.26.

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (1 of 6) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (2 of 6) [11/14/2003 5:50:49 PM]


4.5.1.1. How do I estimate the average response for a particular set of predictor variable values? 4.5.1.1. How do I estimate the average response for a particular set of predictor variable values?

The coverage factor used to control the confidence level of the intervals depends on the Polymer
Lower 95% Upper 95%
distributional assumption about the errors and the amount of information available to estimate the Relaxation Confidence Confidence
residual standard deviation of the fit. For procedures that depend on the assumption that the Example Bound Bound
random errors have a normal distribution, the coverage factor is typically a cut-off value from the
Student's t distribution at the user's pre-specified confidence level and with the same number of
20 25 5.586307 0.028402 2.000298 0.056812 5.529 5.643
degrees of freedom as used to estimate the residual standard deviation in the fit of the model.
Tables of the t distribution (or functions in software) may be indexed by the confidence level ( 80 25 4.998012 0.012171 2.000298 0.024346 4.974 5.022
) or the significance level ( ). It is also important to note that since these are two-sided 20 50 6.960607 0.013711 2.000298 0.027427 6.933 6.988
intervals, half of the probability denoted by the significance level is usually assigned to each side
80 50 5.342600 0.010077 2.000298 0.020158 5.322 5.363
of the interval, so the proper entry in a t table or in a software function may also be labeled with
the value of , or , if the table or software is not exclusively designed for use with 20 75 7.521252 0.012054 2.000298 0.024112 7.497 7.545
two-sided tests. 80 75 6.220895 0.013307 2.000298 0.026618 6.194 6.248

The estimated values of the regression function, their standard deviations, and the coverage factor
Interpretation As mentioned above, confidence intervals capture the true value of the regression function with a
are combined using the formula
of Confidence user-specified probability, the confidence level, using the estimated regression function and the
Intervals associated estimate of the error. Simulation of many sets of data from a process model provides a
good way to obtain a detailed understanding of the probabilistic nature of these intervals. The
advantage of using simulation is that the true model parameters are known, which is never the
case for a real process. This allows direct comparison of how confidence intervals constructed
with denoting the estimated value of the regression function, is the coverage factor, from a limited amount of data relate to the true values that are being estimated.
indexed by a function of the significance level and by its degrees of freedom, and is the
standard deviation of . Some software may provide the total uncertainty for the confidence The plot below shows 95% confidence intervals computed using 50 independently generated data
interval given by the equation above, or may provide the lower and upper confidence bounds by sets that follow the same model as the data in the Pressure/Temperature example. Random errors
adding and subtracting the total uncertainty from the estimate of the average response. This can from a normal distribution with a mean of zero and a known standard deviation are added to each
save some computational effort when making predictions, if available. Since there are many types set of true temperatures and true pressures that lie on a perfect straight line to obtain the simulated
of predictions that might be offered in a software package, however, it is a good idea to test the data. Then each data set is used to compute a confidence interval for the average pressure at a
software on an example for which confidence limits are already available to make sure that the temperature of 65. The dashed reference line marks the true value of the average pressure at a
software is computing the expected type of intervals. temperature of 65.

Confidence Computing confidence intervals for the average pressure in the Pressure/Temperature example, Confidence
Intervals for for temperatures of 25, 45, and 65, and for the average torque on specimens from the polymer Intervals
the Example relaxation example at different times and temperatures gives the results listed in the tables below. Computed
Applications from 50 Sets
Note: the number of significant digits shown in the tables below is larger than would normally be
of Simulated
reported. However, as many significant digits as possible should be carried throughout all
Data
calculations and results should only be rounded for final reporting. If reported numbers may be
used in further calculations, they should not be rounded even when finally reported. A useful rule
for rounding final results that will not be used for further computation is to round all of the
reported values to one or two significant digits in the total uncertainty, . This is the
convention for rounding that has been used in the tables below.

Pressure /
Lower 95% Upper 95%
Temperature Confidence Confidence
Example Bound Bound

25 106.0025 1.1976162 2.024394 2.424447 103.6 108.4


45 184.6053 0.6803245 2.024394 1.377245 183.2 186.0
65 263.2081 1.2441620 2.024394 2.518674 260.7 265.7

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (3 of 6) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (4 of 6) [11/14/2003 5:50:49 PM]


4.5.1.1. How do I estimate the average response for a particular set of predictor variable values? 4.5.1.1. How do I estimate the average response for a particular set of predictor variable values?

Confidence From the plot it is easy to see that not all of the intervals contain the true value of the average
Level pressure. Data sets 16, 26, and 39 all produced intervals that did not cover the true value of the
Specifies average pressure at a temperature of 65. Sometimes the interval may fail to cover the true value
Long-Run because the estimated pressure is unusually high or low because of the random errors in the data
Interval set. In other cases, the variability in the data may be underestimated, leading to an interval that is
Coverage too short to cover the true value. However, for 47 out of 50, or approximately 95% of the data
sets, the confidence intervals did cover the true average pressure. When the number of data sets
was increased to 5000, confidence intervals computed for 4723, or 94.46%, of the data sets
covered the true average pressure. Finally, when the number of data sets was increased to 10000,
95.12% of the confidence intervals computed covered the true average pressure. Thus, the
simulation shows that although any particular confidence interval might not cover its associated
true value, in repeated experiments this method of constructing intervals produces intervals that
cover the true value at the rate specified by the user as the confidence level. Unfortunately, when
dealing with real processes with unknown parameters, it is impossible to know whether or not a
particular confidence interval does contain the true value. It is nice to know that the error rate can
be controlled, however, and can be set so that it is far more likely than not that each interval
produced does contain the true value.

Interpretation To summarize the interpretation of the probabilistic nature of confidence intervals in words: in
Summary independent, repeated experiments, of the intervals will cover the true values,
given that the assumptions needed for the construction of the intervals hold.

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (5 of 6) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd511.htm (6 of 6) [11/14/2003 5:50:49 PM]


4.5.1.2. How can I predict the value and and estimate the uncertainty of a single response? 4.5.1.2. How can I predict the value and and estimate the uncertainty of a single response?

Standard The estimate of the standard deviation of the predicted value, , is obtained as described earlier.
Deviation of Because the residual standard deviation describes the random variation in each individual
Prediction
measurement or observation from the process, , the estimate of the residual standard deviation
4. Process Modeling obtained when fitting the model to the data, is used to account for the extra uncertainty needed to
4.5. Use and Interpretation of Process Models predict a measurement value. Since the new observation is independent of the data used to fit the
4.5.1. What types of predictions can I make using the model? model, the estimates of the two standard deviations are then combined by "root-sum-of-squares" or "in
quadrature", according to standard formulas for computing variances, to obtain the standard deviation
of the prediction of the new measurement, . The formula for is
4.5.1.2. How can I predict the value and and estimate the
uncertainty of a single response?
.
A Different In addition to estimating the average value of the response variable for a given combination of preditor
Type of values, as discussed on the previous page, it is also possible to make predictions of the values of new
Prediction measurements or observations from a process. Unlike the true average response, a new measurement is Coverage Because both and are mathematically nothing more than different scalings of , and coverage
often actually observable in the future. However, there are a variety of different situations in which a Factor and
prediction of a measurement value may be more desirable than actually making an observation from factors from the t distribution only depend on the amount of data available for estimating , the
Prediction
the process. coverage factors are the same for confidence and prediction intervals. Combining the coverage factor
Interval
and the standard deviation of the prediction, the formula for constructing prediction intervals is given
Formula
Example For example, suppose that a concrete supplier needs to supply concrete of a specified measured by
strength for a particular contract, but knows that strength varies systematically with the ambient
temperature when the concrete is poured. In order to be sure that the concrete will meet the
.
specification, prior to pouring, samples from the batch of raw materials can be mixed, poured, and
measured in advance, and the relationship between temperature and strength can be modeled. Then
predictions of the strength across the range of possible field temperatures can be used to ensure the As with the computation of confidence intervals, some software may provide the total uncertainty for
product is likely to meet the specification. Later, after the concrete is poured (and the temperature is the prediction interval given the equation above, or may provide the lower and upper prediction
recorded), the accuracy of the prediction can be verified. bounds. As suggested before, however, it is a good idea to test the software on an example for which
prediction limits are already available to make sure that the software is computing the expected type of
The mechanics of predicting a new measurement value associated with a combination of predictor intervals.
variable values are similar to the steps used in the estimation of the average response value. In fact, the
actual estimate of the new measured value is obtained by evaluating the estimated regression function Prediction Computing prediction intervals for the measured pressure in the Pressure/Temperature example, at
at the relevant predictor variable values, exactly as is done for the average response. The estimates are Intervals for temperatures of 25, 45, and 65, and for the measured torque on specimens from the polymer relaxation
the same for these two quantities because, assuming the model fits the data, the only difference the Example example at different times and temperatures, gives the results listed in the tables below. Note: the
between the average response and a particular measured response is a random error. Because the error Applications number of significant digits shown is larger than would normally be reported. However, as many
is random, and has a mean of zero, there is no additional information in the model that can be used to significant digits as possible should be carried throughout all calculations and results should only be
predict the particular response beyond the information that is available when predicting the average rounded for final reporting. If reported numbers may be used in further calculations, then they should
response. not be rounded even when finally reported. A useful rule for rounding final results that will not be
used for further computation is to round all of the reported values to one or two significant digits in the
Uncertainties As when estimating the average response, a probabilistic interval is used when predicting a new total uncertainty, . This is the convention for rounding that has been used in the tables
Do Differ measurement to provide the information needed to make engineering or scientific conclusions. below.
However, even though the estimates of the average response and particular response values are the
same, the uncertainties of the two estimates do differ. This is because the uncertainty of the measured
Pressure /
response must include both the uncertainty of the estimated average response and the uncertainty of Lower 95% Upper 95%
Temperature Prediction Prediction
the new measurement that could conceptually be observed. This uncertainty must be included if the
Example Bound Bound
interval that will be used to summarize the prediction result is to contain the new measurement with
the specified confidence. To help distinguish the two types of predictions, the probabilistic intervals
for estimation of a new measurement value are called prediction intervals rather than confidence 25 106.0025 4.299099 1.1976162 4.462795 2.024394 9.034455 97.0 115.0
intervals. 45 184.6053 4.299099 0.6803245 4.352596 2.024394 8.811369 175.8 193.5
65 263.2081 4.299099 1.2441620 4.475510 2.024394 9.060197 254.1 272.3

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd512.htm (1 of 5) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd512.htm (2 of 5) [11/14/2003 5:50:49 PM]


4.5.1.2. How can I predict the value and and estimate the uncertainty of a single response? 4.5.1.2. How can I predict the value and and estimate the uncertainty of a single response?

Polymer
Lower Upper
Relaxation 95% 95%
Example Prediction Prediction
Bound Bound

20 25 5.586307 0.04341221 0.02840153 0.05187742 2.000298 0.10377030 5.48 5.69


80 25 4.998012 0.04341221 0.01217109 0.04508609 2.000298 0.09018560 4.91 5.09
20 50 6.960607 0.04341221 0.01371149 0.04552609 2.000298 0.09106573 6.87 7.05
80 50 5.342600 0.04341221 0.01007761 0.04456656 2.000298 0.08914639 5.25 5.43
20 75 7.521252 0.04341221 0.01205401 0.04505462 2.000298 0.09012266 7.43 7.61
80 75 6.220895 0.04341221 0.01330727 0.04540598 2.000298 0.09082549 6.13 6.31

Interpretation Simulation of many sets of data from a process model provides a good way to obtain a detailed
of Prediction understanding of the probabilistic nature of the prediction intervals. The main advantage of using
Intervals simulation is that it allows direct comparison of how prediction intervals constructed from a limited
amount of data relate to the measured values that are being estimated.

The plot below shows 95% prediction intervals computed from 50 independently generated data sets
that follow the same model as the data in the Pressure/Temperature example. Random errors from the
normal distribution with a mean of zero and a known standard deviation are added to each set of true
temperatures and true pressures that lie on a perfect straight line to produce the simulated data. Then
each data set is used to compute a prediction interval for a newly observed pressure at a temperature of
65. The newly observed measurements, observed after making the prediction, are noted with an "X"
for each data set.
Confidence From the plot it is easy to see that not all of the intervals contain the pressure values observed after the
Level prediction was made. Data set 4 produced an interval that did not capture the newly observed pressure
Prediction Specifies measurement at a temperature of 65. However, for 49 out of 50, or not much over 95% of the data sets,
Intervals Long-Run the prediction intervals did capture the measured pressure. When the number of data sets was
Computed Interval increased to 5000, prediction intervals computed for 4734, or 94.68%, of the data sets covered the new
from 50 Sets Coverage measured values. Finally, when the number of data sets was increased to 10000, 94.92% of the
of Simulated confidence intervals computed covered the true average pressure. Thus, the simulation shows that
Data although any particular prediction interval might not cover its associated new measurement, in
repeated experiments this method produces intervals that contain the new measurements at the rate
specified by the user as the confidence level.

Comparison It is also interesting to compare these results to the analogous results for confidence intervals. Clearly
with the most striking difference between the two plots is in the sizes of the uncertainties. The uncertainties
Confidence for the prediction intervals are much larger because they must include the standard deviation of a
Intervals single new measurement, as well as the standard deviation of the estimated average response value.
The standard deviation of the estimated average response value is lower because a lot of the random
error that is in each measurement cancels out when the data are used to estimate the unknown
parameters in the model. In fact, if as the sample size increases, the limit on the width of a confidence
interval approaches zero while the limit on the width of the prediction interval as the sample size
increases approaches . Understanding the different types of intervals and the bounds on
interval width can be important when planning an experiment that requires a result to have no more
than a specified level of uncertainty to have engineering value.

Interpretation To summarize the interpretation of the probabilistic nature of confidence intervals in words: in
Summary independent, repeated experiments, of the intervals will be expected cover their true
values, given that the assumptions needed for the construction of the intervals hold.

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd512.htm (3 of 5) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd512.htm (4 of 5) [11/14/2003 5:50:49 PM]


4.5.1.2. How can I predict the value and and estimate the uncertainty of a single response? 4.5.2. How can I use my process model for calibration?

4. Process Modeling
4.5. Use and Interpretation of Process Models

4.5.2. How can I use my process model for


calibration?
Detailed This section details some of the different types of calibrations that can
Calibration be made using the various process models whose development was
Information discussed in previous sections. Computational formulas or algorithms
are given for each different type of calibration, along with simulation
examples showing its probabilistic interpretation. An introduction to
calibration can be found in Section 4.1.3.2. A brief comparison of
calibration versus the other uses of process models is given in Section
4.1.3. Additional information on calibration is available in Section 3 of
Chapter 2: Measurement Process Characterization.

Calibration
1. Single-Use Calibration Intervals
Procedures

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd512.htm (5 of 5) [11/14/2003 5:50:49 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd52.htm [11/14/2003 5:50:50 PM]


4.5.2.1. Single-Use Calibration Intervals 4.5.2.1. Single-Use Calibration Intervals

4. Process Modeling
Although this is a simple process for the straight-line model, note that even for this simple
4.5. Use and Interpretation of Process Models
regression function the estimate of the temperature is not linear in the parameters of the model.
4.5.2. How can I use my process model for calibration?

Numerical To set this up to be solved numerically, the equation simply has to be set up in the form
4.5.2.1. Single-Use Calibration Intervals Approach

Calibration As mentioned in Section 1.3, the goal of calibration (also called inverse prediction by some
authors) is to quantitatively convert measurements made on one of two measurement scales to the and then the function of temperature ( ) defined by the left-hand side of the equation can be used
other measurement scale. Typically the two scales are not of equal importance, so the conversion as the argument in an arbitrary root-finding function. It is typically necessary to provide the
occurs in only one direction. The model fit to the data that relates the two measurement scales and root-finding software with endpoints on opposite sides of the root. These can be obtained from a
a new measurement made on the secondary scale provide the means for the conversion. The plot of the calibration data and usually do not need to be very precise. In fact, it is often adequate
results from the fit of the model also allow for computation of the associated uncertainty in the to simply set the endpoints equal to the range of the calibration data, since calibration functions
estimate of the true value on the primary measurement scale. Just as for prediction, estimates of tend to be increasing or decreasing functions without local minima or maxima in the range of the
both the value on the primary scale and its uncertainty are needed in order to make sound data. For the pressure/temperature data, the endpoints used in the root-finding software could
engineering or scientific decisions or conclusions. Approximate confidence intervals for the true even be set to values like -5 and 100, broader than the range of the data. This choice of end points
value on the primary measurement scale are typically used to summarize the results would even allow for extrapolation if new pressure values outside the range of the original
probabilistically. An example, which will help make the calibration process more concrete, is calibration data were observed.
given in Section 4.1.3.2. using thermocouple calibration data.
Thermocouple For the more realistic thermocouple calibration example, which is well fit by a LOESS model that
Calibration Like prediction estimates, calibration estimates can be computed relatively easily using the Calibration does not require an explicit functional form, the numerical approach must be used to obtain
Estimates regression equation. They are computed by setting a newly observed value of the response Example calibration estimates. The LOESS model is set up identically to the straight-line model for the
variable, , which does not have an accompanying value of the predictor variable, equal to the numerical solution, using the estimated regression function from the software used to fit the
estimated regression function and solving for the unknown value of the predictor variable. model.
Depending on the complexity of the regression function, this may be done analytically, but
sometimes numerical methods are required. Fortunatel, the numerical methods needed are not
complicated, and once implemented are often easier to use than analytical methods, even for
simple regression functions.
Again the function of temperature ( ) on the left-hand side of the equation would be used as the
Pressure / main argument in an arbitrary root-finding function. If for some reason were not
Temperature
In the Pressure/Temperature example, pressure measurements could be used to measure the available in the software used to fit the model, it could always be created manually since LOESS
Example
temperature of the system by observing a new pressure value, setting it equal to the estimated can ultimately be reduced to a series of weighted least squares fits. Based on the plot of the
regression function, thermocouple data, endpoints of 100 and 600 would probably work well for all calibration
estimates. Wider values for the endpoints are not useful here since extrapolations do not make
much sense for this type of local model.

Dataplot Since the verbal descriptions of these numerical techniques can be hard to follow, these ideas may
and solving for the temperature. If a pressure of 178 were measured, the associated temperature Code become clearer by looking at the actual Dataplot computer code for a quadratic calibration, which
would be estimated to be about 43. can be found in the Load Cell Calibration case study. If you have downloaded Dataplot and
installed it, you can run the computations yourself.

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd521.htm (1 of 5) [11/14/2003 5:50:50 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd521.htm (2 of 5) [11/14/2003 5:50:50 PM]


4.5.2.1. Single-Use Calibration Intervals 4.5.2.1. Single-Use Calibration Intervals

Calibration As in prediction, the data used to fit the process model can also be used to determine the Interpretation Although calibration confidence intervals have some unique features, viewed as confidence
Uncertainties uncertainty of the calibration. Both the variation in the average response and in the new of Calibration intervals, their interpretation is essentially analogous to that of confidence intervals for the true
observation of the response value need to be accounted for. This is similar to the uncertainty for Intervals average response. Namely, in repeated calibration experiments, when one calibration is made for
the prediction of a new measurement. In fact, approximate calibration confidence intervals are each set of data used to fit a calibration function and each single new observation of the response,
actually computed by solving for the predictor variable value in the formulas for prediction
then approximately of the intervals computed as described above will capture
interval end points [Graybill (1976)]. Because , the standard deviation of the prediction of a
the true value of the predictor variable, which is a measurement on the primary measurement
measured response, is a function of the predictor variable, like the regression function itself, the scale.
inversion of the prediction interval endpoints is usually messy. However, like the inversion of the
regression function to obtain estimates of the predictor variable, it can be easily solved The plot below shows 95% confidence intervals computed using 50 independently generated data
numerically. sets that follow the same model as the data in the Thermocouple calibration example. Random
errors from a normal distribution with a mean of zero and a known standard deviation are added
The equations to be solved to obtain approximate lower and upper calibration confidence limits, to each set of true temperatures and true voltages that follow a model that can be
are, respectively, well-approximated using LOESS to produce the simulated data. Then each data set and a newly
observed voltage measurement are used to compute a confidence interval for the true temperature
that produced the observed voltage. The dashed reference line marks the true temperature under
, which the thermocouple measurements were made. It is easy to see that most of the intervals do
contain the true value. In 47 out of 50 data sets, or approximately 95%, the confidence intervals
and covered the true temperature. When the number of data sets was increased to 5000, the
confidence intervals computed for 4657, or 93.14%, of the data sets covered the true temperature.
Finally, when the number of data sets was increased to 10000, 93.53% of the confidence intervals
computed covered the true temperature. While these intervals do not exactly attain their stated
, coverage, as the confidence intervals for the average response do, the coverage is reasonably
close to the specified level and is probably adequate from a practical point of view.
with denoting the estimated standard deviation of the prediction of a new measurement.
and are both denoted as functions of the predictor variable, , here to make it clear Confidence
Intervals
that those terms must be written as functions of the unknown value of the predictor variable. The Computed
left-hand sides of the two equations above are used as arguments in the root-finding software, just from 50 Sets
of Simulated
as the expression is used when computing the estimate of the predictor variable. Data

Confidence Confidence intervals for the true predictor variable values associated with the observed values of
Intervals for pressure (178) and voltage (1522) are given in the table below for the Pressure/Temperature
the Example example and the Thermocouple Calibration example, respectively. The approximate confidence
Applications limits and estimated values of the predictor variables were obtained numerically in both cases.

Estimated
Lower 95% Predictor Upper 95%
Confidence Variable Confidence
Example Bound Value Bound

Pressure/Temperature 178 41.07564 43.31925 45.56146


Thermocouple Calibration 1522 553.0026 553.0187 553.0349

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd521.htm (3 of 5) [11/14/2003 5:50:50 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd521.htm (4 of 5) [11/14/2003 5:50:50 PM]


4.5.2.1. Single-Use Calibration Intervals 4.5.3. How can I optimize my process using the process model?

4. Process Modeling
4.5. Use and Interpretation of Process Models

4.5.3. How can I optimize my process using


the process model?
Detailed Process optimization using models fit to data collected using response
Information surface designs is primarily covered in Section 5.5.3 of Chapter 5:
on Process Process Improvement. In that section detailed information is given on
Optimization how to determine the correct process inputs to hit a target output value
or to maximize or minimize process output. Some background on the
use of process models for optimization can be found in Section 4.1.3.3
of this chapter, however, and information on the basic analysis of data
from optimization experiments is covered along with that of other types
of models in Section 4.1 through Section 4.4 of this chapter.

Contents of 1. Optimizing a Process


Chapter 5 1. Single response case
Section 5.5.3.
1. Path of steepest ascent
2. Confidence region for search path
3. Choosing the step length
4. Optimization when there is adequate quadratic fit
5. Effect of sampling error on optimal solution
6. Optimization subject to experimental region
constraints
2. Multiple response case
1. Path of steepest ascent
2. Desirability function approach
3. Mathematical programming approach

http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd521.htm (5 of 5) [11/14/2003 5:50:50 PM] http://www.itl.nist.gov/div898/handbook/pmd/section5/pmd53.htm [11/14/2003 5:50:50 PM]


4.6. Case Studies in Process Modeling 4.6. Case Studies in Process Modeling

3. Ultrasonic Reference Block Study


1. Background and Data
2. Initial Non-Linear Fit

4. Process Modeling
3. Transformations to Improve Fit
4. Weighting to Improve Fit
5. Compare the Fits
4.6. Case Studies in Process Modeling
6. Work This Example Yourself
Detailed, The general points of the first five sections are illustrated in this section 4. Thermal Expansion of Copper Case Study
Realistic using data from physical science and engineering applications. Each 1. Background and Data
Examples example is presented step-by-step in the text and is often cross-linked
2. Exact Rational Models
with the relevant sections of the chapter describing the analysis in
general. Each analysis can also be repeated using a worksheet linked to 3. Initial Plot of Data
the appropriate Dataplot macros. The worksheet is also linked to the 4. Fit Quadratic/Quadratic Model
step-by-step analysis presented in the text for easy reference.
5. Fit Cubic/Cubic Model
Contents: 1. Load Cell Calibration 6. Work This Example Yourself
Section 6
1. Background & Data
2. Selection of Initial Model
3. Model Fitting - Initial Model
4. Graphical Residual Analysis - Initial Model
5. Interpretation of Numerical Output - Initial Model
6. Model Refinement
7. Model Fitting - Model #2
8. Graphical Residual Analysis - Model #2
9. Interpretation of Numerical Output - Model #2
10. Use of the Model for Calibration
11. Work this Example Yourself
2. Alaska Pipeline Ultrasonic Calibration
1. Background and Data
2. Check for Batch Effect
3. Initial Linear Fit
4. Transformations to Improve Fit and Equalize Variances
5. Weighting to Improve Fit
6. Compare the Fits
7. Work This Example Yourself

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd6.htm (1 of 2) [11/14/2003 5:50:51 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd6.htm (2 of 2) [11/14/2003 5:50:51 PM]


4.6.1. Load Cell Calibration 4.6.1.1. Background & Data

4. Process Modeling 4. Process Modeling


4.6. Case Studies in Process Modeling 4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1. Load Cell Calibration


4.6.1.1. Background & Data
Quadratic This example illustrates the construction of a linear regression model for
Calibration load cell data that relates a known load applied to a load cell to the Description The data collected in the calibration experiment consisted of a known
deflection of the cell. The model is then used to calibrate future cell of Data load, applied to the load cell, and the corresponding deflection of the
readings associated with loads of unknown magnitude. Collection cell from its nominal position. Forty measurements were made over a
range of loads from 150,000 to 3,000,000 units. The data were collected
1. Background & Data in two sets in order of increasing load. The systematic run order makes
it difficult to determine whether or not there was any drift in the load
2. Selection of Initial Model
cell or measuring equipment over time. Assuming there is no drift,
3. Model Fitting - Initial Model however, the experiment should provide a good description of the
4. Graphical Residual Analysis - Initial Model relationship between the load applied to the cell and its response.
5. Interpretation of Numerical Output - Initial Model
Resulting
6. Model Refinement Data Deflection Load
7. Model Fitting - Model #2 -------------------------
0.11019 150000
8. Graphical Residual Analysis - Model #2 0.21956 300000
9. Interpretation of Numerical Output - Model #2 0.32949 450000
10. Use of the Model for Calibration 0.43899 600000
0.54803 750000
11. Work This Example Yourself 0.65694 900000
0.76562 1050000
0.87487 1200000
0.98292 1350000
1.09146 1500000
1.20001 1650000
1.30822 1800000
1.41599 1950000
1.52399 2100000
1.63194 2250000
1.73947 2400000
1.84646 2550000
1.95392 2700000
2.06128 2850000
2.16844 3000000
0.11052 150000

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd61.htm [11/14/2003 5:50:51 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd611.htm (1 of 2) [11/14/2003 5:50:51 PM]


4.6.1.1. Background & Data 4.6.1.2. Selection of Initial Model

0.22018 300000
0.32939 450000
0.43886 600000
0.54798 750000
0.65739 900000 4. Process Modeling
0.76596 1050000 4.6. Case Studies in Process Modeling
0.87474 1200000 4.6.1. Load Cell Calibration
0.98300 1350000
1.09150
1.20004
1500000
1650000
4.6.1.2. Selection of Initial Model
1.30818 1800000
Start The first step in analyzing the data is to select a candidate model. In the case of a measurement
1.41613 1950000 Simple system like this one, a fairly simple function should describe the relationship between the load
1.52408 2100000 and the response of the load cell. One of the hallmarks of an effective measurement system is a
1.63159 2250000 straightforward link between the instrumental response and the property being quantified.
1.73965 2400000
1.84696 2550000 Plot the Plotting the data indicates that the hypothesized, simple relationship between load and deflection
1.95445 2700000 Data is reasonable. The plot below shows the data. It indicates that a straight-line model is likely to fit
2.06177 2850000 the data. It does not indicate any other problems, such as presence of outliers or nonconstant
2.16829 3000000 standard deviation of the response.

Initial
Model:
Straight
Line

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd611.htm (2 of 2) [11/14/2003 5:50:51 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd612.htm (1 of 2) [11/14/2003 5:50:51 PM]


4.6.1.2. Selection of Initial Model 4.6.1.3. Model Fitting - Initial Model

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1.3. Model Fitting - Initial Model


Least Using software for computing least squares parameter estimates, the straight-line
Squares model,
Estimation

is easily fit to the data. The computer output from this process is shown below.
Before trying to interpret all of the numerical output, however, it is critical to check
that the assumptions underlying the parameter estimation are met reasonably well.
The next two sections show how the underlying assumptions about the data and
model are checked using graphical and numerical methods.

Dataplot
Output LEAST SQUARES POLYNOMIAL FIT
SAMPLE SIZE N = 40
DEGREE = 1
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.2147264895D-03
REPLICATION DEGREES OF FREEDOM = 20
NUMBER OF DISTINCT SUBSETS = 20

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 0.614969E-02 (0.7132E-03) 8.6
2 A1 0.722103E-06 (0.3969E-09) 0.18E+04

RESIDUAL STANDARD DEVIATION = 0.0021712694


RESIDUAL DEGREES OF FREEDOM = 38
REPLICATION STANDARD DEVIATION = 0.0002147265
REPLICATION DEGREES OF FREEDOM = 20
LACK OF FIT F RATIO = 214.7464 = THE 100.0000% POINT OF
THE F DISTRIBUTION WITH 18 AND 20 DEGREES OF FREEDOM

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd612.htm (2 of 2) [11/14/2003 5:50:51 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd613.htm (1 of 2) [11/14/2003 5:50:51 PM]


4.6.1.3. Model Fitting - Initial Model 4.6.1.4. Graphical Residual Analysis - Initial Model

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1.4. Graphical Residual Analysis - Initial Model


Potentially After fitting a straight line to the data, many people like to check the quality of the fit with a plot
Misleading of the data overlaid with the estimated regression function. The plot below shows this for the load
Plot cell data. Based on this plot, there is no clear evidence of any deficiencies in the model.

Avoiding the This type of overlaid plot is useful for showing the relationship between the data and the
Trap predicted values from the regression function; however, it can obscure important detail about the
model. Plots of the residuals, on the other hand, show this detail well, and should be used to
check the quality of the fit. Graphical analysis of the residuals is the single most important
technique for determining the need for model refinement or for verifying that the underlying
assumptions of the analysis are met.

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd613.htm (2 of 2) [11/14/2003 5:50:51 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd614.htm (1 of 4) [11/14/2003 5:50:52 PM]


4.6.1.4. Graphical Residual Analysis - Initial Model 4.6.1.4. Graphical Residual Analysis - Initial Model

Residual plots of interest for this model include: Similar


1. residuals versus the predictor variable Residual
Structure
2. residuals versus the regression function values
3. residual run order plot
4. residual lag plot
5. histogram of the residuals
6. normal probability plot

A plot of the residuals versus load is shown below.

Hidden
Structure
Revealed

Additional Further residual diagnostic plots are shown below. The plots include a run order plot, a lag plot, a
Diagnostic histogram, and a normal probability plot. Shown in a two-by-two array like this, these plots
Plots comprise a 4-plot of the data that is very useful for checking the assumptions underlying the
model.

Dataplot
4plot

Scale of Plot The structure in the relationship between the residuals and the load clearly indicates that the
Key functional part of the model is misspecified. The ability of the residual plot to clearly show this
problem, while the plot of the data did not show it, is due to the difference in scale between the
plots. The curvature in the response is much smaller than the linear trend. Therefore the curvature
is hidden when the plot is viewed in the scale of the data. When the linear trend is subtracted,
however, as it is in the residual plot, the curvature stands out.

The plot of the residuals versus the predicted deflection values shows essentially the same
structure as the last plot of the residuals versus load. For more complicated models, however, this
plot can reveal problems that are not clear from plots of the residuals versus the predictor
variables.

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd614.htm (2 of 4) [11/14/2003 5:50:52 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd614.htm (3 of 4) [11/14/2003 5:50:52 PM]


4.6.1.4. Graphical Residual Analysis - Initial Model 4.6.1.5. Interpretation of Numerical Output - Initial Model

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1.5. Interpretation of Numerical Output - Initial


Model
Lack-of-Fit The fact that the residual plots clearly indicate a problem with the specification of
Statistic the function describing the systematic variation in the data means that there is little
Interpretable point in looking at most of the numerical results from the fit. However, since there
are replicate measurements in the data, the lack-of-fit test can also be used as part of
the model validation. The numerical results of the fit from Dataplot are list below.

Dataplot
Output LEAST SQUARES POLYNOMIAL FIT
SAMPLE SIZE N = 40
DEGREE = 1
REPLICATION CASE
Interpretation The structure evident in these residual plots also indicates potential problems with different REPLICATION STANDARD DEVIATION = 0.2147264895D-03
of Plots aspects of the model. Under ideal circumstances, the plots in the top row would not show any REPLICATION DEGREES OF FREEDOM = 20
systematic structure in the residuals. The histogram would have a symmetric, bell shape, and the NUMBER OF DISTINCT SUBSETS = 20
normal probability plot would be a straight line. Taken at face value, the structure seen here
indicates a time trend in the data, autocorrelation of the measurements, and a non-normal
distribution of the residuals. PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 A0 0.614969E-02 (0.7132E-03) 8.6
It is likely, however, that these plots will look fine once the function describing the systematic
2 A1 0.722103E-06 (0.3969E-09) 0.18E+04
relationship between load and deflection has been corrected. Problems with one aspect of a
regression model often show up in more than one type of residual plot. Thus there is currently no
clear evidence from the 4-plot that the distribution of the residuals from an appropriate model RESIDUAL STANDARD DEVIATION = 0.0021712694
would be non-normal, or that there would be autocorrelation in the process, etc. If the 4-plot still RESIDUAL DEGREES OF FREEDOM = 38
indicates these problems after the functional part of the model has been fixed, however, the REPLICATION STANDARD DEVIATION = 0.0002147265
possibility that the problems are real would need to be addressed. REPLICATION DEGREES OF FREEDOM = 20
LACK OF FIT F RATIO = 214.7464 = THE 100.0000% POINT OF
THE F DISTRIBUTION WITH 18 AND 20 DEGREES OF FREEDOM

Function The lack-of-fit test statistic is 214.7534, which also clearly indicates that the
Incorrect functional part of the model is not right. The 95% cut-off point for the test is 2.15.
Any value greater than that indicates that the hypothesis of a straight-line model for
this data should be rejected.

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd614.htm (4 of 4) [11/14/2003 5:50:52 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd615.htm (1 of 2) [11/14/2003 5:50:52 PM]


4.6.1.5. Interpretation of Numerical Output - Initial Model 4.6.1.6. Model Refinement

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1.6. Model Refinement


After ruling out the straight line model for these data, the next task is to decide what function
would better describe the systematic variation in the data.

Reviewing the plots of the residuals versus all potential predictor variables can offer insight into
selection of a new model, just as a plot of the data can aid in selection of an initial model.
Iterating through a series of models selected in this way will often lead to a function that
describes the data well.

Residual
Structure
Indicates
Quadratic

The horseshoe-shaped structure in the plot of the residuals versus load suggests that a quadratic
polynomial might fit the data well. Since that is also the simplest polynomial model, after a
straight line, it is the next function to consider.

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd615.htm (2 of 2) [11/14/2003 5:50:52 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd616.htm (1 of 2) [11/14/2003 5:50:52 PM]


4.6.1.6. Model Refinement 4.6.1.7. Model Fitting - Model #2

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.1. Load Cell Calibration

4.6.1.7. Model Fitting - Model #2


New Based on the residual plots, the function used to describe the data should be the
Function quadratic polynomial:

The computer output from this process is shown below. As for the straight-line
model, however, it is important to check that the assumptions underlying the
parameter estimation are met before trying to interpret the numerical output. The
steps used to complete the graphical residual analysis are essentially identical to
those used for the previous model.

Dataplot
Output
for
Quadratic
Fit
LEAST SQUARES POLYNOMIAL FIT
SAMPLE SIZE N = 40
DEGREE = 2
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.2147264895D-03
REPLICATION DEGREES OF FREEDOM = 20
NUMBER OF DISTINCT SUBSETS = 20

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 0.673618E-03 (0.1079E-03) 6.2
2 A1 0.732059E-06 (0.1578E-09) 0.46E+04
3 A2 -0.316081E-14 (0.4867E-16) -65.

RESIDUAL STANDARD DEVIATION = 0.0002051768


RESIDUAL DEGREES OF FREEDOM = 37
REPLICATION STANDARD DEVIATION = 0.0002147265
REPLICATION DEGREES OF FREEDOM = 20
LACK OF FIT F RATIO = 0.8107 = THE 33.3818% POINT OF
THE F DISTRIBUTION WITH 17 AND 20 DEGREES OF FREEDOM


4.6.1.8. Graphical Residual Analysis - Model #2


The data with a quadratic estimated regression function and the residual plots are shown below.

Compare
to Initial
Model

This plot is almost identical to the analogous plot for the straight-line model, again illustrating the
lack of detail in the plot due to the scale. In this case, however, the residual plots will show that
the model does fit well.


Plot
Indicates
Model
Fits Well
The residuals, randomly scattered around zero, indicate that the quadratic is a good function to describe these data. There is also no indication of non-constant variability over the range of loads.

Plot Also
Indicates
Model
OK
This plot also looks good. There is no evidence of changes in variability across the range of deflection.

No
Problems
Indicated


4.6.1.9. Interpretation of Numerical Output - Model #2
Quadratic The numerical results from the fit are shown below. For the quadratic model, the
Confirmed lack-of-fit test statistic is 0.8107. The fact that the test statistic is approximately one
indicates there is no evidence to support a claim that the functional part of the model
does not fit the data. The test statistic would have had to have been greater than 2.17
to reject the hypothesis that the quadratic model is correct.

Dataplot
Output LEAST SQUARES POLYNOMIAL FIT
SAMPLE SIZE N = 40
DEGREE = 2
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.2147264895D-03
All of these residual plots have become satisfactory simply by changing the functional form of the model. There is no evidence in the run order plot of any time dependence in the measurement process, and the lag plot suggests that the errors are independent. The histogram and normal probability plot suggest that the random errors affecting the measurement process are normally distributed.

REPLICATION DEGREES OF FREEDOM = 20
NUMBER OF DISTINCT SUBSETS = 20
PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 A0 0.673618E-03 (0.1079E-03) 6.2
2 A1 0.732059E-06 (0.1578E-09) 0.46E+04
3 A2 -0.316081E-14 (0.4867E-16) -65.

RESIDUAL STANDARD DEVIATION = 0.0002051768


RESIDUAL DEGREES OF FREEDOM = 37
REPLICATION STANDARD DEVIATION = 0.0002147265
REPLICATION DEGREES OF FREEDOM = 20
LACK OF FIT F RATIO = 0.8107 = THE 33.3818% POINT OF
THE F DISTRIBUTION WITH 17 AND 20 DEGREES OF FREEDOM

Regression
Function
From the numerical output, we can also find the regression function that will be used for the calibration. The function, with its estimated parameters, is

    Deflection = 0.673618E-03 + 0.732059E-06*Load - 0.316081E-14*Load^2
All of the parameters are significantly different from zero, as indicated by the associated t statistics. The 97.5% cut-off for the t distribution with 37 degrees of freedom is 2.026. Since all of the t values are well above this cut-off, we can safely conclude that none of the estimated parameters is equal to zero.

4.6.1.10. Use of the Model for Calibration

Using the
Model
Now that a good model has been found for these data, it can be used to estimate load values for new measurements of deflection. For example, suppose a new deflection value of 1.239722 is observed. The regression function can be solved for load to determine an estimated load value without having to observe it directly. The plot below illustrates the calibration process graphically.

Calibration

Finding From the plot, it is clear that the load that produced the deflection of 1.239722 should be about
Bounds on 1,750,000, and would certainly lie between 1,500,000 and 2,000,000. This rough estimate of the
the Load possible load range will be used to compute the load estimate numerically.


Obtaining
a
Numerical
Calibration
Value
To obtain a numerical estimate of the load associated with the observed deflection, the observed value is substituted into the regression function and the equation is solved for load. Typically this will be done using a root-finding procedure in a statistical or mathematical package. That is one reason why rough bounds on the value of the load to be estimated are needed.

Solving the
Regression
Equation

Which
Solution?
Although the rough estimate of the load associated with an observed deflection is not strictly necessary to solve the equation, it serves a second purpose: determining which solution to the equation is correct, if there are multiple solutions. The quadratic calibration equation, in fact, has two solutions. As we saw from the plot on the previous page, however, there is really no confusion over which root of the quadratic function is the correct load. Essentially, the load value must be between 150,000 and 3,000,000 for this problem. The other root of the regression equation and the new deflection value correspond to a load of over 229,899,600. Looking at the data at hand, it is safe to assume that a load of 229,899,600 would yield a deflection much greater than 1.24.
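To make the root-finding step concrete, here is a minimal Python sketch (an illustration only; it plugs in the parameter estimates from the quadratic fit above and brackets the root with the rough bounds read off the calibration plot):

    # Sketch: solve the quadratic calibration equation numerically for load.
    from scipy.optimize import brentq

    a0, a1, a2 = 0.673618e-3, 0.732059e-6, -0.316081e-14   # estimates from the fit
    new_deflection = 1.239722

    def calib(load):
        # fitted regression function minus the observed deflection
        return a0 + a1 * load + a2 * load ** 2 - new_deflection

    estimated_load = brentq(calib, 1.5e6, 2.0e6)   # bracket from the rough bounds
    print(estimated_load)                          # roughly 1.7 million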

+/- What? The final step in the calibration process, after determining the estimated load associated with the
observed deflection, is to compute an uncertainty or confidence interval for the load. A single-use
95% confidence interval for the load is obtained by inverting the formulas for the upper and
lower bounds of a 95% prediction interval for a new deflection value. These inequalities, shown
below, are usually solved numerically, just as the calibration equation was, to find the end points
of the confidence interval. For some models, including this one, the solution could actually be
obtained algebraically, but it is easier to let the computer do the work using a generic algorithm.

The three terms on the right-hand side of each inequality are the regression function ( ), a
t-distribution multiplier, and the standard deviation of a new measurement from the process ( ).
Regression software often provides convenient methods for computing these quantities for
arbitrary values of the predictor variables, which can make computation of the confidence interval
end points easier. Although this interval is not symmetric mathematically, the asymmetry is very
small, so for all practical purposes, the interval can be written as


4.6.1.11. Work This Example Yourself

View
Dataplot
Macro for
this Case
Study
This page allows you to repeat the analysis outlined in the case study description on the previous page using Dataplot, if you have downloaded and installed it. Output from each analysis step below will be displayed in one or more of the Dataplot windows. The four main windows are the Output window, the Graphics window, the Command History window and the Data Sheet window. Across the top of the main windows there are menus for executing Dataplot commands. Across the bottom is a command entry window where commands can be typed in.

Data Analysis Steps                                Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step.
The links in this column will connect you with more detailed information about each analysis step from the case study description.

1. Get set up and started.
   1. Read in the data.
      1. You have read 2 columns of numbers into Dataplot, variables Deflection and Load.

2. Fit and validate initial model.
   1. Plot deflection vs. load.
      1. Based on the plot, a straight-line model should describe the data well.
   2. Fit a straight-line model to the data.
      2. The straight-line fit was carried out. Before trying to interpret the numerical output, do a graphical residual analysis.
   3. Plot the predicted values from the model and the data on the same plot.
      3. The superposition of the predicted and observed values suggests the model is ok.
   4. Plot the residuals vs. load.
      4. The residuals are not random, indicating that a straight line is not adequate.
   5. Plot the residuals vs. the predicted values.
      5. This plot echoes the information in the previous plot.
   6. Make a 4-plot of the residuals.
      6. All four plots indicate problems with the model.
   7. Refer to the numerical output from the fit.
      7. The large lack-of-fit F statistic (>214) confirms that the straight-line model is inadequate.

3. Fit and validate refined model.
   1. Refer to the plot of the residuals vs. load.
      1. The structure in the plot indicates a quadratic model would better describe the data.
   2. Fit a quadratic model to the data.
      2. The quadratic fit was carried out. Remember to do the graphical residual analysis before trying to interpret the numerical output.
   3. Plot the predicted values from the model and the data on the same plot.
      3. The superposition of the predicted and observed values again suggests the model is ok.
   4. Plot the residuals vs. load.
      4. The residuals appear random, suggesting the quadratic model is ok.
   5. Plot the residuals vs. the predicted values.
      5. The plot of the residuals vs. the predicted values also suggests the quadratic model is ok.
   6. Do a 4-plot of the residuals.
      6. None of these plots indicates a problem with the model.
   7. Refer to the numerical output from the fit.
      7. The small lack-of-fit F statistic (<1) confirms that the quadratic model fits the data.

4. Use the model to make a calibrated measurement.
   1. Observe a new deflection value.
      1. The new deflection is associated with an unobserved and unknown load.
   2. Determine the associated load.
      2. Solving the calibration equation yields the load value without having to observe it.
   3. Compute the uncertainty of the load estimate.
      3. Computing a confidence interval for the load value lets us judge the range of plausible load values, since we know measurement noise affects the process.

4.6.2. Alaska Pipeline

Non-Homogeneous
Variances
This example illustrates the construction of a linear regression model for Alaska pipeline ultrasonic calibration data. This case study demonstrates the use of transformations and weighted fits to deal with the violation of the assumption of constant standard deviations for the random errors. This assumption is also called homogeneous variances for the errors.

1. Background and Data


2. Check for a Batch Effect
3. Fit Initial Model
4. Transformations to Improve Fit and Equalize Variances
5. Weighting to Improve Fit
6. Compare the Fits
7. Work This Example Yourself


4.6.2.1. Background and Data

Description
of Data
Collection
The Alaska pipeline data consists of in-field ultrasonic measurements of the depths of defects in the Alaska pipeline. The depths of the defects were then re-measured in the laboratory. These measurements were performed in six different batches.
The data were analyzed to calibrate the bias of the field measurements relative to the laboratory measurements. In this analysis, the field measurement is the response variable and the laboratory measurement is the predictor variable.
These data were provided by Harry Berger, who was at the time a scientist for the Office of the Director of the Institute of Materials Research (now the Materials Science and Engineering Laboratory) of NIST. These data were used for a study conducted for the Materials Transportation Bureau of the U.S. Department of Transportation.

Resulting
Data
Field Lab
Defect Defect
Size Size Batch
-----------------------
18 20.2 1
38 56.0 1
15 12.5 1
20 21.2 1
18 15.5 1
36 39.0 1
20 21.0 1
43 38.2 1
45 55.6 1
65 81.9 1
43 39.5 1
38 56.4 1
33 40.5 1
10 14.3 1
50 81.5 1
10 13.7 1
50 81.5 1
15 20.5 1
53 56.0 1
60 80.7 2
18 20.0 2
38 56.5 2
15 12.1 2
20 19.6 2
18 15.5 2
36 38.8 2
20 19.5 2
43 38.0 2
45 55.0 2
65 80.0 2
43 38.5 2
38 55.8 2
33 38.8 2
10 12.5 2
50 80.4 2
10 12.7 2
50 80.9 2
15 20.5 2
53 55.0 2
15 19.0 3
37 55.5 3
15 12.3 3
18 18.4 3
11 11.5 3
35 38.0 3
20 18.5 3
40 38.0 3
50 55.3 3
36 38.7 3
50 54.5 3
38 38.0 3
10 12.0 3
75 81.7 3
10 11.5 3
85 80.0 3
13 18.3 3
50 55.3 3
58 80.2 3
58 80.7 3


48 55.8 4
12 15.0 4
63 81.0 4
10 12.0 4
63 81.4 4
13 12.5 4
28 38.2 4
35 54.2 4
63 79.3 4
13 18.2 4
45 55.5 4
9 11.4 4
20 19.5 4
18 15.5 4
35 37.5 4
20 19.5 4
38 37.5 4
50 55.5 4
70 80.0 4
40 37.5 4
21 15.5 5
19 23.7 5
10 9.8 5
33 40.8 5
16 17.5 5
5 4.3 5
32 36.5 5
23 26.3 5
30 30.4 5
45 50.2 5
33 30.1 5
25 25.5 5
12 13.8 5
53 58.9 5
36 40.0 5
5 6.0 5
63 72.5 5
43 38.8 5
25 19.4 5
73 81.5 5
45 77.4 5
52 54.6 6
9 6.8 6
30 32.6 6
22 19.8 6
56 58.8 6
15 12.9 6
45 49.0 6


4.6.2.2. Check for Batch Effect

Plot of Raw
Data
As with any regression problem, it is always a good idea to plot the raw data first. The following is a scatter plot of the raw data.
This scatter plot shows that a straight line fit is a good initial candidate model for these data.

Plot by Batch
These data were collected in six distinct batches. The first step in the analysis is to determine if there is a batch effect.
In this case, the scientist was not inherently interested in the batch. That is, batch is a nuisance factor and, if reasonable, we would like to analyze the data as if it came from a single batch. However, we need to know that this is, in fact, a reasonable assumption to make.

Conditional
Plot
We first generate a conditional plot where we condition on the batch.
This conditional plot shows a scatter plot for each of the six batches on a single page. Each of these plots shows a similar pattern.

Linear
Correlation
and Related
Plots
We can follow up the conditional plot with a linear correlation plot, a linear intercept plot, a linear slope plot, and a linear residual standard deviation plot. These four plots show the correlation, the intercept and slope from a linear fit, and the residual standard deviation for linear fits applied to each batch. These plots show how a linear fit performs across the six batches.
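A quick numerical companion to these four plots is to compute, for each batch, the correlation and the straight-line fit statistics directly. A minimal Python sketch (the file name alaska_pipeline.csv and the column names Field, Lab, and Batch are assumptions for illustration):

    # Sketch: per-batch summaries behind the four plots described above
    # (correlation, intercept, slope, and residual standard deviation).
    import numpy as np
    import pandas as pd

    df = pd.read_csv("alaska_pipeline.csv")   # hypothetical file with Field, Lab, Batch columns

    for batch, grp in df.groupby("Batch"):
        slope, intercept = np.polyfit(grp["Lab"], grp["Field"], deg=1)
        resid = grp["Field"] - (intercept + slope * grp["Lab"])
        resid_sd = np.sqrt(np.sum(resid ** 2) / (len(grp) - 2))
        corr = np.corrcoef(grp["Lab"], grp["Field"])[0, 1]
        print(batch, round(corr, 3), round(intercept, 2), round(slope, 3), round(resid_sd, 2))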


4.6.2.3. Initial Linear Fit


Linear Fit Output Based on the initial plot of the data, we first fit a straight-line model to the data.
The following fit output was generated by Dataplot (it has been edited slightly for display).

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 107
NUMBER OF VARIABLES = 1
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.6112687111D+01
REPLICATION DEGREES OF FREEDOM = 29
NUMBER OF DISTINCT SUBSETS = 78

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 4.99368 ( 1.126 ) 4.4
2 A1 LAB 0.731111 (0.2455E-01) 30.
RESIDUAL STANDARD DEVIATION = 6.0809240341
RESIDUAL DEGREES OF FREEDOM = 105
REPLICATION STANDARD DEVIATION = 6.1126871109
REPLICATION DEGREES OF FREEDOM = 29
LACK OF FIT F RATIO = 0.9857 = THE 46.3056% POINT OF THE
F DISTRIBUTION WITH 76 AND 29 DEGREES OF FREEDOM

The intercept parameter is estimated to be 4.99 and the slope parameter is estimated to be 0.73. Both parameters are statistically significant.

The linear correlation plot (upper left), which shows the correlation between field and lab defect sizes versus the batch, indicates that batch six has a somewhat stronger linear relationship between the measurements than the other batches do. This is also reflected in the significantly lower residual standard deviation for batch six shown in the residual standard deviation plot (lower right), which shows the residual standard deviation versus batch. The slopes all lie within a range of 0.6 to 0.9 in the linear slope plot (lower left) and the intercepts all lie between 2 and 8 in the linear intercept plot (upper right).

Treat BATCH
as
Homogeneous
These summary plots, in conjunction with the conditional plot above, show that treating the data as a single batch is a reasonable assumption to make. None of the batches behaves badly compared to the others and none of the batches requires a significantly different fit from the others.
These two plots provide a good pair. The plot of the fit statistics allows quick and convenient comparisons of the overall fits. However, the conditional plot can reveal details that may be hidden in the summary plots. For example, we can more readily determine the existence of clusters of points and outliers, curvature in the data, and other similar features.
Based on these plots we will ignore the BATCH variable for the remaining analysis.

6-Plot for Model
Validation
When there is a single independent variable, the 6-plot provides a convenient method for initial model validation.


The basic assumptions for regression models are that the errors are random observations from a normal distribution with mean of zero and constant standard deviation (or variance).
The plots on the first row show that the residuals have increasing variance as the value of the independent variable (lab) increases in value. This indicates that the assumption of constant standard deviation, or homogeneity of variances, is violated.
In order to see this more clearly, we will generate full-size plots of the predicted values with the data and the residuals against the independent variable.

Plot of Predicted
Values with
Original Data

This plot shows more clearly that the assumption of homogeneous variances for the errors may be violated.

Plot of Residual
Values Against
Independent
Variable


4.6.2.4. Transformations to Improve Fit and Equalize Variances
Transformations In regression modeling, we often apply transformations to achieve the following two goals:
1. to satisfy the homogeneity of variances assumption for the errors.
2. to linearize the fit as much as possible.
Some care and judgment is required in that these two goals can conflict. We generally try to
achieve homogeneous variances first and then address the issue of trying to linearize the fit.
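Before fitting, the candidate transformations can be screened visually with a few lines of code. A minimal Python sketch of the screening step used in the plots below (the file name and column names are assumptions for illustration):

    # Sketch: plot the raw and transformed responses against the predictor and
    # look for the panel with the most constant spread across the horizontal axis.
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv("alaska_pipeline.csv")   # hypothetical file with Field and Lab columns
    panels = {"field": df["Field"],
              "sqrt(field)": np.sqrt(df["Field"]),
              "ln(field)": np.log(df["Field"]),
              "1/field": 1.0 / df["Field"]}

    fig, axes = plt.subplots(2, 2, figsize=(8, 6))
    for ax, (name, y) in zip(axes.ravel(), panels.items()):
        ax.scatter(df["Lab"], y, s=10)
        ax.set_title(name)
        ax.set_xlabel("Lab")
    plt.tight_layout()
    plt.show()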

Plot of Common The first step is to try transforming the response variable to find a transformation that will equalize
Transformations the variances. In practice, the square root, ln, and reciprocal transformations often work well for
to Obtain this purpose. We will try these first.
Homogeneous
Variances

This plot also shows more clearly that the assumption of homogeneous variances is violated. This assumption, along with the assumption of constant location, is typically easiest to see on this plot.

Non-Homogeneous
Variances
Because the last plot shows that the variances may differ more than slightly, we will address this issue by transforming the data or using weighted least squares.

In examining these plots, we are looking for the plot that shows the most constant variability
across the horizontal range of the plot.
This plot indicates that the ln transformation is a good candidate model for achieving the most


homogeneous variances.

Plot of Common One problem with applying the above transformation is that the plot indicates that a straight-line
Transformations fit will no longer be an adequate model for the data. We address this problem by attempting to find
to Linearize the a transformation of the predictor variable that will result in the most linear fit. In practice, the
Fit square root, ln, and reciprocal transformations often work well for this purpose. We will try these
first.

This plot shows that the ln transformation of the predictor variable is a good candidate model.

Box-Cox
Linearity Plot
The previous step can be approached more formally by the use of the Box-Cox linearity plot. The value on the x axis corresponding to the maximum correlation value on the y axis indicates the power transformation that yields the most linear fit.

This plot indicates that a value of -0.1 achieves the most linear fit.
In practice, for ease of interpretation, we often prefer to use a common transformation, such as the ln or square root, rather than the value that yields the mathematical maximum. However, the Box-Cox linearity plot still indicates whether our choice is a reasonable one. That is, we might sacrifice a small amount of linearity in the fit to have a simpler model.
In this case, a value of 0.0 would indicate a ln transformation. Although the optimal value from the plot is -0.1, the plot indicates that any value between -0.2 and 0.2 will yield fairly similar results. For that reason, we choose to stick with the common ln transformation.

ln-ln Fit
Based on the above plots, we choose to fit a ln-ln model. Dataplot generated the following output for this model (it is edited slightly for display).

LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N = 107
NUMBER OF VARIABLES = 1
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.1369758099D+00
REPLICATION DEGREES OF FREEDOM = 29
NUMBER OF DISTINCT SUBSETS = 78

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 0.281384 (0.8093E-01) 3.5
2 A1 XTEMP 0.885175 (0.2302E-01) 38.


RESIDUAL STANDARD DEVIATION = 0.1682604253
RESIDUAL DEGREES OF FREEDOM = 105
REPLICATION STANDARD DEVIATION = 0.1369758099
REPLICATION DEGREES OF FREEDOM = 29
LACK OF FIT F RATIO = 1.7032 = THE 94.4923% POINT OF THE
F DISTRIBUTION WITH 76 AND 29 DEGREES OF FREEDOM

Note that although the residual standard deviation is significantly lower than it was for the original fit, we cannot compare them directly since the fits were performed on different scales.

Plot of
Predicted
Values

The plot of the predicted values with the transformed data indicates a good fit. In addition, the variability of the data across the horizontal range of the plot seems relatively constant.

6-Plot of Fit

Since we transformed the data, we need to check that all of the regression assumptions are now valid.
The 6-plot of the residuals indicates that all of the regression assumptions are now satisfied.

Plot of
Residuals


4.6.2.5. Weighting to Improve Fit


Weighting Another approach when the assumption of constant standard deviation of the errors (i.e.
homogeneous variances) is violated is to perform a weighted fit. In a weighted fit, we give less
weight to the less precise measurements and more weight to more precise measurements when
estimating the unknown parameters in the model.

Fit for
Estimating
Weights

For the pipeline data, we chose approximate replicate groups so that each group has four
observations (the last group only has three). This was done by first sorting the data by the
predictor variable and then taking four points in succession to form each replicate group.
In order to see more detail, we generate a full-size plot of the residuals versus the predictor variable, as shown above. This plot suggests that the assumption of homogeneous variances is now met.

Using the power function model with the data for estimating the weights, Dataplot generated the following output for the fit of ln(variances) against ln(means) for the replicate groups. The output has been edited slightly for display.

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 27
NUMBER OF VARIABLES = 1
NO REPLICATION CASE

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 -3.18451 (0.8265 ) -3.9
2 A1 XTEMP 1.69001 (0.2344 ) 7.2

RESIDUAL STANDARD DEVIATION = 0.8561206460


RESIDUAL DEGREES OF FREEDOM = 25
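The weight-function fit can also be sketched in Python. The file and column names are hypothetical, and regressing the log of the group variances of the response on the log of the group means of the predictor is an assumption about the grouping described above:

    # Sketch: estimate the exponent of the power weight function. Groups of four
    # observations are formed after sorting by the predictor; the log of each
    # group's response variance is regressed on the log of its predictor mean.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("alaska_pipeline.csv").sort_values("Lab")   # hypothetical file name
    df["group"] = np.arange(len(df)) // 4                        # last group may be smaller

    means = df.groupby("group")["Lab"].mean()
    variances = df.groupby("group")["Field"].var()

    slope, intercept = np.polyfit(np.log(means), np.log(variances), deg=1)
    print(intercept, slope)   # the slope is the exponent estimate (roughly 1.5 to 2)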


The fit output and plot from the replicate variances against the replicate means shows that a linear fit provides a reasonable fit with an estimated slope of 1.69. Note that this data set has a small number of replicates, so you may get a slightly different estimate for the slope. For example, S-PLUS generated a slope estimate of 1.52. This is caused by the sorting of the predictor variable (i.e., where we have actual replicates in the data, different sorting algorithms may put some observations in different replicate groups). In practice, any value for the slope, which will be used as the exponent in the weight function, in the range 1.5 to 2.0 is probably reasonable and should produce comparable results for the weighted fit.
We used an estimate of 1.5 for the exponent in the weighting function.

Residual
Plot for
Weight
Function

The residual plot from the fit to determine an appropriate weighting function reveals no obvious problems.

Numerical
Output
from
Weighted
Fit
Dataplot generated the following output for the weighted fit of the model that relates the field measurements to the lab measurements (edited slightly for display).

LEAST SQUARES MULTILINEAR FIT
SAMPLE SIZE N = 107
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.6112687111D+01
REPLICATION DEGREES OF FREEDOM = 29
NUMBER OF DISTINCT SUBSETS = 78

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 A0 2.35234 (0.5431 ) 4.3
2 A1 LAB 0.806363 (0.2265E-01) 36.

RESIDUAL STANDARD DEVIATION = 0.3645902574


RESIDUAL DEGREES OF FREEDOM = 105
REPLICATION STANDARD DEVIATION = 6.1126871109
REPLICATION DEGREES OF FREEDOM = 29


This output shows a slope of 0.81 and an intercept term of 2.35. This is compared to a slope of
0.73 and an intercept of 4.99 in the original model.
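A weighted fit along these lines can be sketched with statsmodels in Python (the file and column names are assumptions for illustration; the exponent 1.5 is the value chosen above):

    # Sketch: weighted least squares fit of Field on Lab with weights 1/Lab^1.5.
    import pandas as pd
    import statsmodels.api as sm

    df = pd.read_csv("alaska_pipeline.csv")    # hypothetical file name
    X = sm.add_constant(df["Lab"])
    weights = 1.0 / df["Lab"] ** 1.5
    wfit = sm.WLS(df["Field"], X, weights=weights).fit()
    print(wfit.params)    # intercept near 2.35, slope near 0.81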

Plot of
Predicted
Values

The plot of the predicted values with the data indicates a good fit.

Diagnostic
Plots of
Weighted
Residuals
We need to verify that the weighting did not result in the other regression assumptions being violated. A 6-plot, after weighting the residuals, indicates that the regression assumptions are satisfied.

Plot of
Weighted
Residuals
vs Lab
Defect
Size


4.6.2.6. Compare the Fits


Three Fits It is interesting to compare the results of the three fits:
to 1. Unweighted fit
Compare
2. Transformed fit
3. Weighted fit

In order to check the assumption of homogeneous variances for the errors in more detail, we generate a full-sized plot of the weighted residuals versus the predictor variable. This plot suggests that the errors now have homogeneous variances.

Plot of Fits
with Data

This plot shows that, compared to the original fit, the transformed and weighted fits generate
smaller predicted values for low values of lab defect size and larger predicted values for high
values of lab defect size. The three fits match fairly closely for intermediate values of lab defect
size. The transformed and weighted fit tend to agree for the low values of lab defect size.
However, for large values of lab defect size, the weighted fit tends to generate higher values for
the predicted values than does the transformed fit.


Conclusion Although the original fit was not bad, it violated the assumption of homogeneous variances for
the error term. Both the fit of the transformed data and the weighted fit successfully address this
problem without violating the other regression assumptions.

4.6.2.7. Work This Example Yourself


View This page allows you to repeat the analysis outlined in the case study
Dataplot description on the previous page using Dataplot, if you have
Macro for downloaded and installed it. Output from each analysis step below will
this Case be displayed in one or more of the Dataplot windows. The four main
Study windows are the Output window, the Graphics window, the Command
History window and the Data Sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps Results and Conclusions

Click on the links below to start Dataplot and run this case study yourself. Each step may use results from previous steps, so please be patient. Wait until the software verifies that the current step is complete before clicking on the next step.
The links in this column will connect you with more detailed information about each analysis step from the case study description.

1. Get set up and started.

1. Read in the data. 1. You have read 3 columns of numbers


into Dataplot, variables Field,
Lab, and Batch.

2. Plot data and check for batch effect.

1. Plot field versus lab. 1. Initial plot indicates that a


simple linear model is a good
initial model.

2. Condition plot on batch. 2. Condition plot on batch indicates


no significant batch effect.

3. Check batch effect with linear fit plots by batch.
   3. Plots of fit by batch indicate no significant batch effect.


3. Fit and validate initial model.
   1. Linear fit of field versus lab. Plot predicted values with the data.
      1. The linear fit was carried out. Although the initial fit looks good, the plot indicates that the residuals do not have homogeneous variances.
   2. Generate a 6-plot for model validation.
      2. The 6-plot does not indicate any other problems with the model, beyond the evidence of non-constant error variance.
   3. Plot the residuals against the predictor variable.
      3. The detailed residual plot shows the inhomogeneity of the error variation more clearly.

4. Improve the fit with transformations.
   1. Plot several common transformations of the response variable (field) versus the predictor variable (lab).
      1. The plots indicate that a ln transformation of the dependent variable (field) stabilizes the variation.
   2. Plot ln(field) versus several common transformations of the predictor variable (lab).
      2. The plots indicate that a ln transformation of the predictor variable (lab) linearizes the model.
   3. Box-Cox linearity plot.
      3. The Box-Cox linearity plot indicates an optimum transform value of -0.1, although a ln transformation should work well.
   4. Linear fit of ln(field) versus ln(lab). Plot predicted values with the data.
      4. The plot of the predicted values with the data indicates that the errors should now have homogeneous variances.
   5. Generate a 6-plot for model validation.
      5. The 6-plot shows that the model assumptions are satisfied.
   6. Plot the residuals against the predictor variable.
      6. The detailed residual plot shows more clearly that the assumption of homogeneous variances is now satisfied.

5. Improve the fit using weighting.
   1. Fit function to determine appropriate weight function. Determine value for the exponent in the power model.
      1. The fit to determine an appropriate weight function indicates that an exponent between 1.5 and 2.0 should be reasonable.
   2. Examine residuals from weight fit to check adequacy of weight function.
      2. The residuals from this fit indicate no major problems.
   3. Weighted linear fit of field versus lab. Plot predicted values with the data.
      3. The weighted fit was carried out. The plot of the predicted values with the data indicates that the fit of the model is improved.
   4. Generate a 6-plot after weighting the residuals for model validation.
      4. The 6-plot shows that the model assumptions are satisfied.
   5. Plot the weighted residuals against the predictor variable.
      5. The detailed residual plot shows the constant variability of the weighted residuals.

6. Compare the fits.
   1. Plot predicted values from each of the three models with the data.
      1. The transformed and weighted fits generate lower predicted values for low values of defect size and larger predicted values for high values of defect size.

4.6.3. Ultrasonic Reference Block Study


Non-Linear Fit
with
Non-Homogeneous
Variances
This example illustrates the construction of a non-linear regression model for ultrasonic calibration data. This case study demonstrates fitting a non-linear model and the use of transformations and weighted fits to deal with the violation of the assumption of constant standard deviations for the errors. This assumption is also called homogeneous variances for the errors.

1. Background and Data
2. Fit Initial Model
3. Transformations to Improve Fit
4. Weighting to Improve Fit
5. Compare the Fits
6. Work This Example Yourself

4.6.3.1. Background and Data

Description
of the Data
The ultrasonic reference block data consist of a response variable and a predictor variable. The response variable is ultrasonic response and the predictor variable is metal distance.
These data were provided by the NIST scientist Dan Chwirut.

Resulting
Data
Ultrasonic Metal
Response Distance
-----------------------
92.9000 0.5000
78.7000 0.6250
64.2000 0.7500
64.9000 0.8750
57.1000 1.0000
43.3000 1.2500
31.1000 1.7500
23.6000 2.2500
31.0500 1.7500
23.7750 2.2500
17.7375 2.7500
13.8000 3.2500
11.5875 3.7500
9.4125 4.2500
7.7250 4.7500
7.3500 5.2500
8.0250 5.7500
90.6000 0.5000
76.9000 0.6250
71.6000 0.7500
63.6000 0.8750
54.0000 1.0000
39.2000 1.2500
29.3000 1.7500


21.4000 2.2500 32.5000 1.5000


29.1750 1.7500 12.4100 3.0000
22.1250 2.2500 13.1200 3.0000
17.5125 2.7500 15.5600 3.0000
14.2500 3.2500 5.6300 6.0000
9.4500 3.7500 78.0000 0.5000
9.1500 4.2500 59.9000 0.7500
7.9125 4.7500 33.2000 1.5000
8.4750 5.2500 13.8400 3.0000
6.1125 5.7500 12.7500 3.0000
80.0000 0.5000 14.6200 3.0000
79.0000 0.6250 3.9400 6.0000
63.8000 0.7500 76.8000 0.5000
57.2000 0.8750 61.0000 0.7500
53.2000 1.0000 32.9000 1.5000
42.5000 1.2500 13.8700 3.0000
26.8000 1.7500 11.8100 3.0000
20.4000 2.2500 13.3100 3.0000
26.8500 1.7500 5.4400 6.0000
21.0000 2.2500 78.0000 0.5000
16.4625 2.7500 63.5000 0.7500
12.5250 3.2500 33.8000 1.5000
10.5375 3.7500 12.5600 3.0000
8.5875 4.2500 5.6300 6.0000
7.1250 4.7500 12.7500 3.0000
6.1125 5.2500 13.1200 3.0000
5.9625 5.7500 5.4400 6.0000
74.1000 0.5000 76.8000 0.5000
67.3000 0.6250 60.0000 0.7500
60.8000 0.7500 47.8000 1.0000
55.5000 0.8750 32.0000 1.5000
50.3000 1.0000 22.2000 2.0000
41.0000 1.2500 22.5700 2.0000
29.4000 1.7500 18.8200 2.5000
20.4000 2.2500 13.9500 3.0000
29.3625 1.7500 11.2500 4.0000
21.1500 2.2500 9.0000 5.0000
16.7625 2.7500 6.6700 6.0000
13.2000 3.2500 75.8000 0.5000
10.8750 3.7500 62.0000 0.7500
8.1750 4.2500 48.8000 1.0000
7.3500 4.7500 35.2000 1.5000
5.9625 5.2500 20.0000 2.0000
5.6250 5.7500 20.3200 2.0000
81.5000 0.5000 19.3100 2.5000
62.4000 0.7500 12.7500 3.0000

http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd631.htm (2 of 6) [11/14/2003 5:50:57 PM] http://www.itl.nist.gov/div898/handbook/pmd/section6/pmd631.htm (3 of 6) [11/14/2003 5:50:57 PM]


4.6.3.1. Background and Data 4.6.3.1. Background and Data

10.4200 4.0000 8.7400 6.0000


7.3100 5.0000 80.7000 0.5000
7.4200 6.0000 61.3000 0.7500
70.5000 0.5000 47.5000 1.0000
59.5000 0.7500 29.0000 1.5000
48.5000 1.0000 24.0000 2.0000
35.8000 1.5000 17.7000 2.5000
21.0000 2.0000 24.5600 2.0000
21.6700 2.0000 18.6700 2.5000
21.0000 2.5000 16.2400 3.0000
15.6400 3.0000 8.7400 4.0000
8.1700 4.0000 7.8700 5.0000
8.5500 5.0000 8.5100 6.0000
10.1200 6.0000 66.7000 0.5000
78.0000 0.5000 59.2000 0.7500
66.0000 0.6250 40.8000 1.0000
62.0000 0.7500 30.7000 1.5000
58.0000 0.8750 25.7000 2.0000
47.7000 1.0000 16.3000 2.5000
37.8000 1.2500 25.9900 2.0000
20.2000 2.2500 16.9500 2.5000
21.0700 2.2500 13.3500 3.0000
13.8700 2.7500 8.6200 4.0000
9.6700 3.2500 7.2000 5.0000
7.7600 3.7500 6.6400 6.0000
5.4400 4.2500 13.6900 3.0000
4.8700 4.7500 81.0000 0.5000
4.0100 5.2500 64.5000 0.7500
3.7500 5.7500 35.5000 1.5000
24.1900 3.0000 13.3100 3.0000
25.7600 3.0000 4.8700 6.0000
18.0700 3.0000 12.9400 3.0000
11.8100 3.0000 5.0600 6.0000
12.0700 3.0000 15.1900 3.0000
16.1200 3.0000 14.6200 3.0000
70.8000 0.5000 15.6400 3.0000
54.7000 0.7500 25.5000 1.7500
48.0000 1.0000 25.9500 1.7500
39.8000 1.5000 81.7000 0.5000
29.8000 2.0000 61.6000 0.7500
23.7000 2.5000 29.8000 1.7500
29.6200 2.0000 29.8100 1.7500
23.8100 2.5000 17.1700 2.7500
17.7000 3.0000 10.3900 3.7500
11.5500 4.0000 28.4000 1.7500
12.0700 5.0000 28.6900 1.7500


81.3000 0.5000
60.9000 0.7500
16.6500 2.7500
10.0500 3.7500
28.9000 1.7500
28.9500 1.7500

4.6.3.2. Initial Non-Linear Fit


Plot of Data The first step in fitting a nonlinear function is to simply plot the data.

This plot shows an exponentially decaying pattern in the data. This suggests that some type of exponential
function might be an appropriate model for the data.

Initial Model There are two issues that need to be addressed in the initial model selection when fitting a nonlinear model.
Selection 1. We need to determine an appropriate functional form for the model.
2. We need to determine appropriate starting values for the estimation of the model parameters.

Determining an Due to the large number of potential functions that can be used for a nonlinear model, the determination of an
Appropriate appropriate model is not always obvious. Some guidelines for selecting an appropriate model were given in
Functional Form the analysis chapter.
for the Model
The plot of the data will often suggest a well-known function. In addition, we often use scientific and
engineering knowledge in determining an appropriate model. In scientific studies, we are frequently interested
in fitting a theoretical model to the data. We also often have historical knowledge from previous studies (either
our own data or from published studies) of functions that have fit similar data well in the past. In the absence
of a theoretical model or experience with prior data sets, selecting an appropriate function will often require a
certain amount of trial and error.
Regardless of whether or not we are using scientific knowledge in selecting the model, model validation is still
critical in determining if our selected model is adequate.


Determining
Appropriate
Starting Values
Nonlinear models are fit with iterative methods that require starting values. In some cases, inappropriate starting values can result in parameter estimates for the fit that converge to a local minimum or maximum rather than the global minimum or maximum. Some models are relatively insensitive to the choice of starting values while others are extremely sensitive.
If you have prior data sets that fit similar models, these can often be used as a guide for determining good starting values. We can also sometimes make educated guesses from the functional form of the model. For some models, there may be specific methods for determining starting values. For example, sinusoidal models that are commonly used in time series are quite sensitive to good starting values. The beam deflection case study shows an example of obtaining starting values for a sinusoidal model.
In the case where you do not know what good starting values would be, one approach is to create a grid of values for each of the parameters of the model and compute some measure of goodness of fit, such as the residual standard deviation, at each point on the grid. The idea is to create a broad grid that encloses reasonable values for the parameter. However, we typically want to keep the number of grid points for each parameter relatively small to keep the computational burden down (particularly as the number of parameters in the model increases). The idea is to get in the right neighborhood, not to find the optimal fit. We would pick the grid point that corresponds to the smallest residual standard deviation as the starting values.
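A grid search of this kind, followed by a nonlinear least squares fit, can be sketched in Python (the file name is hypothetical; the model form and the 0.1 to 1.0 grid are the ones used in this case study):

    # Sketch: coarse grid search for starting values, then a nonlinear least
    # squares fit of y = exp(-b1*x)/(b2 + b3*x) to the ultrasonic data.
    import numpy as np
    from itertools import product
    from scipy.optimize import curve_fit

    data = np.loadtxt("ultrasonic.dat")        # hypothetical file: response, distance
    y, x = data[:, 0], data[:, 1]

    def model(x, b1, b2, b3):
        return np.exp(-b1 * x) / (b2 + b3 * x)

    grid = np.arange(0.1, 1.05, 0.1)           # 0.1 to 1.0 in increments of 0.1
    start = min(product(grid, repeat=3),
                key=lambda p: np.sum((y - model(x, *p)) ** 2))

    params, cov = curve_fit(model, x, y, p0=start)
    print(start)     # grid point with the smallest residual sum of squares
    print(params)    # estimates near (0.19, 0.0061, 0.0105)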
RESIDUAL STANDARD DEVIATION = 3.3616721630
Fitting Data to a
Theoretical Model
For this particular data set, the scientist was trying to fit the following theoretical model.

Since we have a theoretical model, we use this as the initial model.

Prefit to Obtain
Starting Values
We used the Dataplot PREFIT command to determine starting values based on a grid of the parameter values. Here, our grid was 0.1 to 1.0 in increments of 0.1. The output has been edited slightly for display.

LEAST SQUARES NON-LINEAR PRE-FIT
SAMPLE SIZE N = 214
MODEL--ULTRASON =(EXP(-B1*METAL)/(B2+B3*METAL))
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.3281762600D+01
REPLICATION DEGREES OF FREEDOM = 192
NUMBER OF DISTINCT SUBSETS = 22

NUMBER OF LATTICE POINTS = 1000

STEP RESIDUAL * PARAMETER
NUMBER STANDARD * ESTIMATES
DEVIATION *
----------------------------------*-----------
1-- 0.35271E+02 * 0.10000E+00 0.10000E+00 0.10000E+00

FINAL PARAMETER ESTIMATES
1 B1 0.100000
2 B2 0.100000
3 B3 0.100000

RESIDUAL STANDARD DEVIATION = 35.2706031799
RESIDUAL DEGREES OF FREEDOM = 211
REPLICATION STANDARD DEVIATION = 3.2817625999
REPLICATION DEGREES OF FREEDOM = 192

The best starting values based on this grid are to set all three parameters to 0.1.

Nonlinear Fit
Output
The following fit output was generated by Dataplot (it has been edited for display).

LEAST SQUARES NON-LINEAR FIT
SAMPLE SIZE N = 214
MODEL--ULTRASON =EXP(-B1*METAL)/(B2+B3*METAL)
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.3281762600D+01
REPLICATION DEGREES OF FREEDOM = 192
NUMBER OF DISTINCT SUBSETS = 22

FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 B1 0.190404 (0.2206E-01) 8.6
2 B2 0.613300E-02 (0.3493E-03) 18.
3 B3 0.105266E-01 (0.8027E-03) 13.

RESIDUAL STANDARD DEVIATION = 3.3616721630
RESIDUAL DEGREES OF FREEDOM = 211
REPLICATION STANDARD DEVIATION = 3.2817625999
REPLICATION DEGREES OF FREEDOM = 192
LACK OF FIT F RATIO = 1.5474 = THE 92.6461% POINT OF THE
F DISTRIBUTION WITH 19 AND 192 DEGREES OF FREEDOM

Plot of Predicted
Values with
Original Data

This plot shows a reasonably good fit. It is difficult to detect any violations of the fit assumptions from this plot. The estimated model is


6-Plot for Model When there is a single independent variable, the 6-plot provides a convenient method for initial model
Validation validation.

The basic assumptions for regression models are that the errors are random observations from a normal distribution with zero mean and constant standard deviation (or variance).
These plots suggest that the variance of the errors is not constant.
In order to see this more clearly, we will generate full-sized plots of the predicted values from the model with the data overlaid, and of the residuals against the independent variable, Metal Distance.

Plot of Residual
Values Against
Independent
Variable

This plot suggests that the errors have greater variance for the values of metal distance less than one than elsewhere. That is, the assumption of homogeneous variances seems to be violated.

Non-Homogeneous
Variances
Except when the Metal Distance is less than or equal to one, there is not strong evidence that the error variances differ. Nevertheless, we will use transformations or weighted fits to see if we can eliminate this problem.


4.6.3.3. Transformations to Improve Fit

Transformations
One approach to the problem of non-homogeneous variances is to apply transformations to the data.

Plot of Common
Transformations
to Obtain
Homogeneous
Variances
The first step is to try transformations of the response variable that will result in homogeneous variances. In practice, the square root, ln, and reciprocal transformations often work well for this purpose. We will try these first.
In examining these four plots, we are looking for the plot that shows the most constant variability of the ultrasonic response across values of metal distance. Although the scales of these plots differ widely, which would seem to make comparisons difficult, we are not comparing the absolute levels of variability between plots here. Instead we are comparing only how constant the variation within each plot is for these four plots. The plot with the most constant variation will indicate which transformation is best.
Based on constancy of the variation in the residuals, the square root transformation is probably the best transformation to use for this data.

Plot of Common
Transformations
to Predictor
Variable
After transforming the response variable, it is often helpful to transform the predictor variable as well. In practice, the square root, ln, and reciprocal transformations often work well for this purpose. We will try these first.
This plot shows that none of the proposed transformations offers an improvement over using the raw predictor variable.

Square Root Fit
Based on the above plots, we choose to fit a model with a square root transformation for the response variable and no transformation for the predictor variable. Dataplot generated the following output for this model (it is edited slightly for display).

LEAST SQUARES NON-LINEAR FIT
SAMPLE SIZE N = 214
MODEL--YTEMP =EXP(-B1*XTEMP)/(B2+B3*XTEMP)
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.2927381992D+00
REPLICATION DEGREES OF FREEDOM = 192
NUMBER OF DISTINCT SUBSETS = 22

FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 B1 -0.154326E-01 (0.8593E-02) -1.8
2 B2 0.806714E-01 (0.1524E-02) 53.
3 B3 0.638590E-01 (0.2900E-02) 22.

RESIDUAL STANDARD DEVIATION = 0.2971503735


RESIDUAL DEGREES OF FREEDOM = 211
REPLICATION STANDARD DEVIATION = 0.2927381992
REPLICATION DEGREES OF FREEDOM = 192
LACK OF FIT F RATIO = 1.3373 = THE 83.6085% POINT OF THE
F DISTRIBUTION WITH 19 AND 192 DEGREES OF FREEDOM


Although the residual standard deviation is lower than it was for the original fit, we cannot compare them directly since the fits were performed on different scales.

Plot of
Predicted
Values

The plot of the predicted values with the transformed data indicates a good fit. The fitted model is

6-Plot of Fit

Since we transformed the data, we need to check that all of the regression assumptions are now valid.
The 6-plot of the data using this model indicates no obvious violations of the assumptions.

Plot of
Residuals

In order to see more detail, we generate a full-size version of the residuals versus predictor variable plot. This plot suggests that the errors now satisfy the assumption of homogeneous variances.


4.6.3.4. Weighting to Improve Fit


Weighting Another approach when the assumption of constant variance of the errors is violated is to perform
a weighted fit. In a weighted fit, we give less weight to the less precise measurements and more
weight to more precise measurements when estimating the unknown parameters in the model.

Finding An Techniques for determining an appropriate weight function were discussed in detail in Section
Appropriate 4.4.5.2.
Weight
Function In this case, we have replication in the data, so we can fit the power model

to the variances from each set of replicates in the data and use for the weights.

Fit for
Estimating
Weights
Dataplot generated the following output for the fit of ln(variances) against ln(means) for the
replicate groups. The output has been edited slightly for display.

LEAST SQUARES MULTILINEAR FIT


SAMPLE SIZE N = 22
NUMBER OF VARIABLES = 1

PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 A0 2.46872 (0.2186 ) 11.
2 A1 XTEMP -1.02871 (0.1983 ) -5.2

RESIDUAL STANDARD DEVIATION = 0.6945897937


RESIDUAL DEGREES OF FREEDOM = 20


The fit output and plot from the replicate variances against the replicate means shows that the linear fit provides a reasonable fit, with an estimated slope of -1.03.
Based on this fit, we used an estimate of -1.0 for the exponent in the weighting function.

Residual
Plot for
Weight
Function

The residual plot from the fit to determine an appropriate weighting function reveals no obvious problems.

Numerical
Output
from
Weighted
Fit
Dataplot generated the following output for the weighted fit (edited slightly for display).

LEAST SQUARES NON-LINEAR FIT
SAMPLE SIZE N = 214
MODEL--ULTRASON =EXP(-B1*METAL)/(B2+B3*METAL)
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.3281762600D+01
REPLICATION DEGREES OF FREEDOM = 192
NUMBER OF DISTINCT SUBSETS = 22

FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE


1 B1 0.147046 (0.1512E-01) 9.7
2 B2 0.528104E-02 (0.4063E-03) 13.
3 B3 0.123853E-01 (0.7458E-03) 17.

RESIDUAL STANDARD DEVIATION = 4.1106567383


RESIDUAL DEGREES OF FREEDOM = 211
REPLICATION STANDARD DEVIATION = 3.2817625999
REPLICATION DEGREES OF FREEDOM = 192
LACK OF FIT F RATIO = 7.3183 = THE 100.0000% POINT OF THE


F DISTRIBUTION WITH 19 AND 192 DEGREES OF FREEDOM
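For readers who want to reproduce the weighted fit outside of Dataplot, a minimal sketch using Python's scipy is given below. The model is the one shown in the output above; the arrays metal, ultrason, and weights, and the starting values of 0.1 for all three parameters (taken from the pre-fit mentioned in the "Work This Example Yourself" page), are assumptions for illustration only:

import numpy as np
from scipy.optimize import curve_fit

def model(x, b1, b2, b3):
    # ULTRASON = EXP(-B1*METAL)/(B2+B3*METAL)
    return np.exp(-b1 * x) / (b2 + b3 * x)

# curve_fit minimizes sum(((y - f(x))/sigma)**2), so sigma = 1/sqrt(weight)
sigma = 1.0 / np.sqrt(weights)
popt, pcov = curve_fit(model, metal, ultrason, p0=(0.1, 0.1, 0.1), sigma=sigma)
approx_sd = np.sqrt(np.diag(pcov))   # approximate standard deviations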

Plot of
Predicted
Values
To assess the quality of the weighted fit, we first generate a plot of the predicted line with the
original data.

The plot of the predicted values with the data indicates a good fit. The model for the weighted fit
is

    ULTRASON = EXP(-0.147046*METAL)/(0.00528104 + 0.0123853*METAL)

6-Plot of
Fit
We need to verify that the weighted fit does not violate the regression assumptions. The 6-plot
indicates that the regression assumptions are satisfied.

Plot of
Residuals
In order to check the assumption of equal error variances in more detail, we generate a full-sized
version of the residuals versus the predictor variable. This plot suggests that the residuals now
have approximately equal variability.


4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.3. Ultrasonic Reference Block Study

4.6.3.5. Compare the Fits


Three Fits
to
Compare
It is interesting to compare the results of the three fits:
1. Unweighted fit
2. Transformed fit
3. Weighted fit

Plot of Fits The first step in comparing the fits is to plot all three sets of predicted values (in the original
with Data units) on the same plot with the raw data.

This plot shows that all three fits generate comparable predicted values. We can also compare the
residual standard deviations (RESSD) from the fits. The RESSD for the transformed data is
calculated after translating the predicted values back to the original scale.
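As a sketch of this calculation (array names are illustrative; the values shown below were produced with Dataplot), the predicted values from the transformed fit are squared to return them to the original units before the residual standard deviation is formed:

import numpy as np

def ressd(y, y_pred, n_params):
    # residual standard deviation with denominator n - p
    return np.sqrt(np.sum((y - y_pred) ** 2) / (len(y) - n_params))

# the unweighted and weighted fits predict on the original scale directly;
# the transformed fit predicts sqrt(ultrasonic response), so square it first:
# ressd_transformed = ressd(ultrason, pred_sqrt_scale ** 2, 3)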


RESSD From Unweighted Fit = 3.361673
RESSD From Transformed Fit = 3.306732
RESSD From Weighted Fit = 3.392797

In this case, the RESSD is quite close for all three fits (which is to be expected based on the plot).

Conclusion Given that transformed and weighted fits generate predicted values that are quite close to the
original fit, why would we want to make the extra effort to generate a transformed or weighted
fit? We do so to develop a model that satisfies the model assumptions for fitting a nonlinear
model. This gives us more confidence that conclusions and analyses based on the model are
justified and appropriate.

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.3. Ultrasonic Reference Block Study

4.6.3.6. Work This Example Yourself

View This page allows you to repeat the analysis outlined in the case study
Dataplot description on the previous page using Dataplot, if you have
Macro for downloaded and installed it. Output from each analysis step below will
this Case be displayed in one or more of the Dataplot windows. The four main
Study windows are the Output window, the Graphics window, the Command
History window and the Data Sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps Results and Conclusions

Click on the links below to start Dataplot and run this case study
yourself. Each step may use results from previous steps, so please be
patient. Wait until the software verifies that the current step is
complete before clicking on the next step.

The links in this column will connect you with more detailed
information about each analysis step from the case study
description.

1. Get set up and started.

1. Read in the data. 1. You have read 2 columns of numbers


into Dataplot, variables the
ultrasonic response and metal
distance

2. Plot data, pre-fit for starting values, and


fit nonlinear model.

1. Plot the ultrasonic response versus 1. Initial plot indicates that a


metal distance. nonlinear model is required.
Theory dictates an exponential
over linear for the initial model.

2. Run PREFIT to generate good 2. Pre-fit indicated starting


starting values. values of 0.1 for all 3
parameters.

3. Nonlinear fit of the ultrasonic response 3. The nonlinear fit was carried out.

versus metal distance. Plot predicted Initial fit looks pretty good.
values and overlay the data.

4. Generate a 6-plot for model 4. The 6-plot shows that the model
validation. assumptions are satisfied except for
the non-homogeneous variances.

5. Plot the residuals against 5. The detailed residual plot shows
the predictor variable. the non-homogeneous variances
more clearly.

3. Improve the fit with transformations.

1. Plot several common transformations 1. The plots indicate that a square
of the dependent variable (ultrasonic root transformation on the dependent
response). variable (ultrasonic response) is a
good candidate model.

2. Plot several common transformations 2. The plots indicate that no
of the predictor variable (metal). transformation on the predictor
variable (metal distance) is
a good candidate model.

3. Nonlinear fit of transformed data. 3. Carry out the fit on the transformed
Plot predicted values with the data. The plot of the predicted
data. values overlaid with the data
indicates a good fit.

4. Generate a 6-plot for model 4. The 6-plot suggests that the model
validation. assumptions, specifically homogeneous
variances for the errors, are
satisfied.

5. Plot the residuals against 5. The detailed residual plot shows
the predictor variable. more clearly that the homogeneous
variances assumption is now
satisfied.

4. Improve the fit using weighting.

1. Fit function to determine appropriate 1. The fit to determine an appropriate
weight function. Determine value for weight function indicates that a
the exponent in the power model. value for the exponent in the range
-1.0 to -1.1 should be reasonable.

2. Plot residuals from fit to determine 2. The residuals from this fit
appropriate weight function. indicate no major problems.

3. Weighted fit of the ultrasonic response 3. The weighted fit was carried out.
versus metal distance. Plot predicted The plot of the predicted values
values with the data. overlaid with the data suggests
that the variances are homogeneous.

4. Generate a 6-plot for model 4. The 6-plot shows that the model
validation. assumptions are satisfied.

5. Plot the residuals against 5. The detailed residual plot suggests
the predictor variable. the homogeneous variances for the
errors more clearly.

5. Compare the fits.

1. Plot predicted values from each 1. The transformed and weighted fits
of the three models with the generate only slightly different
data. predicted values, but the model
assumptions are not violated.


4. Process Modeling
4.6. Case Studies in Process Modeling

4.6.4. Thermal Expansion of Copper Case Study

Rational This case study illustrates the use of a class of nonlinear models called
Function rational function models. The data set used is the thermal expansion of
Models copper related to temperature.

This data set was provided by the NIST scientist Thomas Hahn.

Contents 1. Background and Data
2. Rational Function Models
3. Initial Plot of Data
4. Fit Quadratic/Quadratic Model
5. Fit Cubic/Cubic Model
6. Work This Example Yourself

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.1. Background and Data

Description The response variable for this data set is the coefficient of thermal
of the Data expansion for copper. The predictor variable is temperature in degrees
kelvin. There were 236 data points collected.

These data were provided by the NIST scientist Thomas Hahn.

Resulting
Data
Coefficient
of Thermal Temperature
Expansion (Degrees
of Copper Kelvin)
---------------------------
0.591 24.41
1.547 34.82
2.902 44.09
2.894 45.07
4.703 54.98
6.307 65.51
7.030 70.53
7.898 75.70
9.470 89.57
9.484 91.14
10.072 96.40
10.163 97.19
11.615 114.26
12.005 120.25
12.478 127.08
12.982 133.55
12.970 133.61
13.926 158.67
14.452 172.74
14.404 171.31
15.190 202.14
15.550 220.55


15.528 221.05 12.786 134.03


15.499 221.39 14.067 163.19
16.131 250.99 13.974 163.48
16.438 268.99 14.462 175.70
16.387 271.80 14.464 179.86
16.549 271.97 15.381 211.27
16.872 321.31 15.483 217.78
16.830 321.69 15.590 219.14
16.926 330.14 16.075 262.52
16.907 333.03 16.347 268.01
16.966 333.47 16.181 268.62
17.060 340.77 16.915 336.25
17.122 345.65 17.003 337.23
17.311 373.11 16.978 339.33
17.355 373.79 17.756 427.38
17.668 411.82 17.808 428.58
17.767 419.51 17.868 432.68
17.803 421.59 18.481 528.99
17.765 422.02 18.486 531.08
17.768 422.47 19.090 628.34
17.736 422.61 16.062 253.24
17.858 441.75 16.337 273.13
17.877 447.41 16.345 273.66
17.912 448.70 16.388 282.10
18.046 472.89 17.159 346.62
18.085 476.69 17.116 347.19
18.291 522.47 17.164 348.78
18.357 522.62 17.123 351.18
18.426 524.43 17.979 450.10
18.584 546.75 17.974 450.35
18.610 549.53 18.007 451.92
18.870 575.29 17.993 455.56
18.795 576.00 18.523 552.22
19.111 625.55 18.669 553.56
0.367 20.15 18.617 555.74
0.796 28.78 19.371 652.59
0.892 29.57 19.330 656.20
1.903 37.41 0.080 14.13
2.150 39.12 0.248 20.41
3.697 50.24 1.089 31.30
5.870 61.38 1.418 33.84
6.421 66.25 2.278 39.70
7.422 73.42 3.624 48.83
9.944 95.52 4.574 54.50
11.023 107.32 5.556 60.41
11.870 122.04 7.267 72.77


7.695 75.25 15.651 226.86


9.136 86.84 15.746 229.65
9.959 94.88 16.216 258.27
9.957 96.40 16.445 273.77
11.600 117.37 16.965 339.15
13.138 139.08 17.121 350.13
13.564 147.73 17.206 362.75
13.871 158.63 17.250 371.03
13.994 161.84 17.339 393.32
14.947 192.11 17.793 448.53
15.473 206.76 18.123 473.78
15.379 209.07 18.49 511.12
15.455 213.32 18.566 524.70
15.908 226.44 18.645 548.75
16.114 237.12 18.706 551.64
17.071 330.90 18.924 574.02
17.135 358.72 19.100 623.86
17.282 370.77 0.375 21.46
17.368 372.72 0.471 24.33
17.483 396.24 1.504 33.43
17.764 416.59 2.204 39.22
18.185 484.02 2.813 44.18
18.271 495.47 4.765 55.02
18.236 514.78 9.835 94.33
18.237 515.65 10.040 96.44
18.523 519.47 11.946 118.82
18.627 544.47 12.596 128.48
18.665 560.11 13.303 141.94
19.086 620.77 13.922 156.92
0.214 18.97 14.440 171.65
0.943 28.93 14.951 190.00
1.429 33.91 15.627 223.26
2.241 40.03 15.639 223.88
2.951 44.66 15.814 231.50
3.782 49.87 16.315 265.05
4.757 55.16 16.334 269.44
5.602 60.90 16.430 271.78
7.169 72.08 16.423 273.46
8.920 85.15 17.024 334.61
10.055 97.06 17.009 339.79
12.035 119.63 17.165 349.52
12.861 133.27 17.134 358.18
13.436 143.84 17.349 377.98
14.167 161.91 17.576 394.77
14.755 180.67 17.848 429.66
15.168 198.44 18.090 468.22


18.276 487.27
18.404 519.54
18.519 523.03
19.133 612.99
19.074 638.59
19.239 641.36
19.280 622.05
19.101 631.50
19.398 663.97
19.252 646.90
19.890 748.29
20.007 749.21
19.929 750.14
19.268 647.04
19.324 646.89
20.049 746.90
20.107 748.43
20.062 747.35
20.065 749.27
19.286 647.61
19.972 747.78
20.088 750.51
20.743 851.37
20.830 845.97
20.935 847.54
21.035 849.93
20.930 851.61
21.074 849.75
21.085 850.98
20.935 848.23

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.2. Rational Function Models

Before proceeding with the case study, some explanation of rational
function models is required.

Polynomial A polynomial function is one that has the form
Functions

    f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n

with n denoting a non-negative integer that defines the degree of the
polynomial. A polynomial with a degree of 0 is simply a constant, with a
degree of 1 is a line, with a degree of 2 is a quadratic, with a degree of 3 is a
cubic, and so on.

Rational A rational function is simply the ratio of two polynomial functions.
Functions

    f(x) = (a_0 + a_1 x + ... + a_n x^n) / (b_0 + b_1 x + ... + b_m x^m)

with n denoting a non-negative integer that defines the degree of the
numerator and m is a non-negative integer that defines the degree of the
denominator. For fitting rational function models, the constant term in the
denominator is usually set to 1.

Rational functions are typically identified by the degrees of the numerator
and denominator. For example, a quadratic for the numerator and a cubic for
the denominator is identified as a quadratic/cubic rational function. The
graphs of some common rational functions are shown in an appendix.
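The definition can be sketched directly in code. The fragment below is illustrative only (the coefficient values are made up); it evaluates a rational function as the ratio of two polynomials, here a quadratic over a cubic:

import numpy as np

def rational(x, num_coefs, den_coefs):
    # numpy.polyval expects coefficients ordered from the highest degree down
    return np.polyval(num_coefs, x) / np.polyval(den_coefs, x)

x = np.linspace(0.0, 5.0, 11)
y = rational(x, [1.0, -2.0, 3.0], [0.5, 0.0, 1.0, 1.0])   # quadratic/cubic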


Polynomial Historically, polynomial models are among the most frequently used
Models empirical models for fitting functions. These models are popular for the
following reasons.
1. Polynomial models have a simple form.
2. Polynomial models have well known and understood properties.
3. Polynomial models have moderate flexibility of shapes.
4. Polynomial models are a closed family. Changes of location and scale
in the raw data result in a polynomial model being mapped to a
polynomial model. That is, polynomial models are not dependent on
the underlying metric.
5. Polynomial models are computationally easy to use.
However, polynomial models also have the following limitations.
1. Polynomial models have poor interpolatory properties. High-degree
polynomials are notorious for oscillations between exact-fit values.
2. Polynomial models have poor extrapolatory properties. Polynomials
may provide good fits within the range of data, but they will
frequently deteriorate rapidly outside the range of the data.
3. Polynomial models have poor asymptotic properties. By their nature,
polynomials have a finite response for finite values and have an
infinite response if and only if the value is infinite. Thus
polynomials may not model asymptotic phenomena very well.
4. Polynomial models have a shape/degree tradeoff. In order to model
data with a complicated structure, the degree of the model must be
high, indicating that the associated number of parameters to be
estimated will also be high. This can result in highly unstable models.

Rational A rational function model is a generalization of the polynomial model.
Function Rational function models contain polynomial models as a subset (i.e., the
Models case when the denominator is a constant).
If modeling via polynomial models is inadequate due to any of the
limitations above, you should consider a rational function model.

Advantages Rational function models have the following advantages.
1. Rational function models have a moderately simple form.
2. Rational function models are a closed family. As with polynomial
models, this means that rational function models are not dependent on
the underlying metric.
3. Rational function models can take on an extremely wide range of
shapes, accommodating a much wider range of shapes than does the
polynomial family.
4. Rational function models have better interpolatory properties than
polynomial models. Rational functions are typically smoother and less
oscillatory than polynomial models.
5. Rational functions have excellent extrapolatory powers. Rational
functions can typically be tailored to model the function not only
within the domain of the data, but also so as to be in agreement with
theoretical/asymptotic behavior outside the domain of interest.
6. Rational function models have excellent asymptotic properties.
Rational functions can be either finite or infinite for finite values, or
finite or infinite for infinite values. Thus, rational functions can
easily be incorporated into a rational function model.
7. Rational function models can often be used to model complicated
structure with a fairly low degree in both the numerator and
denominator. This in turn means that fewer coefficients will be
required compared to the polynomial model.
8. Rational function models are moderately easy to handle
computationally. Although they are nonlinear models, rational
function models are particularly easy nonlinear models to fit.

Disadvantages Rational function models have the following disadvantages.
1. The properties of the rational function family are not as well known to
engineers and scientists as are those of the polynomial family. The
literature on the rational function family is also more limited. Because
the properties of the family are often not well understood, it can be
difficult to answer the following modeling question:
Given that data has a certain shape, what values should be
chosen for the degree of the numerator and the degree of the
denominator?
2. Unconstrained rational function fitting can, at times, result in
undesired nuisance asymptotes (vertically) due to roots in the
denominator polynomial. The range of values affected by the
function "blowing up" may be quite narrow, but such asymptotes,
when they occur, are a nuisance for local interpolation in the


neighborhood of the asymptote point. These asymptotes are easy to


detect by a simple plot of the fitted function over the range of the
data. Such asymptotes should not discourage you from considering
rational function models as a choice for empirical modeling. These
nuisance asymptotes occur occasionally and unpredictably, but the
gain in flexibility of shapes is well worth the chance that they may
occur.

Starting One common difficulty in fitting nonlinear models is finding adequate
Values for starting values. A major advantage of rational function models is the ability
Rational to compute starting values using a linear least squares fit.
Function
Models To do this, choose p points from the data set, with p denoting the number of
parameters in the rational model. For example, given the linear/quadratic
model

    y = (A0 + A1 x) / (1 + B1 x + B2 x^2)

we need to select four representative points.

We then perform a linear fit on the model

    y = A0 + A1 x + ... + A_pn x^pn - B1 x y - B2 x^2 y - ... - B_pd x^pd y

Here, pn and pd are the degrees of the numerator and denominator,
respectively, and the X and Y contain the subset of points, not the full data
set. The estimated coefficients from this linear fit are used as the starting
values for fitting the nonlinear model to the full data set.

Note: This type of fit, with the response variable appearing on both sides of
the function, should only be used to obtain starting values for the nonlinear
fit. The statistical properties of fits like this are not well understood.

The subset of points should be selected over the range of the data. It is not
critical which points are selected, although you should avoid points that are
obvious outliers.

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.3. Initial Plot of Data

Plot The first step in fitting a nonlinear function is to simply plot the data.
of
Data

This plot initially shows a fairly steep slope that levels off to a more gradual slope. This type of
curve can often be modeled with a rational function model.
The plot also indicates that there do not appear to be any outliers in this data.
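A sketch of this first step in Python (the case study itself used Dataplot; temp and thermexp are assumed to hold the two data columns listed earlier):

import matplotlib.pyplot as plt

plt.plot(temp, thermexp, "o")
plt.xlabel("Temperature (degrees Kelvin)")
plt.ylabel("Coefficient of thermal expansion of copper")
plt.show()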


4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.4. Quadratic/Quadratic Rational Function Model

Q/Q We used Dataplot to fit the Q/Q rational function model. Dataplot first uses the EXACT RATIONAL FIT command to
Rational generate the starting values and then the FIT command to generate the nonlinear fit.
Function
Model We used the following 5 points to generate the starting values.

TEMP THERMEXP
---- --------
10 0
50 5
120 12
200 15
800 20

Exact Dataplot generated the following output from the EXACT RATIONAL FIT command. The output has been edited for
Rational display.
Fit Output

EXACT RATIONAL FUNCTION FIT
NUMBER OF POINTS IN FIRST SET = 5
DEGREE OF NUMERATOR = 2
DEGREE OF DENOMINATOR = 2

NUMERATOR --A0 A1 A2 = -0.301E+01 0.369E+00 -0.683E-02
DENOMINATOR--B0 B1 B2 = 0.100E+01 -0.112E-01 -0.306E-03

APPLICATION OF EXACT-FIT COEFFICIENTS
TO SECOND PAIR OF VARIABLES--

NUMBER OF POINTS IN SECOND SET = 236
NUMBER OF ESTIMATED COEFFICIENTS = 5
RESIDUAL DEGREES OF FREEDOM = 231

RESIDUAL STANDARD DEVIATION (DENOM=N-P) = 0.17248161E+01
AVERAGE ABSOLUTE RESIDUAL (DENOM=N) = 0.82943726E+00
LARGEST (IN MAGNITUDE) POSITIVE RESIDUAL = 0.27050836E+01
LARGEST (IN MAGNITUDE) NEGATIVE RESIDUAL = -0.11428773E+02
LARGEST (IN MAGNITUDE) ABSOLUTE RESIDUAL = 0.11428773E+02

The important information in this output are the estimates for A0, A1, A2, B1, and B2 (B0 is always set to 1). These
values are used as the starting values for the fit in the next section.

Nonlinear Dataplot generated the following output for the nonlinear fit. The output has been edited for display.
Fit Output

LEAST SQUARES NON-LINEAR FIT
SAMPLE SIZE N = 236
MODEL--THERMEXP =(A0+A1*TEMP+A2*TEMP**2)/(1+B1*TEMP+B2*TEMP**2)
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.8131711930D-01
REPLICATION DEGREES OF FREEDOM = 1
NUMBER OF DISTINCT SUBSETS = 235

FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 A0 -8.12326 (0.3908 ) -21.
2 A1 0.513233 (0.5418E-01) 9.5
3 A2 -0.736978E-02 (0.1705E-02) -4.3
4 B1 -0.689864E-02 (0.3960E-02) -1.7
5 B2 -0.332089E-03 (0.7890E-04) -4.2

RESIDUAL STANDARD DEVIATION = 0.5501883030
RESIDUAL DEGREES OF FREEDOM = 231
REPLICATION STANDARD DEVIATION = 0.0813171193
REPLICATION DEGREES OF FREEDOM = 1
LACK OF FIT F RATIO = 45.9729 = THE 88.2878% POINT OF THE
F DISTRIBUTION WITH 230 AND 1 DEGREES OF FREEDOM

The above output yields the following estimated model.

    THERMEXP = (-8.12326 + 0.513233*TEMP - 0.00736978*TEMP**2)/
               (1 - 0.00689864*TEMP - 0.000332089*TEMP**2)

Plot of
Q/Q
Rational
Function
Fit
We generate a plot of the fitted rational function model with the raw data.

Looking at the fitted function with the raw data appears to show a reasonable fit.
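Both steps, the exact fit for starting values and the nonlinear fit, can also be sketched in Python (an illustration only; Dataplot's EXACT RATIONAL FIT and FIT commands produced the results shown above, and temp and thermexp are assumed to hold the full data set):

import numpy as np
from scipy.optimize import curve_fit

def qq_model(x, a0, a1, a2, b1, b2):
    return (a0 + a1 * x + a2 * x ** 2) / (1.0 + b1 * x + b2 * x ** 2)

# starting values from a linear fit of y = A0 + A1 x + A2 x^2 - B1 x y - B2 x^2 y
# to the five representative points listed above
xs = np.array([10.0, 50.0, 120.0, 200.0, 800.0])
ys = np.array([0.0, 5.0, 12.0, 15.0, 20.0])
X = np.column_stack([np.ones_like(xs), xs, xs ** 2, -xs * ys, -xs ** 2 * ys])
p0, *_ = np.linalg.lstsq(X, ys, rcond=None)

# nonlinear fit on the full data set
popt, pcov = curve_fit(qq_model, temp, thermexp, p0=p0)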


6-Plot for Although the plot of the fitted function with the raw data appears to show a reasonable fit, we need to validate the model
Model assumptions. The 6-plot is an effective tool for this purpose.
Validation

The plot of the residuals versus the predictor variable temperature (row 1, column 2) and of the residuals versus the
predicted values (row 1, column 3) indicate a distinct pattern in the residuals. This suggests that the assumption of random
errors is badly violated.

Residual
Plot
We generate a full-sized residual plot in order to show more detail.

The full-sized residual plot clearly shows the distinct pattern in the residuals. When residuals exhibit a clear pattern, the
corresponding errors are probably not random.
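For reference, a 6-plot of the kind used here for model validation could be put together in Python roughly as follows (a sketch only; the handbook's 6-plots were generated with Dataplot). The panel layout follows the description used in this chapter: data and fit, residuals versus predictor, and residuals versus predicted values in row 1; lag plot, histogram, and normal probability plot of the residuals in row 2.

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def six_plot(x, y, y_pred):
    r = y - y_pred
    fig, ax = plt.subplots(2, 3, figsize=(10, 6))
    order = np.argsort(x)
    ax[0, 0].plot(x, y, "o", x[order], y_pred[order], "-")
    ax[0, 0].set_title("data and fitted values")
    ax[0, 1].plot(x, r, "o")
    ax[0, 1].set_title("residuals vs predictor")
    ax[0, 2].plot(y_pred, r, "o")
    ax[0, 2].set_title("residuals vs predicted")
    ax[1, 0].plot(r[:-1], r[1:], "o")
    ax[1, 0].set_title("lag plot of residuals")
    ax[1, 1].hist(r)
    ax[1, 1].set_title("histogram of residuals")
    stats.probplot(r, plot=ax[1, 2])
    ax[1, 2].set_title("normal probability plot")
    plt.tight_layout()
    plt.show()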


4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.5. Cubic/Cubic Rational Function Model

C/C Since the Q/Q model did not describe the data well, we next fit a cubic/cubic (C/C) rational
Rational function model.
Function
Model We used Dataplot to fit the C/C rational function model with the following 7 subset points to
generate the starting values.

TEMP THERMEXP
---- --------
10 0
30 2
40 3
50 5
120 12
200 15
800 20

Exact Dataplot generated the following output from the exact rational fit command. The output has been
Rational edited for display.
Fit Output

EXACT RATIONAL FUNCTION FIT
NUMBER OF POINTS IN FIRST SET = 7
DEGREE OF NUMERATOR = 3
DEGREE OF DENOMINATOR = 3

NUMERATOR --A0 A1 A2 A3 =
-0.2322993E+01 0.3528976E+00 -0.1382551E-01 0.1765684E-03
DENOMINATOR--B0 B1 B2 B3 =
0.1000000E+01 -0.3394208E-01 0.1099545E-03 0.7905308E-05

APPLICATION OF EXACT-FIT COEFFICIENTS
TO SECOND PAIR OF VARIABLES--

NUMBER OF POINTS IN SECOND SET = 236
NUMBER OF ESTIMATED COEFFICIENTS = 7
RESIDUAL DEGREES OF FREEDOM = 229

RESIDUAL SUM OF SQUARES = 0.78246452E+02
RESIDUAL STANDARD DEVIATION (DENOM=N-P) = 0.58454049E+00
AVERAGE ABSOLUTE RESIDUAL (DENOM=N) = 0.46998626E+00
LARGEST (IN MAGNITUDE) POSITIVE RESIDUAL = 0.95733070E+00
LARGEST (IN MAGNITUDE) NEGATIVE RESIDUAL = -0.13497944E+01
LARGEST (IN MAGNITUDE) ABSOLUTE RESIDUAL = 0.13497944E+01

The important information in this output are the estimates for A0, A1, A2, A3, B1, B2, and B3
(B0 is always set to 1). These values are used as the starting values for the fit in the next section.

Nonlinear Dataplot generated the following output for the nonlinear fit. The output has been edited for
Fit Output display.

LEAST SQUARES NON-LINEAR FIT
SAMPLE SIZE N = 236
MODEL--THERMEXP =(A0+A1*TEMP+A2*TEMP**2+A3*TEMP**3)/
(1+B1*TEMP+B2*TEMP**2+B3*TEMP**3)
REPLICATION CASE
REPLICATION STANDARD DEVIATION = 0.8131711930D-01
REPLICATION DEGREES OF FREEDOM = 1
NUMBER OF DISTINCT SUBSETS = 235

FINAL PARAMETER ESTIMATES (APPROX. ST. DEV.) T VALUE
1 A0 1.07913 (0.1710 ) 6.3
2 A1 -0.122801 (0.1203E-01) -10.
3 A2 0.408837E-02 (0.2252E-03) 18.
4 A3 -0.142848E-05 (0.2610E-06) -5.5
5 B1 -0.576111E-02 (0.2468E-03) -23.
6 B2 0.240629E-03 (0.1060E-04) 23.
7 B3 -0.123254E-06 (0.1217E-07) -10.

RESIDUAL STANDARD DEVIATION = 0.0818038210
RESIDUAL DEGREES OF FREEDOM = 229
REPLICATION STANDARD DEVIATION = 0.0813171193
REPLICATION DEGREES OF FREEDOM = 1
LACK OF FIT F RATIO = 1.0121 = THE 32.1265% POINT OF THE
F DISTRIBUTION WITH 228 AND 1 DEGREES OF FREEDOM

The above output yields the following estimated model.

    THERMEXP = (1.07913 - 0.122801*TEMP + 0.00408837*TEMP**2 - 0.00000142848*TEMP**3)/
               (1 - 0.00576111*TEMP + 0.000240629*TEMP**2 - 0.000000123254*TEMP**3)
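The same two-step sketch shown for the Q/Q model carries over directly; only the model function and the seven subset points change (again an illustration only, with temp and thermexp assumed to hold the full data set):

import numpy as np
from scipy.optimize import curve_fit

def cc_model(x, a0, a1, a2, a3, b1, b2, b3):
    num = a0 + a1 * x + a2 * x ** 2 + a3 * x ** 3
    den = 1.0 + b1 * x + b2 * x ** 2 + b3 * x ** 3
    return num / den

xs = np.array([10.0, 30.0, 40.0, 50.0, 120.0, 200.0, 800.0])
ys = np.array([0.0, 2.0, 3.0, 5.0, 12.0, 15.0, 20.0])
X = np.column_stack([np.ones_like(xs), xs, xs ** 2, xs ** 3,
                     -xs * ys, -xs ** 2 * ys, -xs ** 3 * ys])
p0, *_ = np.linalg.lstsq(X, ys, rcond=None)   # exact-fit starting values

popt, pcov = curve_fit(cc_model, temp, thermexp, p0=p0)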


Plot of We generate a plot of the fitted rational function model with the raw data.
C/C
Rational
Function
Fit

The fitted function with the raw data appears to show a reasonable fit.

6-Plot for Although the plot of the fitted function with the raw data appears to show a reasonable fit, we
Model need to validate the model assumptions. The 6-plot is an effective tool for this purpose.
Validation

The 6-plot indicates no significant violation of the model assumptions. That is, the errors appear
to have constant location and scale (from the residual plot in row 1, column 2), seem to be
random (from the lag plot in row 2, column 1), and approximated well by a normal distribution
(from the histogram and normal probability plots in row 2, columns 2 and 3).

Residual
Plot
We generate a full-sized residual plot in order to show more detail.


The full-sized residual plot suggests that the assumptions of constant location and scale for the
errors are valid. No distinguishing pattern is evident in the residuals.

Conclusion We conclude that the cubic/cubic rational function model does in fact provide a satisfactory
model for this data set.

4. Process Modeling
4.6. Case Studies in Process Modeling
4.6.4. Thermal Expansion of Copper Case Study

4.6.4.6. Work This Example Yourself

View This page allows you to repeat the analysis outlined in the case study
Dataplot description on the previous page using Dataplot, if you have
Macro for downloaded and installed it. Output from each analysis step below will
this Case be displayed in one or more of the Dataplot windows. The four main
Study windows are the Output window, the Graphics window, the Command
History window and the Data Sheet window. Across the top of the main
windows there are menus for executing Dataplot commands. Across the
bottom is a command entry window where commands can be typed in.

Data Analysis Steps Results and Conclusions

Click on the links below to start Dataplot and run this case
study yourself. Each step may use results from previous
steps, so please be patient. Wait until the software verifies
that the current step is complete before clicking on the next
step.

The links in this column will connect you with more detailed
information about each analysis step from the case study
description.

1. Get set up and started.

1. Read in the data. 1. You have read 2 columns of numbers
into Dataplot, variables thermexp
and temp.
2. Plot the data.

1. Plot thermexp versus temp. 1. Initial plot indicates that a


nonlinear model is required.


3. Fit a Q/Q rational function model.

1. Perform the Q/Q fit and plot the 1. The model parameters are estimated.
predicted values with the raw data. The plot of the predicted values with
the raw data seems to indicate a
reasonable fit.

2. Perform model validation by 2. The 6-plot shows that the
generating a 6-plot. residuals follow a distinct
pattern and suggests that the
randomness assumption for the
errors is violated.

3. Generate a full-sized plot of the 3. The full-sized residual plot shows
residuals to show greater detail. the non-random pattern more
clearly.

4. Fit a C/C rational function model.

1. Perform the C/C fit and plot the 1. The model parameters are estimated.
predicted values with the raw data. The plot of the predicted values with
the raw data seems to indicate a
reasonable fit.

2. Perform model validation by 2. The 6-plot does not indicate any
generating a 6-plot. notable violations of the
assumptions.

3. Generate a full-sized plot of the 3. The full-sized residual plot shows
residuals to show greater detail. no notable assumption violations.

4. Process Modeling

4.7. References For Chapter 4: Process
Modeling

Handbook of Mathematical Functions with Formulas, Graphs and Mathematical Tables
(1964) Abramowitz M. and Stegun I. (eds.), U.S. Government Printing Office,
Washington, DC, 1046 p.

Berkson J. (1950) "Are There Two Regressions?," Journal of the American Statistical
Association, Vol. 45, pp. 164-180.

Carroll, R.J. and Ruppert D. (1988) Transformation and Weighting in Regression,
Chapman and Hall, New York.

Cleveland, W.S. (1979) "Robust Locally Weighted Regression and Smoothing
Scatterplots," Journal of the American Statistical Association, Vol. 74, pp. 829-836.

Cleveland, W.S. and Devlin, S.J. (1988) "Locally Weighted Regression: An Approach to
Regression Analysis by Local Fitting," Journal of the American Statistical Association,
Vol. 83, pp. 596-610.

Fuller, W.A. (1987) Measurement Error Models, John Wiley and Sons, New York.

Graybill, F.A. (1976) Theory and Application of the Linear Model, Duxbury Press,
North Scituate, Massachusetts.

Graybill, F.A. and Iyer, H.K. (1994) Regression Analysis: Concepts and Applications,
Duxbury Press, Belmont, California.

Harter, H.L. (1983) "Least Squares," Encyclopedia of Statistical Sciences, Kotz, S. and
Johnson, N.L., eds., John Wiley & Sons, New York, pp. 593-598.

Montgomery, D.C. (2001) Design and Analysis of Experiments, 5th ed., Wiley, New
York.

Neter, J., Wasserman, W., and Kutner, M. (1983) Applied Linear Regression Models,
Richard D. Irwin Inc., Homewood, IL.

Ryan, T.P. (1997) Modern Regression Methods, Wiley, New York.

Seber, G.A.F. and Wild, C.F. (1989) Nonlinear Regression, John Wiley and Sons, New
York.

Stigler, S.M. (1978) "Mathematical Statistics in the Early States," The Annals of
Statistics, Vol. 6, pp. 239-265.
Stigler, S.M. (1986) The History of Statistics: The Measurement of Uncertainty Before
1900, The Belknap Press of Harvard University Press, Cambridge, Massachusetts.
4. Process Modeling

4.8. Some Useful Functions for Process


Modeling
Overview of This section lists some functions commonly-used for process modeling.
Section 4.8 Constructing an exhaustive list of useful functions is impossible, of
course, but the functions given here will often provide good starting
points when an empirical model must be developed to describe a
particular process.

Each function listed here is classified into a family of related functions,


if possible. Its statistical type, linear or nonlinear in the parameters, is
also given. Special features of each function, such as asymptotes, are
also listed along with the function's domain (the set of allowable input
values) and range (the set of possible output values). Plots of some of
the different shapes that each function can assume are also included.

Contents of 1. Univariate Functions


Section 4.8 1. Polynomials
2. Rational Functions


4. Process Modeling
4.8. Some Useful Functions for Process Modeling

4.8.1. Univariate Functions

Overview of Univariate functions are listed in this section. They are useful for
Section 8.1 modeling in their own right and they can serve as the basic building
blocks for functions of higher dimension. Section 4.4.2.1 offers some
advice on the development of empirical models for higher-dimension
processes from univariate functions.

Contents of 1. Polynomials
Section 8.1 2. Rational Functions

4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions

4.8.1.1. Polynomial Functions

Polynomial A polynomial function is one that has the form
Functions

    f(x) = a_0 + a_1 x + a_2 x^2 + ... + a_n x^n

with n denoting a non-negative integer that defines the degree of the
polynomial. A polynomial with a degree of 0 is simply a constant, with a
degree of 1 is a line, with a degree of 2 is a quadratic, with a degree of 3 is a
cubic, and so on.

Polynomial Historically, polynomial models are among the most frequently used
Models: empirical models for fitting functions. These models are popular for the
Advantages following reasons.
1. Polynomial models have a simple form.
2. Polynomial models have well known and understood properties.
3. Polynomial models have moderate flexibility of shapes.
4. Polynomial models are a closed family. Changes of location and scale
in the raw data result in a polynomial model being mapped to a
polynomial model. That is, polynomial models are not dependent on
the underlying metric.
5. Polynomial models are computationally easy to use.


Polynomial However, polynomial models also have the following limitations.


Model: 1. Polynomial models have poor interpolatory properties. High degree
Limitations polynomials are notorious for oscillations between exact-fit values.
2. Polynomial models have poor extrapolatory properties. Polynomials
may provide good fits within the range of data, but they will
frequently deteriorate rapidly outside the range of the data.
3. Polynomial models have poor asymptotic properties. By their nature,
polynomials have a finite response for finite values and have an
infinite response if and only if the value is infinite. Thus
polynomials may not model asymptotic phenomena very well.
4. Polynomial models have a shape/degree tradeoff. In order to model
data with a complicated structure, the degree of the model must be
high, indicating that the associated number of parameters to be
estimated will also be high. This can result in highly unstable models.

Example The load cell calibration case study contains an example of fitting a
quadratic polynomial model.

Specific 1. Straight Line
Polynomial 2. Quadratic Polynomial
Functions 3. Cubic Polynomial

4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.1. Polynomial Functions

4.8.1.1.1. Straight Line

Function:

Function
Family: Polynomial


Statistical
Type: Linear

Domain: 4. Process Modeling


4.8. Some Useful Functions for Process Modeling
Range: 4.8.1. Univariate Functions
4.8.1.1. Polynomial Functions

Special
Features: None 4.8.1.1.2. Quadratic Polynomial
Additional
Examples: None

Function:

Function
Family: Polynomial


Statistical
Type: Linear

Domain:

Range:

Special
Features: None

Additional
Examples:


Statistical
Type: Linear

4. Process Modeling Domain:


4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
Range:
4.8.1.1. Polynomial Functions

Special
4.8.1.1.3. Cubic Polynomial Features: None

Additional
Examples:

Function:

Function
Family: Polynomial


Advantages Rational function models have the following advantages.


1. Rational function models have a moderately simple form.
2. Rational function models are a closed family. As with polynomial
4. Process Modeling models, this means that rational function models are not dependent on
4.8. Some Useful Functions for Process Modeling the underlying metric.
4.8.1. Univariate Functions 3. Rational function models can take on an extremely wide range of
shapes, accommodating a much wider range of shapes than does the
polynomial family.
4.8.1.2. Rational Functions 4. Rational function models have better interpolatory properties than
polynomial models. Rational functions are typically smoother and less
oscillatory than polynomial models.
Rational A rational function is simply the ratio of two polynomial functions
Functions 5. Rational functions have excellent extrapolatory powers. Rational
functions can typically be tailored to model the function not only
within the domain of the data, but also so as to be in agreement with
theoretical/asymptotic behavior outside the domain of interest.
with n denoting a non-negative integer that defines the degree of the
6. Rational function models have excellent asymptotic properties.
numerator and m denoting a non-negative integer that defines the degree of
the denominator. When fitting rational function models, the constant term in Rational functions can be either finite or infinite for finite values, or
the denominator is usually set to 1. finite or infinite for infinite values. Thus, rational functions can
easily be incorporated into a rational function model.
Rational functions are typically identified by the degrees of the numerator 7. Rational function models can often be used to model complicated
and denominator. For example, a quadratic for the numerator and a cubic for structure with a fairly low degree in both the numerator and
the denominator is identified as a quadratic/cubic rational function. denominator. This in turn means that fewer coefficients will be
required compared to the polynomial model.
Rational A rational function model is a generalization of the polynomial model. 8. Rational function models are moderately easy to handle
Function Rational function models contain polynomial models as a subset (i.e., the computationally. Although they are nonlinear models, rational
Models case when the denominator is a constant). function models are a particularly easy nonlinear models to fit.
If modeling via polynomial models is inadequate due to any of the
limitations above, you should consider a rational function model. Disadvantages Rational function models have the following disadvantages.
1. The properties of the rational function family are not as well known to
Note that fitting rational function models is also referred to as the Pade engineers and scientists as are those of the polynomial family. The
approximation. literature on the rational function family is also more limited. Because
the properties of the family are often not well understood, it can be
difficult to answer the following modeling question:
Given that data has a certain shape, what values should be
chosen for the degree of the numerator and the degree of the
denominator?
2. Unconstrained rational function fitting can, at times, result in
undesired nuisance asymptotes (vertically) due to roots in the
denominator polynomial. The range of values affected by the
function "blowing up" may be quite narrow, but such asymptotes,
when they occur, are a nuisance for local interpolation in the
neighborhood of the asymptote point. These asymptotes are easy to


detect by a simple plot of the fitted function over the range of the The subset of points should be selected over the range of the data. It is not
data. Such asymptotes should not discourage you from considering critical which points are selected, although you should avoid points that are
rational function models as a choice for empirical modeling. These obvious outliers.
nuisance asymptotes occur occasionally and unpredictably, but the
gain in flexibility of shapes is well worth the chance that they may Example The thermal expansion of copper case study contains an example of fitting a
occur.
rational function model.
General The following are general properties of rational functions.
Specific 1. Constant / Linear Rational Function
Properties of ● If the numerator and denominator are of the same degree (n=m), then Rational 2. Linear / Linear Rational Function
Rational y = an/bm is a horizontal asymptote of the function. Functions
Functions 3. Linear / Quadratic Rational Function
● If the degree of the denominator is greater than the degree of the
numerator, then y = 0 is a horizontal asymptote. 4. Quadratic / Linear Rational Function
● If the degree of the denominator is less than the degree of the 5. Quadratic / Quadratic Rational Function
numerator, then there are no horizontal asymptotes.
6. Cubic / Linear Rational Function
● When x is equal to a root of the denominator polynomial, the
denominator is zero and there is a vertical asymptote. The exception 7. Cubic / Quadratic Rational Function
is the case when the root of the denominator is also a root of the 8. Linear / Cubic Rational Function
numerator. However, for this case we can cancel a factor from both 9. Quadratic / Cubic Rational Function
the numerator and denominator (and we effectively have a
lower-degree rational function). 10. Cubic / Cubic Rational Function
11. Determining m and n for Rational Function Models
Starting One common difficulty in fitting nonlinear models is finding adequate
Values for starting values. A major advantage of rational function models is the ability
Rational to compute starting values using a linear least squares fit.
Function
Models To do this, choose p points from the data set, with p denoting the number of
parameters in the rational model. For example, given the linear/quadratic
model

we need to select four representative points.


We then perform a linear fit on the model

Here, pn and pd are the degrees of the numerator and denominator,


respectively, and the X and Y contain the subset of points, not the full data
set. The estimated coefficients from this fit made using the linear least
squares algorithm are used as the starting values for fitting the nonlinear
model to the full data set.
Note: This type of fit, with the response variable appearing on both sides of
the function, should only be used to obtain starting values for the nonlinear
fit. The statistical properties of models like this are not well understood.


Function
Family: Rational

Statistical
4. Process Modeling Type: Nonlinear
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions Domain:

4.8.1.2.1. Constant / Linear Rational Range:

Function Special
Features: Horizontal asymptote at:

and vertical asymptote at:

Additional
Examples:

Function:


Function
Family: Rational

Statistical
4. Process Modeling Type: Nonlinear
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions Domain:

4.8.1.2.2. Linear / Linear Rational Function Range:

Special
Features: Horizontal asymptote at:

and vertical asymptote at:

Additional
Examples:

Function:


Function Rational
Family:

Statistical Nonlinear
4. Process Modeling
Type:
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions Domain:

with undefined points at


4.8.1.2.3. Linear / Quadratic Rational
Function
There will be 0, 1, or 2 real solutions to this equation, corresponding to whether

is negative, zero, or positive.

Range:

Special Horizontal asymptote at:


Features:

and vertical asymptotes at:

There will be 0, 1, or 2 real solutions to this equation corresponding to whether

is negative, zero, or positive.

Additional
Examples:

Function:


Function Rational
Family:

Statistical
4. Process Modeling Nonlinear
Type:
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions Domain:

4.8.1.2.4. Quadratic / Linear Rational Range:

Function
with

and

Special Vertical asymptote at:


Features:

Additional
Examples:

Function:


Function Rational
Family:

Statistical Nonlinear
4. Process Modeling
4.8. Some Useful Functions for Process Modeling
Type:
4.8.1. Univariate Functions
4.8.1.2. Rational Functions Domain:

with undefined points at


4.8.1.2.5. Quadratic / Quadratic Rational
Function
There will be 0, 1, or 2 real solutions to this equation corresponding to whether

is negative, zero, or positive.

Range: The range is complicated and depends on the specific values of 1, ..., 5.

Special Horizontal asymptotes at:


Features:

and vertical asymptotes at:

There will be 0, 1, or 2 real solutions to this equation corresponding to whether

is negative, zero, or positive.

Additional
Examples:

Function:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.6. Cubic / Linear Rational Function

Function:


Function
Family: Rational

Statistical
Type: Nonlinear

Domain:

Range:

Special
Features: Vertical asymptote at:

Additional
Examples:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.7. Cubic / Quadratic Rational Function

Function:


Function Family: Rational

Statistical Type: Nonlinear

Domain: all real x, with undefined points at the roots of the denominator. There will be 0, 1, or 2
real solutions to this equation, corresponding to whether the discriminant of the denominator is
negative, zero, or positive.

Range:

Special Features: Vertical asymptotes at the roots of the denominator. There will be 0, 1, or 2
real solutions to this equation, corresponding to whether the discriminant of the denominator is
negative, zero, or positive.

Additional Examples:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.8. Linear / Cubic Rational Function

Function:


Function Family: Rational

Statistical Type: Nonlinear

Domain: all real x, with undefined points at the roots of the denominator (a cubic polynomial).
There will be 1, 2, or 3 roots, depending on the particular values of the parameters. Explicit
solutions for the roots of a cubic polynomial are complicated and are not given here. Many
mathematical and statistical software programs can determine the roots of a polynomial equation
numerically, and it is recommended that you use one of these programs if you need to know where
these roots occur.

Range: the entire real line, with the possible exception that zero may be excluded.

Special Features: Horizontal asymptote at y = 0, and vertical asymptotes at the roots of the
denominator. As above, there will be 1, 2, or 3 such roots, depending on the particular values of
the parameters.
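As noted above, the roots of a cubic denominator are best located numerically. The following
minimal sketch is not part of the handbook; it uses numpy.roots on a denominator assumed, for
illustration only, to be 1 + b1*x + b2*x**2 + b3*x**3 with invented coefficients.

    import numpy as np

    # Denominator 1 + b1*x + b2*x**2 + b3*x**3 with arbitrary example coefficients.
    b1, b2, b3 = -0.5, -2.0, 1.0

    # numpy.roots expects coefficients from the highest power down to the constant.
    roots = np.roots([b3, b2, b1, 1.0])

    # Keep only the (numerically) real roots; these locate the undefined points
    # and vertical asymptotes of the linear/cubic rational function.
    real_roots = roots[np.abs(roots.imag) < 1e-10].real
    print(np.sort(real_roots))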

Additional Examples:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.9. Quadratic / Cubic Rational Function

Function:

Function Family: Rational

Statistical Type: Nonlinear

Domain: all real x, with undefined points at the roots of the denominator (a cubic polynomial).
There will be 1, 2, or 3 roots, depending on the particular values of the parameters. Explicit
solutions for the roots of a cubic polynomial are complicated and are not given here. Many
mathematical and statistical software programs can determine the roots of a polynomial equation
numerically, and it is recommended that you use one of these programs if you need to know where
these roots occur.

Range: the entire real line, with the possible exception that zero may be excluded.

Special Features: Horizontal asymptote at y = 0, and vertical asymptotes at the roots of the
denominator. As above, there will be 1, 2, or 3 such roots, depending on the particular values of
the parameters.

Additional Examples:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.10. Cubic / Cubic Rational Function

Function:

Function Family: Rational

Statistical Type: Nonlinear

Domain: all real x, with undefined points at the roots of the denominator (a cubic polynomial).
There will be 1, 2, or 3 roots, depending on the particular values of the parameters. Explicit
solutions for the roots of a cubic polynomial are complicated and are not given here. Many
mathematical and statistical software programs can determine the roots of a polynomial equation
numerically, and it is recommended that you use one of these programs if you need to know where
these roots occur.

Range: the entire real line, with the exception that y equal to the horizontal asymptote value may
be excluded.

Special Features: Horizontal asymptote at the ratio of the leading coefficients of the numerator
and denominator, and vertical asymptotes at the roots of the denominator. As above, there will be
1, 2, or 3 such roots, depending on the particular values of the parameters.

Additional Examples:


4. Process Modeling
4.8. Some Useful Functions for Process Modeling
4.8.1. Univariate Functions
4.8.1.2. Rational Functions

4.8.1.2.11. Determining m and n for Rational Function Models

General Question: A general question for rational function models is:

    I have data to which I wish to fit a rational function. What degrees n and m should I use
    for the numerator and denominator, respectively?

Four Questions: To answer the above broad question, the following four specific questions need to
be answered.
1. What value should the function have at x = ∞? Specifically, is the value zero, a constant, or
   plus or minus infinity?
2. What slope should the function have at x = ∞? Specifically, is the derivative of the function
   zero, a constant, or plus or minus infinity?
3. How many times should the function equal zero (i.e., f(x) = 0) for finite x?
4. How many times should the slope equal zero (i.e., f'(x) = 0) for finite x?
These questions are answered by the analyst by inspection of the data and by theoretical
considerations of the phenomenon under study.

Each of these questions is addressed separately below.

Question 1: What Value Should the Function Have at x = ∞?
Given the rational function

    R(x) = Pn(x)/Qm(x) = (a0 + a1*x + ... + an*x^n) / (b0 + b1*x + ... + bm*x^m)

then asymptotically

    R(x) ~ (an/bm) * x^(n-m)

From this it follows that
● if n < m, R(∞) = 0
● if n = m, R(∞) = an/bm
● if n > m, R(∞) = ±∞
Conversely, if the fitted function f(x) is such that
● f(∞) = 0, this implies n < m
● f(∞) = constant, this implies n = m
● f(∞) = ±∞, this implies n > m

Question 2: What Slope Should the Function Have at x = ∞?
The slope is determined by the derivative of a function. The derivative of a rational function is

    R'(x) = [Pn'(x)*Qm(x) - Pn(x)*Qm'(x)] / [Qm(x)]^2

Asymptotically

    R'(x) ~ (n - m) * (an/bm) * x^(n-m-1)

From this it follows that
● if n < m, R'(∞) = 0
● if n = m, R'(∞) = 0
● if n = m + 1, R'(∞) = an/bm
● if n > m + 1, R'(∞) = ±∞
Conversely, if the fitted function f(x) is such that
● f'(∞) = 0, this implies n ≤ m
● f'(∞) = constant, this implies n = m + 1
● f'(∞) = ±∞, this implies n > m + 1

Question 3: How Many Times Should the Function Equal Zero for Finite x?
For finite x, R(x) = 0 only when the numerator polynomial, Pn, equals zero.

The numerator polynomial, and thus R(x) as well, can have between zero and n real roots. Thus, for
a given n, the number of real roots of R(x) is less than or equal to n.

Conversely, if the fitted function f(x) is such that, for finite x, the number of times f(x) = 0 is
k3, then n is greater than or equal to k3.
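A rough numerical way to apply Questions 1 and 2 to a candidate model is to evaluate the function,
and a finite-difference estimate of its slope, at increasingly large x. The sketch below is not
part of the handbook; the quadratic/linear example (n = 2, m = 1) and its coefficients are
invented, and by the rules above its value should grow without bound while its slope should level
off at a2/b1.

    def f(x):
        # Invented quadratic/linear rational function: n = 2, m = 1.
        a0, a1, a2 = 1.0, -2.0, 3.0      # numerator coefficients (assumed example)
        b1 = 0.5                         # denominator: 1 + b1*x
        return (a0 + a1*x + a2*x**2) / (1.0 + b1*x)

    for x in (1e2, 1e4, 1e6):
        h = x * 1e-3
        slope = (f(x + h) - f(x - h)) / (2.0 * h)   # finite-difference derivative
        print(f"x = {x:8.0e}   f(x) = {f(x):12.4e}   f'(x) ~ {slope:10.4f}")

    # f(x) keeps growing (so n > m), while the slope approaches a2/b1 = 6
    # (so n = m + 1), consistent with the rules above.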

Question 4: How Many Times Should the Slope Equal Zero for Finite x?
The derivative function, R'(x), of the rational function will equal zero when its numerator
polynomial equals zero. The number of real roots of a polynomial is between zero and the degree of
the polynomial.

For n not equal to m, the numerator polynomial of R'(x) has order n+m-1. For n equal to m, the
numerator polynomial of R'(x) has order n+m-2.

From this it follows that
● if n ≠ m, the number of real roots of R'(x), k4, is at most n+m-1.
● if n = m, the number of real roots of R'(x), k4, is at most n+m-2.
Conversely, if the fitted function f(x) is such that, for finite x and n ≠ m, the number of times
f'(x) = 0 is k4, then n+m-1 ≥ k4. Similarly, if the fitted function f(x) is such that, for finite x
and n = m, the number of times f'(x) = 0 is k4, then n+m-2 ≥ k4.

Tables for Determining Admissible Combinations of m and n: In summary, we can determine the
admissible combinations of n and m by using the following four tables to generate an n versus m
graph. Choose the simplest (n,m) combination for the degrees of the initial rational function
model.

    1. Desired value of f(∞)                 Relation of n to m
       0                                     n < m
       constant                              n = m
       ±∞                                    n > m

    2. Desired value of f'(∞)                Relation of n to m
       0                                     n < m + 1
       constant                              n = m + 1
       ±∞                                    n > m + 1

    3. For finite x, desired number, k3,
       of times f(x) = 0                     Relation of n to k3
       k3                                    n ≥ k3

    4. For finite x, desired number, k4,
       of times f'(x) = 0                    Relation of n to k4 and m
       k4 (n ≠ m)                            n ≥ (1 + k4) - m
       k4 (n = m)                            n ≥ (2 + k4) - m

Examples for Determining m and n: The goal is to go from a sample data set to a specific rational
function. The graphs below summarize some common shapes that rational functions can have and show
the admissible values and the simplest case for n and m. We typically start with the simplest
case. If the model validation indicates an inadequate model, we then try other rational functions
in the admissible region.

Shape 1

Shape 2
Shape 3

Shape 4

Shape 5

Shape 6

Shape 7

Shape 8

Shape 9

Shape 10
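The four tables above can also be applied mechanically. The following sketch is not part of the
handbook; it assumes the analyst encodes the desired behavior at infinity as 'zero', 'constant',
or 'infinity', supplies the desired counts k3 and k4, and wants the admissible (n, m) pairs listed
from simplest to most complex. The function name and the search limit max_degree are invented for
this illustration.

    def admissible_degrees(value_at_inf, slope_at_inf, k3, k4, max_degree=4):
        """List (n, m) pairs consistent with the four tables above.
        value_at_inf, slope_at_inf: 'zero', 'constant', or 'infinity'."""
        ok = []
        for total in range(0, 2 * max_degree + 1):        # smallest degrees first
            for n in range(0, max_degree + 1):
                m = total - n
                if not (0 <= m <= max_degree):
                    continue
                # Table 1: desired value of f at infinity
                if value_at_inf == 'zero' and not n < m: continue
                if value_at_inf == 'constant' and not n == m: continue
                if value_at_inf == 'infinity' and not n > m: continue
                # Table 2: desired slope of f at infinity
                if slope_at_inf == 'zero' and not n < m + 1: continue
                if slope_at_inf == 'constant' and not n == m + 1: continue
                if slope_at_inf == 'infinity' and not n > m + 1: continue
                # Table 3: number of finite zeros of f
                if not n >= k3: continue
                # Table 4: number of finite zeros of f'
                needed = (2 + k4) - m if n == m else (1 + k4) - m
                if not n >= needed: continue
                ok.append((n, m))
        return ok

    # Example: level off at a nonzero constant, slope dies out, one zero, one extremum.
    print(admissible_degrees('constant', 'zero', k3=1, k4=1))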


5. Process Improvement

1. Introduction
   1. Definition of experimental design
   2. Uses
   3. Steps

2. Assumptions
   1. Measurement system capable
   2. Process stable
   3. Simple model
   4. Residuals well-behaved

3. Choosing an Experimental Design
   1. Set objectives
   2. Select process variables and levels
   3. Select experimental design
      1. Completely randomized designs
      2. Randomized block designs
      3. Full factorial designs
      4. Fractional factorial designs
      5. Plackett-Burman designs
      6. Response surface designs
      7. Adding center point runs
      8. Improving fractional design resolution
      9. Three-level full factorial designs
      10. Three-level, mixed-level and fractional factorial designs

4. Analysis of DOE Data
   1. DOE analysis steps
   2. Plotting DOE data
   3. Modeling DOE data
   4. Testing and revising DOE models
   5. Interpreting DOE results
   6. Confirming DOE results
   7. DOE examples
      1. Full factorial example
      2. Fractional factorial example
      3. Response surface example

5. Advanced Topics
   1. When classical designs don't work
   2. Computer-aided designs
      1. D-Optimal designs
      2. Repairing a design
   3. Optimizing a process
      1. Single response case
      2. Multiple response case
   4. Mixture designs
      1. Mixture screening designs
      2. Simplex-lattice designs
      3. Simplex-centroid designs
      4. Constrained mixture designs
      5. Treating mixture and process variables together
   5. Nested variation
   6. Taguchi designs
   7. John's 3/4 fractional factorial designs
   8. Small composite designs
   9. An EDA approach to experiment design

6. Case Studies
   1. Eddy current probe sensitivity study
   2. Sonoluminescent light intensity study

7. A Glossary of DOE Terminology

8. References


5. Process Improvement - Detailed Table of Contents [5.]

1. Introduction [5.1.]
   1. What is experimental design? [5.1.1.]
   2. What are the uses of DOE? [5.1.2.]
   3. What are the steps of DOE? [5.1.3.]
2. Assumptions [5.2.]
   1. Is the measurement system capable? [5.2.1.]
   2. Is the process stable? [5.2.2.]
   3. Is there a simple model? [5.2.3.]
   4. Are the model residuals well-behaved? [5.2.4.]
3. Choosing an experimental design [5.3.]
   1. What are the objectives? [5.3.1.]
   2. How do you select and scale the process variables? [5.3.2.]
   3. How do you select an experimental design? [5.3.3.]
      1. Completely randomized designs [5.3.3.1.]
      2. Randomized block designs [5.3.3.2.]
         1. Latin square and related designs [5.3.3.2.1.]
         2. Graeco-Latin square designs [5.3.3.2.2.]
         3. Hyper-Graeco-Latin square designs [5.3.3.2.3.]
      3. Full factorial designs [5.3.3.3.]
         1. Two-level full factorial designs [5.3.3.3.1.]
         2. Full factorial example [5.3.3.3.2.]
         3. Blocking of full factorial designs [5.3.3.3.3.]
      4. Fractional factorial designs [5.3.3.4.]
         1. A 2^(3-1) design (half of a 2^3) [5.3.3.4.1.]
         2. Constructing the 2^(3-1) half-fraction design [5.3.3.4.2.]
         3. Confounding (also called aliasing) [5.3.3.4.3.]
         4. Fractional factorial design specifications and design resolution [5.3.3.4.4.]
         5. Use of fractional factorial designs [5.3.3.4.5.]
         6. Screening designs [5.3.3.4.6.]
         7. Summary tables of useful fractional factorial designs [5.3.3.4.7.]
      5. Plackett-Burman designs [5.3.3.5.]
      6. Response surface designs [5.3.3.6.]
         1. Central Composite Designs (CCD) [5.3.3.6.1.]
         2. Box-Behnken designs [5.3.3.6.2.]
         3. Comparisons of response surface designs [5.3.3.6.3.]
         4. Blocking a response surface design [5.3.3.6.4.]
      7. Adding centerpoints [5.3.3.7.]
      8. Improving fractional factorial design resolution [5.3.3.8.]
         1. Mirror-Image foldover designs [5.3.3.8.1.]
         2. Alternative foldover designs [5.3.3.8.2.]
      9. Three-level full factorial designs [5.3.3.9.]
      10. Three-level, mixed-level and fractional factorial designs [5.3.3.10.]
4. Analysis of DOE data [5.4.]
   1. What are the steps in a DOE analysis? [5.4.1.]
   2. How to "look" at DOE data [5.4.2.]
   3. How to model DOE data [5.4.3.]
   4. How to test and revise DOE models [5.4.4.]
   5. How to interpret DOE results [5.4.5.]
   6. How to confirm DOE results (confirmatory runs) [5.4.6.]
   7. Examples of DOE's [5.4.7.]
      1. Full factorial example [5.4.7.1.]
      2. Fractional factorial example [5.4.7.2.]
      3. Response surface model example [5.4.7.3.]


5. Advanced topics [5.5.]
   1. What if classical designs don't work? [5.5.1.]
   2. What is a computer-aided design? [5.5.2.]
      1. D-Optimal designs [5.5.2.1.]
      2. Repairing a design [5.5.2.2.]
   3. How do you optimize a process? [5.5.3.]
      1. Single response case [5.5.3.1.]
         1. Single response: Path of steepest ascent [5.5.3.1.1.]
         2. Single response: Confidence region for search path [5.5.3.1.2.]
         3. Single response: Choosing the step length [5.5.3.1.3.]
         4. Single response: Optimization when there is adequate quadratic fit [5.5.3.1.4.]
         5. Single response: Effect of sampling error on optimal solution [5.5.3.1.5.]
         6. Single response: Optimization subject to experimental region constraints [5.5.3.1.6.]
      2. Multiple response case [5.5.3.2.]
         1. Multiple responses: Path of steepest ascent [5.5.3.2.1.]
         2. Multiple responses: The desirability approach [5.5.3.2.2.]
         3. Multiple responses: The mathematical programming approach [5.5.3.2.3.]
   4. What is a mixture design? [5.5.4.]
      1. Mixture screening designs [5.5.4.1.]
      2. Simplex-lattice designs [5.5.4.2.]
      3. Simplex-centroid designs [5.5.4.3.]
      4. Constrained mixture designs [5.5.4.4.]
      5. Treating mixture and process variables together [5.5.4.5.]
   5. How can I account for nested variation (restricted randomization)? [5.5.5.]
   6. What are Taguchi designs? [5.5.6.]
   7. What are John's 3/4 fractional factorial designs? [5.5.7.]
   8. What are small composite designs? [5.5.8.]
   9. An EDA approach to experimental design [5.5.9.]
      1. Ordered data plot [5.5.9.1.]
      2. Dex scatter plot [5.5.9.2.]
      3. Dex mean plot [5.5.9.3.]
      4. Interaction effects matrix plot [5.5.9.4.]
      5. Block plot [5.5.9.5.]
      6. Dex Youden plot [5.5.9.6.]
      7. |Effects| plot [5.5.9.7.]
         1. Statistical significance [5.5.9.7.1.]
         2. Engineering significance [5.5.9.7.2.]
         3. Numerical significance [5.5.9.7.3.]
         4. Pattern significance [5.5.9.7.4.]
      8. Half-normal probability plot [5.5.9.8.]
      9. Cumulative residual standard deviation plot [5.5.9.9.]
         1. Motivation: What is a Model? [5.5.9.9.1.]
         2. Motivation: How do we Construct a Goodness-of-fit Metric for a Model? [5.5.9.9.2.]
         3. Motivation: How do we Construct a Good Model? [5.5.9.9.3.]
         4. Motivation: How do we Know When to Stop Adding Terms? [5.5.9.9.4.]
         5. Motivation: What is the Form of the Model? [5.5.9.9.5.]
         6. Motivation: Why is the 1/2 in the Model? [5.5.9.9.6.]
         7. Motivation: What are the Advantages of the LinearCombinatoric Model? [5.5.9.9.7.]
         8. Motivation: How do we use the Model to Generate Predicted Values? [5.5.9.9.8.]
         9. Motivation: How do we Use the Model Beyond the Data Domain? [5.5.9.9.9.]
         10. Motivation: What is the Best Confirmation Point for Interpolation? [5.5.9.9.10.]
         11. Motivation: How do we Use the Model for Interpolation? [5.5.9.9.11.]
         12. Motivation: How do we Use the Model for Extrapolation? [5.5.9.9.12.]
      10. DEX contour plot [5.5.9.10.]
         1. How to Interpret: Axes [5.5.9.10.1.]
         2. How to Interpret: Contour Curves [5.5.9.10.2.]
         3. How to Interpret: Optimal Response Value [5.5.9.10.3.]
         4. How to Interpret: Best Corner [5.5.9.10.4.]


5. How to Interpret: Steepest Ascent/Descent [5.5.9.10.5.]


6. How to Interpret: Optimal Curve [5.5.9.10.6.]
7. How to Interpret: Optimal Setting [5.5.9.10.7.]

6. Case Studies [5.6.]


1. Eddy Current Probe Sensitivity Case Study [5.6.1.]
1. Background and Data [5.6.1.1.]
2. Initial Plots/Main Effects [5.6.1.2.]
3. Interaction Effects [5.6.1.3.]
4. Main and Interaction Effects: Block Plots [5.6.1.4.]
5. Estimate Main and Interaction Effects [5.6.1.5.]
6. Modeling and Prediction Equations [5.6.1.6.]
7. Intermediate Conclusions [5.6.1.7.]
8. Important Factors and Parsimonious Prediction [5.6.1.8.]
9. Validate the Fitted Model [5.6.1.9.]
10. Using the Fitted Model [5.6.1.10.]
11. Conclusions and Next Step [5.6.1.11.]
12. Work This Example Yourself [5.6.1.12.]
2. Sonoluminescent Light Intensity Case Study [5.6.2.]
1. Background and Data [5.6.2.1.]
2. Initial Plots/Main Effects [5.6.2.2.]
3. Interaction Effects [5.6.2.3.]
4. Main and Interaction Effects: Block Plots [5.6.2.4.]
5. Important Factors: Youden Plot [5.6.2.5.]
6. Important Factors: |Effects| Plot [5.6.2.6.]
7. Important Factors: Half-Normal Probability Plot [5.6.2.7.]
8. Cumulative Residual Standard Deviation Plot [5.6.2.8.]
9. Next Step: Dex Contour Plot [5.6.2.9.]
10. Summary of Conclusions [5.6.2.10.]
11. Work This Example Yourself [5.6.2.11.]

7. A Glossary of DOE Terminology [5.7.]

8. References [5.8.]



5. Process Improvement

5.1. Introduction

This section describes the basic concepts of the Design of Experiments (DOE or DEX).

This section introduces the basic concepts, terminology, goals and procedures underlying the
proper statistical design of experiments. Design of experiments is abbreviated as DOE throughout
this chapter (an alternate abbreviation, DEX, is used in DATAPLOT).

Topics covered are:
● What is experimental design or DOE?
● What are the goals or uses of DOE?
● What are the steps in DOE?


5. Process Improvement
5.1. Introduction

5.1.1. What is experimental design?

Experimental Design (or DOE) economically maximizes information.

In an experiment, we deliberately change one or more process variables (or factors) in order to
observe the effect the changes have on one or more response variables. The (statistical) design of
experiments (DOE) is an efficient procedure for planning experiments so that the data obtained can
be analyzed to yield valid and objective conclusions.

DOE begins with determining the objectives of an experiment and selecting the process factors for
the study. An Experimental Design is the laying out of a detailed experimental plan in advance of
doing the experiment. Well chosen experimental designs maximize the amount of "information" that
can be obtained for a given amount of experimental effort.

The statistical theory underlying DOE generally begins with the concept of process models.

Process Models for DOE

Black box process model: It is common to begin with a process model of the `black box' type, with
several discrete or continuous input factors that can be controlled--that is, varied at will by
the experimenter--and one or more measured output responses. The output responses are assumed
continuous. Experimental data are used to derive an empirical (approximation) model linking the
outputs and inputs. These empirical models generally contain first and second-order terms.

Often the experiment has to account for a number of uncontrolled factors that may be discrete,
such as different machines or operators, and/or continuous such as ambient temperature or
humidity. Figure 1.1 illustrates this situation.


Schematic for a typical process with controlled inputs, outputs, discrete uncontrolled factors and
continuous uncontrolled factors:

FIGURE 1.1 A `Black Box' Process Model Schematic

Models for DOE's: The most common empirical models fit to the experimental data take either a
linear form or quadratic form.

Linear model: A linear model with two factors, X1 and X2, can be written as

    Y = β0 + β1*X1 + β2*X2 + β12*X1*X2 + experimental error

Here, Y is the response for given levels of the main effects X1 and X2 and the X1*X2 term is
included to account for a possible interaction effect between X1 and X2. The constant β0 is the
response of Y when both main effects are 0.

For a more complicated example, a linear model with three factors X1, X2, X3 and one response, Y,
would look like (if all possible terms were included in the model)

    Y = β0 + β1*X1 + β2*X2 + β3*X3
           + β12*X1*X2 + β13*X1*X3 + β23*X2*X3
           + β123*X1*X2*X3 + experimental error

The three terms with single "X's" are the main effects terms. There are k(k-1)/2 = 3*2/2 = 3
two-way interaction terms and 1 three-way interaction term (which is often omitted, for
simplicity). When the experimental data are analyzed, all the unknown "β" parameters are estimated
and the coefficients of the "X" terms are tested to see which ones are significantly different
from 0.

Quadratic model: A second-order (quadratic) model (typically used in response surface DOE's with
suspected curvature) does not include the three-way interaction term but adds three more terms to
the linear model, namely

    β11*X1*X1 + β22*X2*X2 + β33*X3*X3.

Note: Clearly, a full model could include many cross-product (or interaction) terms involving
squared X's. However, in general these terms are not needed and most DOE software defaults to
leaving them out of the model.
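As a hedged illustration, not part of the handbook, of how the β parameters of the two-factor
linear model might be estimated, the sketch below fits the model to a small 2-level factorial by
ordinary least squares with numpy; the four response values are invented.

    import numpy as np

    # A 2x2 factorial in coded units, with made-up response values.
    X1 = np.array([-1.0, +1.0, -1.0, +1.0])
    X2 = np.array([-1.0, -1.0, +1.0, +1.0])
    Y  = np.array([52.1, 60.3, 55.4, 71.0])

    # Design matrix for  Y = b0 + b1*X1 + b2*X2 + b12*X1*X2 + error
    A = np.column_stack([np.ones_like(X1), X1, X2, X1 * X2])

    # Least-squares estimates of the betas
    beta, *_ = np.linalg.lstsq(A, Y, rcond=None)
    for name, b in zip(["b0", "b1", "b2", "b12"], beta):
        print(f"{name:>3} = {b:7.3f}")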


5. Process Improvement
5.1. Introduction

5.1.2. What are the uses of DOE?

DOE is a multipurpose tool that can help in many situations.

Below are seven examples illustrating situations in which experimental design can be used
effectively:
● Choosing Between Alternatives
● Selecting the Key Factors Affecting a Response
● Response Surface Modeling to:
  ❍ Hit a Target
  ❍ Reduce Variability
  ❍ Maximize or Minimize a Response
  ❍ Make a Process Robust (i.e., the process gets the "right" results even though there are
    uncontrollable "noise" factors)
  ❍ Seek Multiple Goals
● Regression Modeling

Choosing Between Alternatives (Comparative Experiment)

A common use is planning an experiment to gather data to make a decision between two or more
alternatives.

Supplier A vs. supplier B? Which new additive is the most effective? Is catalyst `x' an
improvement over the existing catalyst? These and countless other choices between alternatives can
be presented to us in a never-ending parade. Often we have the choice made for us by outside
factors over which we have no control. But in many cases we are also asked to make the choice. It
helps if one has valid data to back up one's decision.

The preferred solution is to agree on a measurement by which competing choices can be compared,
generate a sample of data from each alternative, and compare average results. The 'best' average
outcome will be our preference. We have performed a comparative experiment!

Types of comparative studies: Sometimes this comparison is performed under one common set of
conditions. This is a comparative study with a narrow scope - which is suitable for some initial
comparisons of possible alternatives. Other comparison studies, intended to validate that one
alternative is preferred over a wide range of conditions, will purposely and systematically vary
the background conditions under which the primary comparison is made in order to reach a
conclusion that will be proven valid over a broad scope. We discuss experimental designs for each
of these types of comparisons in Sections 5.3.3.1 and 5.3.3.2.

Selecting the Key Factors Affecting a Response (Screening Experiments)

Selecting the few that matter from the many possible factors: Often there are many possible
factors, some of which may be critical and others which may have little or no effect on a
response. It may be desirable, as a goal by itself, to reduce the number of factors to a
relatively small set (2-5) so that attention can be focused on controlling those factors with
appropriate specifications, control charts, etc.

Screening experiments are an efficient way, with a minimal number of runs, of determining the
important factors. They may also be used as a first step when the ultimate goal is to model a
response with a response surface. We will discuss experimental designs for screening a large
number of factors in Sections 5.3.3.3, 5.3.3.4 and 5.3.3.5.

Response Surface Modeling a Process

Some reasons to model a process: Once one knows the primary variables (factors) that affect the
responses of interest, a number of additional objectives may be pursued. These include:
● Hitting a Target
● Maximizing or Minimizing a Response
● Reducing Variation
● Making a Process Robust
● Seeking Multiple Goals
What each of these purposes has in common is that experimentation is used to fit a model that may
permit a rough, local approximation to the actual surface. Given that the particular objective can
be met with such an approximate model, the experimental effort is kept to a minimum while still
achieving the immediate goal.

These response surface modeling objectives will now be briefly expanded upon.

Hitting a Target

Often we want to "fine tune" a process to hit a target: This is a frequently encountered goal for
an experiment. One might try out different settings until the desired target is `hit'
consistently. For example, a machine tool that has been recently overhauled may require some setup
`tweaking' before it runs consistently on target. Such action is a small and common form of
experimentation. However, rather than experimenting in an ad hoc manner until we happen to find a
setup that hits the target, one can fit a model estimated from a small experiment and use this
model to determine the necessary adjustments to hit the target.

More complex forms of experimentation, such as the determination of the correct chemical mix of a
coating that will yield a desired refractive index for the dried coat (and simultaneously achieve
specifications for other attributes), may involve many ingredients and be very sensitive to small
changes in the percentages in the mix. Fitting suitable models, based on sequentially planned
experiments, may be the only way to efficiently achieve this goal of hitting targets for multiple
responses simultaneously.

Maximizing or Minimizing a Response


Optimizing a Many processes are being run at sub-optimal settings, some of them for years, even though each
process factor has been optimized individually over time. Finding settings that increase yield or decrease
output is a the amount of scrap and rework represent opportunities for substantial financial gain. Often,
common however, one must experiment with multiple inputs to achieve a better output. Section 5.3.3.6 on
goal second-order designs plus material in Section 5.5.3 will be useful for these applications.

FIGURE 1.1 Pathway up the process response surface to an `optimum'
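One common way to pursue the maximizing or minimizing objective is to move along the path of
steepest ascent implied by a fitted first-order model. The sketch below is not part of the
handbook; the coefficient values, the step size, and the use of coded (-1, +1) units are all
illustrative assumptions.

    import numpy as np

    # Estimated main-effect coefficients from a fitted first-order model (assumed values).
    b = np.array([2.4, -1.1, 0.6])        # b1, b2, b3 in coded units

    # The direction of steepest ascent is along the coefficient vector.
    direction = b / np.linalg.norm(b)

    # Take a few equal steps along the path, starting from the design center (0, 0, 0).
    for step in range(1, 4):
        point = step * 0.5 * direction     # step size 0.5 in coded units (a choice)
        print(f"step {step}: x = {np.round(point, 3)}")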

Reducing Variation

Processes A process may be performing with unacceptable consistency, meaning its internal variation is too
that are on high.
target, on
the average, Excessive variation can result from many causes. Sometimes it is due to the lack of having or It might be possible to reduce the variation by altering the setpoints (recipe) of the process, so that
may still following standard operating procedures. At other times, excessive variation is due to certain it runs in a more `stable' region.
have too hard-to-control inputs that affect the critical output characteristics of the process. When this latter
much situation is the case, one may experiment with these hard-to-control factors, looking for a region Graph of
variability where the surface is flatter and the process is easier to manage. To take advantage of such flatness data after
in the surface, one must use designs - such as the second-order designs of Section 5.3.3.6 - that process
permit identification of these features. Contour or surface plots are useful for elucidating the key variation
features of these fitted models. See also 5.5.3.1.4. reduced

Graph of
data before
variation
reduced


Sometimes A product or process seldom has just one desirable output characteristic. There are usually
we have several, and they are often interrelated so that improving one will cause a deterioration of another.
multiple For example: rate vs. consistency; strength vs. expense; etc.
outputs and
we have to Any product is a trade-off between these various desirable final characteristics. Understanding the
compromise boundaries of the trade-off allows one to make the correct choices. This is done by either
to achieve constructing some weighted objective function (`desirability function') and optimizing it, or
desirable examining contour plots of responses generated by a computer program, as given below.
outcomes -
DOE can
help here
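A rough sketch, not part of the handbook, of the weighted `desirability function' idea mentioned
above: each response is rescaled to a 0-1 desirability and the individual desirabilities are
combined into one overall score with a geometric mean. The ramp limits and response values below
are invented.

    def desirability_larger_is_better(y, low, high):
        """0 below 'low', 1 above 'high', linear in between."""
        if y <= low:
            return 0.0
        if y >= high:
            return 1.0
        return (y - low) / (high - low)

    def overall_desirability(ds):
        """Geometric mean of the individual desirabilities."""
        prod = 1.0
        for d in ds:
            prod *= d
        return prod ** (1.0 / len(ds))

    # Invented example: deposition rate and capability (Cp) at one candidate setting.
    d_rate = desirability_larger_is_better(y=420.0, low=350.0, high=500.0)
    d_cp   = desirability_larger_is_better(y=1.45,  low=1.00,  high=2.00)
    print(round(overall_desirability([d_rate, d_cp]), 3))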

Sample
contour plot
of deposition
rate and
capability

Finding this new recipe could be the subject of an experiment, especially if there are many input
factors that could conceivably affect the output.

Making a Process Robust

The less a An item designed and made under controlled conditions will be later `field tested' in the hands of
process or the customer and may prove susceptible to failure modes not seen in the lab or thought of by
product is design. An example would be the starter motor of an automobile that is required to operate under FIGURE 1.4 Overlaid contour plot of Deposition Rate and Capability (Cp)
affected by extremes of external temperature. A starter that performs under such a wide range is termed
external `robust' to temperature. Regression Modeling
conditions,
the better it Designing an item so that it is robust calls for a special experimental effort. It is possible to stress Regression Sometimes we require more than a rough approximating model over a local region. In such cases,
is - this is the item in the design lab and so determine the critical components affecting its performance. A models the standard designs presented in this chapter for estimating first- or second-order polynomial
called different gauge of armature wire might be a solution to the starter motor, but so might be many (Chapter 4) models may not suffice. Chapter 4 covers the topic of experimental design and analysis for fitting
"Robustness" other alternatives. The correct combination of factors can be found only by experimentation.
are used to general models for a single explanatory factor. If one has multiple factors, and either a nonlinear
fit more model or some other special model, the computer-aided designs of Section 5.5.2 may be useful.
Seeking Multiple Goals precise
models


Planning to It is often a mistake to believe that `one big experiment will give the
do a sequence answer.'
of small
experiments is A more useful approach to experimental design is to recognize that
5. Process Improvement while one experiment might provide a useful result, it is more
5.1. Introduction often better
than relying common to perform two or three, or maybe more, experiments before
on one big a complete answer is attained. In other words, an iterative approach is
best and, in the end, most economical. Putting all one's eggs in one
5.1.3. What are the steps of DOE? experiment to
basket is not advisable.
give you all
the answers
Key steps for Obtaining good results from a DOE involves these seven steps:
DOE 1. Set objectives Each stage The reason an iterative approach frequently works best is because it is
2. Select process variables provides logical to move through stages of experimentation, each stage
3. Select an experimental design insight for providing insight as to how the next experiment should be run.
next stage
4. Execute the design
5. Check that the data are consistent with the experimental
assumptions
6. Analyze and interpret the results
7. Use/present the results (may lead to further runs or DOE's).

A checklist of Important practical considerations in planning and running


practical experiments are
considerations ● Check performance of gauges/measurement devices first.

● Keep the experiment as simple as possible.

● Check that all planned runs are feasible.

● Watch out for process drifts and shifts during the run.

● Avoid unplanned changes (e.g., swap operators at halfway


point).
● Allow some time (and back-up material) for unexpected events.

● Obtain buy-in from all parties involved.

● Maintain effective ownership of each step in the experimental


plan.
● Preserve all the raw data--do not keep only summary averages!

● Record everything that happens.

● Reset equipment to its original state after the experiment.

The Sequential or Iterative Approach to DOE


5. Process Improvement 5. Process Improvement


5.2. Assumptions

5.2. Assumptions
5.2.1. Is the measurement system capable?
We should In all model building we make assumptions, and we also require
check the certain conditions to be approximately met for purposes of estimation. Metrology It is unhelpful to find, after you have finished all the experimental
engineering This section looks at some of the engineering and mathematical capabilities runs, that the measurement devices you have at your disposal cannot
and assumptions we typically make. These are: are a key measure the changes you were hoping to see. Plan to check this out
model-building ● Are the measurement systems capable for all of your
factor in most before embarking on the experiment itself. Measurement process
assumptions responses? experiments characterization is covered in Chapter 2.
that are made
in most DOE's ● Is your process stable? SPC check of In addition, it is advisable, especially if the experimental material is
● Are your responses likely to be approximated well by simple measurement planned to arrive for measurement over a protracted period, that an
polynomial models? devices SPC (i.e., quality control) check is kept on all measurement devices
● Are the residuals (the difference between the model predictions from the start to the conclusion of the whole experimental project.
Strange experimental outcomes can often be traced to `hiccups' in the
and the actual observations) well behaved?
metrology system.


5. Process Improvement 5. Process Improvement


5.2. Assumptions 5.2. Assumptions

5.2.2. Is the process stable? 5.2.3. Is there a simple model?


Plan to Experimental runs should have control runs that are made at the Polynomial In this chapter we restrict ourselves to the case for which the response
examine `standard' process setpoints, or at least at some standard operating approximation variable(s) are continuous outputs denoted as Y. Over the experimental
process recipe. The experiment should start and end with such runs. A plot of models only range, the outputs must not only be continuous, but also reasonably
stability as the outcomes of these control runs will indicate if the underlying process work for smooth. A sharp falloff in Y values is likely to be missed by the
part of your itself has drifted or shifted during the experiment. smoothly approximating polynomials that we use because these polynomials
experiment varying assume a smoothly curving underlying response surface.
It is desirable to experiment on a stable process. However, if this cannot outputs
be achieved, then the process instability must be accounted for in the
analysis of the experiment. For example, if the mean is shifting with
Piecewise If the surface under investigation is known to be only piecewise
time (or experimental trial run), then it will be necessary to include a
smoothness smooth, then the experiments will have to be broken up into separate
trend term in the experimental model (i.e., include a time variable or a
requires experiments, each investigating the shape of the separate sections. A
run number variable).
separate surface that is known to be very jagged (i.e., non-smooth) will not be
experiments successfully approximated by a smooth polynomial.

Examples of
piecewise
smooth and
jagged
responses

Piecewise Smooth Jagged


FIGURE 2.1 Examples of Piecewise
Smooth and Jagged Responses


Histogram

5. Process Improvement
5.2. Assumptions

5.2.4. Are the model residuals well-behaved?


Residuals are Residuals are estimates of experimental error obtained by subtracting the predicted responses
the from the observed responses.
differences
between the The predicted response is calculated from the chosen model, after all the unknown model
observed and parameters have been estimated from the experimental data.
predicted Examining residuals is a key part of all statistical modeling, including DOE's. Carefully looking
responses at residuals can tell us whether our assumptions are reasonable and our choice of model is
appropriate.

Residuals are Residuals can be thought of as elements of variation unexplained by the fitted model. Since this is
elements of a form of error, the same general assumptions apply to the group of residuals that we typically use
variation for errors in general: one expects them to be (roughly) normal and (approximately) independently
unexplained distributed with a mean of 0 and some constant variance.
by fitted
model

Assumptions These are the assumptions behind ANOVA and classical regression analysis. This means that an
for residuals analyst should expect a regression model to err in predicting a response in a random fashion; the The histogram is a frequency plot obtained by placing the data in regularly spaced cells and
model should predict values higher than actual and lower than actual with equal probability. In plotting each cell frequency versus the center of the cell. Figure 2.2 illustrates an approximately
addition, the level of the error should be independent of when the observation occurred in the normal distribution of residuals produced by a model for a calibration process. We have
study, or the size of the observation being predicted, or even the factor settings involved in superimposed a normal density function on the histogram.
making the prediction. The overall pattern of the residuals should be similar to the bell-shaped
pattern observed when plotting a histogram of normally distributed data. Small sample Sample sizes of residuals are generally small (<50) because experiments have limited treatment
We emphasize the use of graphical methods to examine residuals. sizes combinations, so a histogram is not be the best choice for judging the distribution of residuals. A
more sensitive graph is the normal probability plot.
Departures Departures from these assumptions usually mean that the residuals contain structure that is not
indicate accounted for in the model. Identifying that structure and adding term(s) representing it to the Normal The steps in forming a normal probability plot are:
inadequate original model leads to a better model. probability ● Sort the residuals into ascending order.
model plot
● Calculate the cumulative probability of each residual using the formula:

P(i-th residual) = i/(N+1)


Tests for Residual Normality
with P denoting the cumulative probability of a point, i is the order of the value in the list
Plots for Any graph suitable for displaying the distribution of a set of data is suitable for judging the and N is the number of entries in the list.
examining normality of the distribution of a group of residuals. The three most common types are: ● Plot the calculated p-values versus the residual value on normal probability paper.
residuals 1. histograms, The normal probability plot should produce an approximately straight line if the points come
2. normal probability plots, and from a normal distribution.
3. dot plots.
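The normal probability plot described in the steps above can be computed directly from the
plotting positions P(i) = i/(N+1). The following sketch is not part of the handbook; it assumes
scipy is available for the normal quantiles and uses invented residuals.

    import numpy as np
    from scipy.stats import norm

    residuals = np.array([0.12, -0.40, 0.05, 0.33, -0.21, 0.02, -0.15, 0.27])  # made-up

    # Sort into ascending order and compute cumulative probabilities P(i) = i / (N + 1).
    r_sorted = np.sort(residuals)
    n = len(r_sorted)
    p = np.arange(1, n + 1) / (n + 1.0)

    # Theoretical normal quantiles corresponding to the plotting positions.
    q = norm.ppf(p)

    for qi, ri in zip(q, r_sorted):
        print(f"{qi:7.3f}   {ri:7.2f}")

    # Plotting r_sorted against q (e.g., with matplotlib) should give an
    # approximately straight line if the residuals are roughly normal.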


Sample Figure 2.3 below illustrates the normal probability graph created from the same group of residuals Sample run
normal used for Figure 2.2. sequence plot
probability that exhibits
plot with a time trend
overlaid dot
plot

Sample run
sequence plot
This graph includes the addition of a dot plot. The dot plot is the collection of points along the left that does not
y-axis. These are the values of the residuals. The purpose of the dot plot is to provide an exhibit a time
indication of the distribution of the residuals. trend

"S" shaped Small departures from the straight line in the normal probability plot are common, but a clearly
curves "S" shaped curve on this graph suggests a bimodal distribution of residuals. Breaks near the
indicate middle of this graph are also indications of abnormalities in the residual distribution.
bimodal
distribution NOTE: Studentized residuals are residuals converted to a scale approximately representing the
standard deviation of an individual residual from the center of the residual distribution. The
technique used to convert residuals to this form produces a Student's t distribution of values.

Independence of Residuals Over Time

Run sequence If the order of the observations in a data table represents the order of execution of each treatment
plot combination, then a plot of the residuals of those observations versus the case order or time order
of the observations will test for any time dependency. These are referred to as run sequence plots.
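A minimal sketch, not part of the handbook, of such a run sequence plot; matplotlib is assumed
available, and the residuals are invented and listed in execution order.

    import matplotlib.pyplot as plt

    # Residuals listed in the order the runs were executed (invented values).
    residuals = [0.8, 0.5, 0.6, 0.1, 0.2, -0.1, -0.3, -0.4, -0.6, -0.9]
    run_order = range(1, len(residuals) + 1)

    plt.plot(run_order, residuals, marker="o", linestyle="-")
    plt.axhline(0.0, color="gray", linewidth=1)
    plt.xlabel("Run order")
    plt.ylabel("Residual")
    plt.title("Run sequence plot of residuals")
    plt.show()

    # A steady drift like the one in these invented values would suggest a time trend.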


Sample
residuals
versus fitted
values plot
showing
increasing
residuals

Interpretation The residuals in Figure 2.4 suggest a time trend, while those in Figure 2.5 do not. Figure 2.4
of the sample suggests that the system was drifting slowly to lower values as the investigation continued. In Sample
run sequence extreme cases a drift of the equipment will produce models with very poor ability to account for residuals
plots the variability in the data (low R2). versus fitted
values plot
If the investigation includes centerpoints, then plotting them in time order may produce a more
that does not
clear indication of a time trend if one exists. Plotting the raw responses in time sequence can also
show
sometimes detect trend changes in a process that residual plots might not detect.
increasing
residuals
Plot of Residuals Versus Corresponding Predicted Values

Check for Plotting residuals versus the value of a fitted response should produce a distribution of points
increasing scattered randomly about 0, regardless of the size of the fitted value. Quite commonly, however,
residuals as residual values may increase as the size of the fitted value increases. When this happens, the
size of fitted residual cloud becomes "funnel shaped" with the larger end toward larger fitted values; that is, the
value residuals have larger and larger scatter as the value of the response increases. Plotting the
increases absolute values of the residuals instead of the signed values will produce a "wedge-shaped"
distribution; a smoothing function is added to each graph which helps to show the trend.
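A minimal sketch, not part of the handbook, of the two plots discussed here, residuals versus
fitted values and absolute residuals versus fitted values; numpy and matplotlib are assumed
available, and the data are invented with scatter that grows with the fitted value.

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    fitted = np.linspace(10, 100, 40)
    residuals = rng.normal(0.0, 0.05 * fitted)   # scatter grows with the fitted value

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
    ax1.scatter(fitted, residuals)               # funnel-shaped cloud
    ax1.axhline(0.0, color="gray", linewidth=1)
    ax1.set_xlabel("Fitted value"); ax1.set_ylabel("Residual")

    ax2.scatter(fitted, np.abs(residuals))       # wedge-shaped when variance increases
    ax2.set_xlabel("Fitted value"); ax2.set_ylabel("|Residual|")
    plt.tight_layout()
    plt.show()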


Interpretation A residual distribution such as that in Figure 2.6 showing a trend to higher absolute residuals as Sample
of the the value of the response increases suggests that one should transform the response, perhaps by residuals
residuals modeling its logarithm or square root, etc., (contractive transformations). Transforming a versus factor
versus fitted response in this fashion often simplifies its relationship with a predictor variable and leads to setting plot
values plots simpler models. Later sections discuss transformation in more detail. Figure 2.7 plots the after adding
residuals after a transformation on the response variable was used to reduce the scatter. Notice the a quadratic
difference in scales on the vertical axes. term

Independence of Residuals from Factor Settings

Sample
residuals
versus factor
setting plot


Interpretation Figure 2.8 shows that the size of the residuals changed as a function of a predictor's settings. A Interpretation The example given in Figures 2.8 and 2.9 obviously involves five levels of the predictor. The
of residuals graph like this suggests that the model needs a higher-order term in that predictor or that one of plot experiment utilized a response surface design. For the simple factorial design that includes center
versus factor should transform the predictor using a logarithm or square root, for example. Figure 2.9 shows points, if the response model being considered lacked one or more higher-order terms, the plot of
setting plots the residuals for the same response after adding a quadratic term. Notice the single point widely residuals versus factor settings might appear as in Figure 2.10.
separated from the other residuals in Figure 2.9. This point is an "outlier." That is, its position is
well within the range of values used for this predictor in the investigation, but its result was Graph While the graph gives a definite signal that curvature is present, identifying the source of that
somewhat lower than the model predicted. A signal that curvature is present is a trace resembling indicates curvature is not possible due to the structure of the design. Graphs generated using the other
a "frown" or a "smile" in these graphs. prescence of predictors in that situation would have very similar appearances.
curvature
Sample
residuals Additional Note: Residuals are an important subject discussed repeatedly in this Handbook. For example,
versus factor discussion of graphical residual plots using Dataplot are discussed in Chapter 1 and the general examination of
setting plot residual residuals as a part of model building is discussed in Chapter 4.
lacking one analysis
or more
higher-order
terms


2. Box-Behnken designs
3. Response surface design comparisons
4. Blocking a response surface design

5. Process Improvement
7. Adding center points
8. Improving fractional design resolution
1. Mirror-image foldover designs
5.3. Choosing an experimental design
2. Alternative foldover designs
Contents of This section describes in detail the process of choosing an experimental 9. Three-level full factorial designs
Section 3 design to obtain the results you need. The basic designs an engineer 10. Three-level, mixed level and fractional factorial designs
needs to know about are described in detail.

Note that 1. Set objectives


this section 2. Select process variables and levels
describes
the basic 3. Select experimental design
designs used 1. Completely randomized designs
for most
2. Randomized block designs
engineering
and 1. Latin squares
scientific 2. Graeco-Latin squares
applications
3. Hyper-Graeco-Latin squares
3. Full factorial designs
1. Two-level full factorial designs
2. Full factorial example
3. Blocking of full factorial designs
4. Fractional factorial designs
1. A 23-1 half-fraction design
2. How to construct a 23-1 design
3. Confounding
4. Design resolution
5. Use of fractional factorial designs
6. Screening designs
7. Fractional factorial designs summary tables
5. Plackett-Burman designs
6. Response surface (second-order) designs
1. Central composite designs


number of factors, usually a full factorial design, since the basic


purpose and analysis is similar.
● Response Surface modeling to achieve one or more of the
following objectives:
5. Process Improvement ❍ hit a target
5.3. Choosing an experimental design ❍ maximize or minimize a response

❍ reduce variation by locating a region where the process is

5.3.1. What are the objectives? easier to manage


❍ make a process robust (note: this objective may often be
accomplished with screening designs rather than with
Planning an The objectives for an experiment are best determined by a team
response surface designs - see Section 5.5.6)
experiment discussion. All of the objectives should be written down, even the
begins with "unspoken" ones. ● Regression modeling
carefully ❍ to estimate a precise model, quantifying the dependence of
considering The group should discuss which objectives are the key ones, and which
response variable(s) on process inputs.
what the ones are "nice but not really necessary". Prioritization of the objectives
objectives helps you decide which direction to go with regard to the selection of
the factors, responses and the particular design. Sometimes prioritization Based on After identifying the objective listed above that corresponds most
(or goals) objective, closely to your specific goal, you can
are will force you to start over from scratch when you realize that the
experiment you decided to run does not meet one or more critical where to go ● proceed to the next section in which we discuss selecting
objectives. next experimental factors
and then
Types of Examples of goals were given earlier in Section 5.1.2, in which we ● select the appropriate design named in section 5.3.3 that suits
designs described four broad categories of experimental designs, with various your objective (and follow the related links).
objectives for each. These were:
● Comparative designs to:

❍ choose between alternatives, with narrow scope, suitable


for an initial comparison (see Section 5.3.3.1)
❍ choose between alternatives, with broad scope, suitable for
a confirmatory comparison (see Section 5.3.3.2)
● Screening designs to identify which factors/effects are important
❍ when you have 2 - 4 factors and can perform a full factorial
(Section 5.3.3.3)
❍ when you have more than 3 factors and want to begin with
as small a design as possible (Section 5.3.3.4 and 5.3.3.5)
❍ when you have some qualitative factors, or you have some
quantitative factors that are known to have a
non-monotonic effect (Section 3.3.3.10)
Note that some authors prefer to restrict the term screening design
to the case where you are trying to extract the most important
factors from a large (say > 5) list of initial factors (usually a
fractional factorial design). We include the case with a smaller

http://www.itl.nist.gov/div898/handbook/pri/section3/pri31.htm (1 of 2) [11/14/2003 5:53:03 PM] http://www.itl.nist.gov/div898/handbook/pri/section3/pri31.htm (2 of 2) [11/14/2003 5:53:03 PM]


5.3.2. How do you select and scale the process variables?

Guidelines to assist the engineering judgment process of selecting process variables for a DOE: Process variables include both inputs and outputs - i.e., factors and responses. The selection of these variables is best done as a team effort. The team should
● Include all important factors (based on engineering judgment).
● Be bold, but not foolish, in choosing the low and high factor levels.
● Check the factor settings for impractical or impossible combinations - i.e., very low pressure and very high gas flows.
● Include all relevant responses.
● Avoid using only responses that combine two or more measurements of the process. For example, if interested in selectivity (the ratio of two etch rates), measure both rates, not just the ratio.

Be careful when choosing the allowable range for each factor: We have to choose the range of the settings for input factors, and it is wise to give this some thought beforehand rather than just try extreme values. In some cases, extreme values will give runs that are not feasible; in other cases, extreme ranges might move one out of a smooth area of the response surface into some jagged region, or close to an asymptote.

Two-level designs have just a "high" and a "low" setting for each factor: The most popular experimental designs are two-level designs. Why only two levels? There are a number of good reasons why two is the most common choice amongst engineers: one reason is that it is ideal for screening designs, simple and economical; it also gives most of the information required to go to a multilevel response surface experiment if one is needed.

Consider adding some center points to your two-level design: The term "two-level design" is something of a misnomer, however, as it is recommended to include some center points during the experiment (center points are located in the middle of the design `box').

Notation for 2-Level Designs

Matrix notation for describing an experiment: The standard layout for a 2-level design uses +1 and -1 notation to denote the "high level" and the "low level" respectively, for each factor. For example, the matrix below

          Factor 1 (X1)   Factor 2 (X2)
Trial 1        -1              -1
Trial 2        +1              -1
Trial 3        -1              +1
Trial 4        +1              +1

describes an experiment in which 4 trials (or runs) were conducted with each factor set to high or low during a run according to whether the matrix had a +1 or -1 set for the factor during that trial. If the experiment had more than 2 factors, there would be an additional column in the matrix for each additional factor.

Note: Some authors shorten the matrix notation for a two-level design by just recording the plus and minus signs, leaving out the "1's".

Coding the data: The use of +1 and -1 for the factor settings is called coding the data. This aids in the interpretation of the coefficients fit to any experimental model. After factor settings are coded, center points have the value "0". Coding is described in more detail in the DOE glossary.

The Model or Analysis Matrix


Design matrices: If we add an "I" column and an "X1*X2" column to the matrix of 4 trials for a two-factor experiment described earlier, we obtain what is known as the model or analysis matrix for this simple experiment, which is shown below. The model matrix for a three-factor experiment is shown later in this section.

 I   X1   X2   X1*X2
+1   -1   -1    +1
+1   +1   -1    -1
+1   -1   +1    -1
+1   +1   +1    +1

Model for the experiment: The model for this experiment is

   Y = β0 + β1*X1 + β2*X2 + β12*X1*X2 + experimental error

and the "I" column of the design matrix has all 1's to provide for the β0 term. The X1*X2 column is formed by multiplying the "X1" and "X2" columns together, row element by row element. This column gives the interaction term for each trial.

Model in matrix notation: In matrix notation, we can summarize this experiment by

   Y = Xβ + experimental error

for which X is the 4 by 4 design matrix of 1's and -1's shown above, β is the vector of unknown model coefficients and Y is a vector consisting of the four trial response observations.

Orthogonal Property of Scaling in a 2-Factor Experiment

Coding produces orthogonal columns: Coding is sometimes called "orthogonal coding" since all the columns of a coded 2-factor design matrix (except the "I" column) are typically orthogonal. That is, the dot product for any pair of columns is zero. For example, for X1 and X2: (-1)(-1) + (+1)(-1) + (-1)(+1) + (+1)(+1) = 0.
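This orthogonality is easy to verify numerically. The short Python sketch below is an illustration only, not part of the Handbook: it forms the I, X1, X2 and X1*X2 columns of the four-run analysis matrix above and checks that every pair of distinct columns has a zero dot product.

# Minimal check of the orthogonality property of the coded columns
# for the 2-factor, 4-run example above.
x1 = [-1, +1, -1, +1]
x2 = [-1, -1, +1, +1]
columns = {
    "I":     [1, 1, 1, 1],
    "X1":    x1,
    "X2":    x2,
    "X1*X2": [a * b for a, b in zip(x1, x2)],
}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

names = list(columns)
for i, n1 in enumerate(names):
    for n2 in names[i + 1:]:
        print(n1, ".", n2, "=", dot(columns[n1], columns[n2]))   # all zero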
5.3.3. How do you select an experimental design?

A design is selected based on the experimental objective and the number of factors: The choice of an experimental design depends on the objectives of the experiment and the number of factors to be investigated.

Experimental Design Objectives

Types of designs are listed here according to the experimental objective they meet:
● Comparative objective: If you have one or several factors under investigation, but the primary goal of your experiment is to make a conclusion about one a-priori important factor (in the presence of, and/or in spite of the existence of, the other factors), and the question of interest is whether or not that factor is "significant" (i.e., whether or not there is a significant change in the response for different levels of that factor), then you have a comparative problem and you need a comparative design solution.
● Screening objective: The primary purpose of the experiment is to select or screen out the few important main effects from the many less important ones. These screening designs are also termed main effects designs.
● Response Surface (method) objective: The experiment is designed to allow us to estimate interaction and even quadratic effects, and therefore give us an idea of the (local) shape of the response surface we are investigating. For this reason, they are termed response surface method (RSM) designs. RSM designs are used to:
   ❍ Find improved or optimal process settings

   ❍ Troubleshoot process problems and weak points
   ❍ Make a product or process more robust against external and non-controllable influences. "Robust" means relatively insensitive to these influences.
● Optimizing responses when factors are proportions of a mixture objective: If you have factors that are proportions of a mixture and you want to know what the "best" proportions of the factors are so as to maximize (or minimize) a response, then you need a mixture design.
● Optimal fitting of a regression model objective: If you want to model a response as a mathematical function (either known or empirical) of a few continuous factors and you desire "good" model parameter estimates (i.e., unbiased and minimum variance), then you need a regression design.

Mixture and regression designs: Mixture designs are discussed briefly in section 5 (Advanced Topics) and regression designs for a single factor are discussed in chapter 4. Selection of designs for the remaining 3 objectives is summarized in the following table.

Summary table for choosing an experimental design for comparative, screening, and response surface designs:

TABLE 3.1 Design Selection Guideline
Number of Factors   Comparative Objective                    Screening Objective                        Response Surface Objective
1                   1-factor completely randomized design    _                                          _
2 - 4               Randomized block design                  Full or fractional factorial               Central composite or Box-Behnken
5 or more           Randomized block design                  Fractional factorial or Plackett-Burman    Screen first to reduce number of factors

Resources and degree of control over wrong decisions: Choice of a design from within these various types depends on the amount of resources available and the degree of control over making wrong decisions (Type I and Type II errors for testing hypotheses) that the experimenter desires.

Save some runs for center points and "redos" that might be needed: It is a good idea to choose a design that requires somewhat fewer runs than the budget permits, so that center point runs can be added to check for curvature in a 2-level screening design and backup resources are available to redo runs that have processing mishaps.


5.3.3.1. Completely randomized designs

These designs are for studying the effects of one primary factor without the need to take other nuisance factors into account: Here we consider completely randomized designs that have one primary factor. The experiment compares the values of a response variable based on the different levels of that primary factor.

For completely randomized designs, the levels of the primary factor are randomly assigned to the experimental units. By randomization, we mean that the run sequence of the experimental units is determined randomly. For example, if there are 3 levels of the primary factor with each level to be run 2 times, then there are 6 factorial possible run sequences (or 6! ways to order the experimental trials). Because of the replication, the number of unique orderings is 90 (since 90 = 6!/(2!*2!*2!)). An example of an unrandomized design would be to always run 2 replications for the first level, then 2 for the second level, and finally 2 for the third level. To randomize the runs, one way would be to put 6 slips of paper in a box with 2 having level 1, 2 having level 2, and 2 having level 3. Before each run, one of the slips would be drawn blindly from the box and the level selected would be used for the next run of the experiment.

Randomization typically performed by computer software: In practice, the randomization is typically performed by a computer program (in Dataplot, see the Generate Random Run Sequence menu under the main DEX menu). However, the randomization can also be generated from random number tables or by some physical mechanism (e.g., drawing the slips of paper).

Three key numbers: All completely randomized designs with one primary factor are defined by 3 numbers:
   k = number of factors (= 1 for these designs)
   L = number of levels
   n = number of replications
and the total sample size (number of runs) is N = k x L x n.

Balance: Balance dictates that the number of replications be the same at each level of the factor (this will maximize the sensitivity of subsequent statistical t (or F) tests).

Typical example of a completely randomized design: A typical example of a completely randomized design is the following:
   k = 1 factor (X1)
   L = 4 levels of that single factor (called "1", "2", "3", and "4")
   n = 3 replications per level
   N = 4 levels * 3 replications per level = 12 runs

A sample randomized sequence of trials: The randomized sequence of trials might look like:
   X1: 3, 1, 4, 2, 2, 1, 3, 4, 1, 2, 4, 3

Note that in this example there are 12!/(3!*3!*3!*3!) = 369,600 ways to run the experiment, all equally likely to be picked by a randomization procedure.
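Any statistical package (the text above mentions Dataplot's Generate Random Run Sequence menu) can produce such a sequence. The short Python sketch below is an equivalent illustration for the example above; it is not part of the Handbook and its output changes on every execution.

# Sketch: a completely randomized run sequence for one factor
# with L = 4 levels and n = 3 replications per level (N = 12 runs).
import random

levels = [1, 2, 3, 4]
replications = 3

run_sequence = [lvl for lvl in levels for _ in range(replications)]
random.shuffle(run_sequence)     # random order for the 12 runs
print(run_sequence)              # one of the 369,600 equally likely orderings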


Model for a completely randomized design: The model for the response is

   Yi,j = μ + Ti + random error

with
   Yi,j being any observation for which X1 = i
   μ (or mu) is the general location parameter
   Ti is the effect of having treatment level i

Estimates and Statistical Tests

Estimating and testing model factor levels:
   Estimate for μ: ȳ = the average of all the data
   Estimate for Ti: ȳi - ȳ, with ȳi = average of all Y for which X1 = i.

Statistical tests for levels of X1 are shown in the section on one-way ANOVA in Chapter 7.
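These estimates are simple averages, so they are easy to compute directly. The Python sketch below is an illustration only; the response values in it are made-up placeholders, not Handbook data. It computes the grand mean and each treatment effect Ti as the group mean minus the grand mean.

# Sketch: estimates for a one-factor completely randomized design.
# 'data' maps each level of X1 to its observed responses (placeholder values).
data = {
    1: [5.1, 4.9, 5.3],
    2: [6.2, 6.0, 5.8],
    3: [4.4, 4.6, 4.7],
    4: [5.5, 5.9, 5.6],
}

all_y = [y for ys in data.values() for y in ys]
mu_hat = sum(all_y) / len(all_y)                                      # grand mean
effects = {i: sum(ys) / len(ys) - mu_hat for i, ys in data.items()}   # Ti estimates

print("mu-hat =", round(mu_hat, 3))
for i, t in effects.items():
    print(f"T{i} =", round(t, 3))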
5.3.3.2. Randomized block designs
Blocking to For randomized block designs, there is one factor or variable that is of
"remove" the primary interest. However, there are also several other nuisance
effect of factors.
nuisance
factors Nuisance factors are those that may affect the measured result, but are
not of primary interest. For example, in applying a treatment, nuisance
factors might be the specific operator who prepared the treatment, the
time of day the experiment was run, and the room temperature. All
experiments have nuisance factors. The experimenter will typically
need to spend some time deciding which nuisance factors are
important enough to keep track of or control, if possible, during the
experiment.

Blocking used When we can control nuisance factors, an important technique known
for nuisance as blocking can be used to reduce or eliminate the contribution to
factors that experimental error contributed by nuisance factors. The basic concept
can be is to create homogeneous blocks in which the nuisance factors are held
controlled constant and the factor of interest is allowed to vary. Within blocks, it
is possible to assess the effect of different levels of the factor of
interest without having to worry about variations due to changes of the
block factors, which are accounted for in the analysis.

Definition of A nuisance factor is used as a blocking factor if every level of the


blocking primary factor occurs the same number of times with each level of the
factors nuisance factor. The analysis of the experiment will focus on the
effect of varying levels of the primary factor within each block of the
experiment.

Block for a The general rule is:


few of the "Block what you can, randomize what you cannot."
most
Blocking is used to remove the effects of a few of the most important
important
nuisance variables. Randomization is then used to reduce the
nuisance
contaminating effects of the remaining nuisance variables.
factors


Table of randomized block designs: One useful way to look at a randomized block experiment is to consider it as a collection of completely randomized experiments, each run within one of the blocks of the total experiment.

Randomized Block Designs (RBD)
Name of Design    Number of Factors k    Number of Runs n
2-factor RBD      2                      L1 * L2
3-factor RBD      3                      L1 * L2 * L3
4-factor RBD      4                      L1 * L2 * L3 * L4
...               ...                    ...
k-factor RBD      k                      L1 * L2 * ... * Lk

with
   L1 = number of levels (settings) of factor 1
   L2 = number of levels (settings) of factor 2
   L3 = number of levels (settings) of factor 3
   L4 = number of levels (settings) of factor 4
   ...
   Lk = number of levels (settings) of factor k

Example of a Randomized Block Design

Example of a randomized block design: Suppose engineers at a semiconductor manufacturing facility want to test whether different wafer implant material dosages have a significant effect on resistivity measurements after a diffusion process taking place in a furnace. They have four different dosages they want to try and enough experimental wafers from the same lot to run three wafers at each of the dosages.

Furnace run is a nuisance factor: The nuisance factor they are concerned with is "furnace run" since it is known that each furnace run differs from the last and impacts many process parameters.

Ideal would be to eliminate nuisance furnace factor: An ideal way to run this experiment would be to run all the 4x3=12 wafers in the same furnace run. That would eliminate the nuisance furnace factor completely. However, regular production wafers have furnace priority, and only a few experimental wafers are allowed into any furnace run at the same time.

Non-blocked method: A non-blocked way to run this experiment would be to run each of the twelve experimental wafers, in random order, one per furnace run. That would increase the experimental error of each resistivity measurement by the run-to-run furnace variability and make it more difficult to study the effects of the different dosages. The blocked way to run this experiment, assuming you can convince manufacturing to let you put four experimental wafers in a furnace run, would be to put four wafers with different dosages in each of three furnace runs. The only randomization would be choosing which of the three wafers with dosage 1 would go into furnace run 1, and similarly for the wafers with dosages 2, 3 and 4.

Description of the experiment: Let X1 be dosage "level" and X2 be the blocking factor furnace run. Then the experiment can be described as follows:
   k = 2 factors (1 primary factor X1 and 1 blocking factor X2)
   L1 = 4 levels of factor X1
   L2 = 3 levels of factor X2
   n = 1 replication per cell
   N = L1 * L2 = 4 * 3 = 12 runs

Design trial before randomization: Before randomization, the design trials look like:
   X1  X2
   1   1
   1   2
   1   3
   2   1
   2   2
   2   3
   3   1
   3   2
   3   3
   4   1
   4   2
   4   3
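The twelve design trials above are just the full crossing of the dosage levels with the furnace runs. The Python sketch below is an illustration only (not part of the Handbook): it lists the X1-by-X2 cells and then, as one simple possibility, shuffles the order in which the four dosages are handled within each furnace run.

# Sketch: the 4 x 3 randomized block trials from the example above
# (X1 = dosage, X2 = furnace run), with a shuffle within each block.
import itertools
import random

dosages = [1, 2, 3, 4]        # primary factor X1
furnace_runs = [1, 2, 3]      # blocking factor X2

trials = list(itertools.product(dosages, furnace_runs))
print("Design trials (X1, X2):", trials)   # the 12 cells before randomization

for run in furnace_runs:
    order = list(dosages)
    random.shuffle(order)     # randomize within the block
    print(f"furnace run {run}: dosage order {order}")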


Matrix representation: An alternate way of summarizing the design trials would be to use a 4x3 matrix whose 4 rows are the levels of the treatment X1 and whose columns are the 3 levels of the blocking variable X2. The cells in the matrix have indices that match the X1, X2 combinations above.

By extension, note that the trials for any K-factor randomized block design are simply the cell indices of a K dimensional matrix.

Model for a Randomized Block Design

Model for a randomized block design: The model for a randomized block design with one nuisance variable is

   Yi,j = μ + Ti + Bj + random error

where
   Yi,j is any observation for which X1 = i and X2 = j
   X1 is the primary factor
   X2 is the blocking factor
   μ is the general location parameter (i.e., the mean)
   Ti is the effect for being in treatment i (of factor X1)
   Bj is the effect for being in block j (of factor X2)

Estimates for a Randomized Block Design

Estimating factor effects for a randomized block design:
   Estimate for μ: ȳ = the average of all the data
   Estimate for Ti: ȳi - ȳ, with ȳi = average of all Y for which X1 = i
   Estimate for Bj: ȳj - ȳ, with ȳj = average of all Y for which X2 = j

5.3.3.2.1. Latin square and related designs

Latin square (and related) designs are efficient designs to block from 2 to 4 nuisance factors: Latin square designs, and the related Graeco-Latin square and Hyper-Graeco-Latin square designs, are a special type of comparative design.

There is a single factor of primary interest, typically called the treatment factor, and several nuisance factors. For Latin square designs there are 2 nuisance factors, for Graeco-Latin square designs there are 3 nuisance factors, and for Hyper-Graeco-Latin square designs there are 4 nuisance factors.

Nuisance factors used as blocking variables: The nuisance factors are used as blocking variables.
1. For Latin square designs, the 2 nuisance factors are divided into a tabular grid with the property that each row and each column receive each treatment exactly once.
2. As with the Latin square design, a Graeco-Latin square design is a kxk tabular grid in which k is the number of levels of the treatment factor. However, it uses 3 blocking variables instead of the 2 used by the standard Latin square design.
3. A Hyper-Graeco-Latin square design is also a kxk tabular grid with k denoting the number of levels of the treatment factor. However, it uses 4 blocking variables instead of the 2 used by the standard Latin square design.
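To make the grid property concrete, the Python sketch below builds a k x k Latin square by the simple cyclic construction, in which each row is a one-step shift of the previous row, so every treatment appears exactly once in each row and each column. This is an illustration only, not one of the Handbook's tabulated designs, and in practice the chosen square would still be randomized as described on the following pages.

# Sketch: a cyclic k x k Latin square. Rows index one blocking factor,
# columns the other; the entry is the treatment level.
def cyclic_latin_square(k):
    return [[(row + col) % k + 1 for col in range(k)] for row in range(k)]

for row in cyclic_latin_square(4):
    print(row)
# [1, 2, 3, 4]
# [2, 3, 4, 1]
# [3, 4, 1, 2]
# [4, 1, 2, 3]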

Advantages and disadvantages of Latin square designs: The advantages of Latin square designs are:
1. They handle the case when we have several nuisance factors and we either cannot combine them into a single factor or we wish to keep them separate.
2. They allow experiments with a relatively small number of runs.
The disadvantages are:
1. The number of levels of each blocking variable must equal the number of levels of the treatment factor.
2. The Latin square model assumes that there are no interactions between the blocking variables or between the treatment variable and the blocking variable.
Note that Latin square designs are equivalent to specific fractional factorial designs (e.g., the 4x4 Latin square design is equivalent to a 4^(3-1) fractional factorial design).

Summary of designs: Several useful designs are described in the table below.

Some Useful Latin Square, Graeco-Latin Square and Hyper-Graeco-Latin Square Designs
Name of Design                      Number of Factors k    Number of Runs N
3-by-3 Latin Square                 3                      9
4-by-4 Latin Square                 3                      16
5-by-5 Latin Square                 3                      25
3-by-3 Graeco-Latin Square          4                      9
4-by-4 Graeco-Latin Square          4                      16
5-by-5 Graeco-Latin Square          4                      25
4-by-4 Hyper-Graeco-Latin Square    5                      16
5-by-5 Hyper-Graeco-Latin Square    5                      25

Model for Latin Square and Related Designs

Latin square design model and estimates for effect levels: The model for a response for a Latin square design is

   Yijk = μ + Ri + Cj + Tk + random error

with
   Yijk denoting any observation for which X1 = i, X2 = j, X3 = k
   X1 and X2 are blocking factors
   X3 is the primary factor
   μ denoting the general location parameter
   Ri denoting the effect for block i
   Cj denoting the effect for block j
   Tk denoting the effect for treatment k

Models for Graeco-Latin and Hyper-Graeco-Latin squares are the obvious extensions of the Latin square model, with additional blocking variables added.

Estimates for Latin Square Designs

Estimates:
   Estimate for μ: ȳ = the average of all the data
   Estimate for Ri: ȳi - ȳ, with ȳi = average of all Y for which X1 = i
   Estimate for Cj: ȳj - ȳ, with ȳj = average of all Y for which X2 = j
   Estimate for Tk: ȳk - ȳ, with ȳk = average of all Y for which X3 = k

Randomize as much as design allows: Designs for Latin squares with 3-, 4-, and 5-level factors are given next. These designs show what the treatment combinations should be for each run. When using any of these designs, be sure to randomize the treatment units and trial order, as much as the design allows. For example, one recommendation is that a Latin square design be randomly selected from those available, then randomize the run order.

Latin Square Designs for 3-, 4-, and 5-Level Factors



5.3.3.2.1. Latin square and related designs 5.3.3.2.1. Latin square and related designs

3 2 4
Designs for 3-Level Factors 3 3 3
3-level X1 X2 X3 3 4 1
factors (and 2 row column treatment 4 1 3
nuisance or blocking blocking factor 4 2 1
blocking factor factor 4 3 2
factors) 4 4 4
1 1 1
1 2 2 with
1 3 3 k = 3 factors (2 blocking factors and 1 primary factor)
2 1 3 L1 = 4 levels of factor X1 (block)
2 2 1 L2 = 4 levels of factor X2 (block)
2 3 2 L3 = 4 levels of factor X3 (primary)
3 1 2 N = L1 * L2 = 16 runs
3 2 3 This can alternatively be represented as
3 3 1
A B D C
with
D C A B
k = 3 factors (2 blocking factors and 1 primary factor) B D C A
L1 = 3 levels of factor X1 (block)
C A B D
L2 = 3 levels of factor X2 (block)
L3 = 3 levels of factor X3 (primary)
Designs for 5-Level Factors
N = L1 * L2 = 9 runs
5-level X1 X2 X3
This can alternatively be represented as factors (and 2 row column treatment
A B C nuisance or blocking blocking factor
blocking factor factor
C A B
factors)
B C A 1 1 1
1 2 2
Designs for 4-Level Factors 1 3 3
4-level X1 X2 X3 1 4 4
factors (and 2 row column treatment 1 5 5
nuisance or blocking blocking factor 2 1 3
blocking factor factor 2 2 4
factors) 2 3 5
1 1 1
2 4 1
1 2 2
2 5 2
1 3 4
3 1 5
1 4 3
3 2 1
2 1 4
3 3 2
2 2 3
3 4 3
2 3 1
3 5 4
2 4 2
4 1 2
3 1 2


4 2 3
4 3 4
4 4 5
4 5 1
5 1 4 5. Process Improvement
5 2 5 5.3. Choosing an experimental design
5 3 1 5.3.3. How do you select an experimental design?
5 4 2 5.3.3.2. Randomized block designs
5 5 3
with 5.3.3.2.2. Graeco-Latin square designs
k = 3 factors (2 blocking factors and 1 primary factor)
L1 = 5 levels of factor X1 (block) These Graeco-Latin squares, as described on the previous page, are efficient
L2 = 5 levels of factor X2 (block) designs designs to study the effect of one treatment factor in the presence of 3
L3 = 5 levels of factor X3 (primary) handle 3 nuisance factors. They are restricted, however, to the case in which all
N = L1 * L2 = 25 runs nuisance the factors have the same number of levels.
This can alternatively be represented as factors

A B C D E Randomize Designs for 3-, 4-, and 5-level factors are given on this page. These
C D E A B as much as designs show what the treatment combinations would be for each run.
E A B C D design When using any of these designs, be sure to randomize the treatment
B C D E A allows units and trial order, as much as the design allows.
D E A B C For example, one recommendation is that a Graeco-Latin square design
be randomly selected from those available, then randomize the run
Further More details on Latin square designs can be found in Box, Hunter, and order.
information Hunter (1978).
Graeco-Latin Square Designs for 3-, 4-, and 5-Level Factors

Designs for 3-Level Factors


3-level X1 X2 X3 X4
factors row column blocking treatment
blocking blocking factor factor
factor factor
1 1 1 1
1 2 2 2
1 3 3 3
2 1 2 3
2 2 3 1
2 3 1 2
3 1 3 2
3 2 1 3
3 3 2 1


with N = L1 * L2 = 16 runs
k = 4 factors (3 blocking factors and 1 primary factor) This can alternatively be represented as (A, B, C, and D represent the
L1 = 3 levels of factor X1 (block) treatment factor and 1, 2, 3, and 4 represent the blocking factor):
L2 = 3 levels of factor X2 (block) A1 B2 C3 D4
L3 = 3 levels of factor X3 (primary) D2 C1 B4 A3
L4 = 3 levels of factor X4 (primary) B3 A4 D1 C2
N = L1 * L2 = 9 runs C4 D3 A2 B1
This can alternatively be represented as (A, B, and C represent the
treatment factor and 1, 2, and 3 represent the blocking factor):
Designs for 5-Level Factors
A1 B2 C3 5-level X1 X2 X3 X4
C2 A3 B1 factors row column blocking treatment
B3 C1 A2 blocking blocking factor factor
factor factor
Designs for 4-Level Factors 1 1 1 1
4-level X1 X2 X3 X4 1 2 2 2
factors row column blocking treatment 1 3 3 3
blocking blocking factor factor 1 4 4 4
factor factor 1 5 5 5
2 1 2 3
1 1 1 1
2 2 3 4
1 2 2 2
2 3 4 5
1 3 3 3
2 4 5 1
1 4 4 4
2 5 1 2
2 1 2 4
3 1 3 5
2 2 1 3
3 2 4 1
2 3 4 2
3 3 5 2
2 4 3 1
3 4 1 3
3 1 3 2
3 5 2 4
3 2 4 1
4 1 4 2
3 3 1 4
4 2 5 3
3 4 2 3
4 3 1 4
4 1 4 3
4 4 2 5
4 2 3 4
4 5 3 1
4 3 2 1
5 1 5 4
4 4 1 2
5 2 1 5
with 5 3 2 1
k = 4 factors (3 blocking factors and 1 primary factor) 5 4 3 2
L1 = 3 levels of factor X1 (block) 5 5 4 3
L2 = 3 levels of factor X2 (block) with
L3 = 3 levels of factor X3 (primary)
k = 4 factors (3 blocking factors and 1 primary factor)
L4 = 3 levels of factor X4 (primary) L1 = 3 levels of factor X1 (block)


L2 = 3 levels of factor X2 (block)


L3 = 3 levels of factor X3 (primary)
L4 = 3 levels of factor X4 (primary)
N = L1 * L2 = 25 runs
This can alternatively be represented as (A, B, C, D, and E represent the 5. Process Improvement
5.3. Choosing an experimental design
treatment factor and 1, 2, 3, 4, and 5 represent the blocking factor):
5.3.3. How do you select an experimental design?
A1 B2 C3 D4 E5 5.3.3.2. Randomized block designs
C2 D3 E4 A5 B1
E3 A4 B5 C1 D2
B4 C5 D1 E2 A3
5.3.3.2.3. Hyper-Graeco-Latin square
D5 E1 A2 B3 C4 designs
Further More designs are given in Box, Hunter, and Hunter (1978). These designs Hyper-Graeco-Latin squares, as described earlier, are efficient designs
information handle 4 to study the effect of one treatment factor in the presence of 4 nuisance
nuisance factors. They are restricted, however, to the case in which all the
factors factors have the same number of levels.

Randomize as Designs for 4- and 5-level factors are given on this page. These
much as designs show what the treatment combinations should be for each run.
design allows When using any of these designs, be sure to randomize the treatment
units and trial order, as much as the design allows.
For example, one recommendation is that a hyper-Graeco-Latin square
design be randomly selected from those available, then randomize the
run order.

Hyper-Graeco-Latin Square Designs for 4- and 5-Level Factors

Designs for 4-Level Factors


4-level factors X1 X2 X3 X4 X5
(there are no row column blocking blocking treatment
3-level factor blocking blocking factor factor factor
Hyper-Graeco factor factor
Latin square
designs) 1 1 1 1 1
1 2 2 2 2
1 3 3 3 3
1 4 4 4 4
2 1 4 2 3
2 2 3 1 4
2 3 2 4 1
2 4 1 3 2
3 1 2 3 4


3 2 1 4 3 3 4 1 3 5
3 3 4 1 2 3 5 2 4 1
3 4 3 2 1 4 1 4 2 5
4 1 3 4 2 4 2 5 3 1
4 2 4 3 1 4 3 1 4 2
4 3 1 2 4 4 4 2 5 3
4 4 2 1 3 4 5 3 1 4
with 5 1 5 4 3
5 2 1 5 4
k = 5 factors (4 blocking factors and 1 primary factor)
L1 = 4 levels of factor X1 (block) 5 3 2 1 5
5 4 3 2 1
L2 = 4 levels of factor X2 (block)
5 5 4 3 2
L3 = 4 levels of factor X3 (primary)
with
L4 = 4 levels of factor X4 (primary)
L5 = 4 levels of factor X5 (primary) k = 5 factors (4 blocking factors and 1 primary factor)
L1 = 5 levels of factor X1 (block)
N = L1 * L2 = 16 runs
L2 = 5 levels of factor X2 (block)
This can alternatively be represented as (A, B, C, and D represent the
L3 = 5 levels of factor X3 (primary)
treatment factor and 1, 2, 3, and 4 represent the blocking factors):
L4 = 5 levels of factor X4 (primary)
A11 B22 C33 D44 L5 = 5 levels of factor X5 (primary)
C42 D31 A24 B13 N = L1 * L2 = 25 runs
D23 C14 B41 A32
This can alternatively be represented as (A, B, C, D, and E represent
B34 A43 D12 C21 the treatment factor and 1, 2, 3, 4, and 5 represent the blocking
factors):
Designs for 5-Level Factors A11 B22 C33 D44 E55
5-level factors X1 X2 X3 X4 X5 D23 E34 A45 B51 C12
row column blocking blocking treatment
B35 C41 D52 E31 A24
blocking blocking factor factor factor
factor factor E42 A53 B14 C25 D31
C54 D15 E21 A32 B43
1 1 1 1 1
1 2 2 2 2 Further More designs are given in Box, Hunter, and Hunter (1978).
1 3 3 3 3 information
1 4 4 4 4
1 5 5 5 5
2 1 2 3 4
2 2 3 4 5
2 3 4 5 1
2 4 5 1 2
2 5 1 2 3
3 1 3 5 2
3 2 4 1 3
3 3 5 2 4



5.3.3.3. Full factorial designs

Full factorial designs in two levels

A design in which every setting of every factor appears with every setting of every other factor is a full factorial design: A common experimental design is one with all input factors set at two levels each. These levels are called `high' and `low' or `+1' and `-1', respectively. A design with all possible high/low combinations of all the input factors is called a full factorial design in two levels.

If there are k factors, each at 2 levels, a full factorial design has 2^k runs.

TABLE 3.2 Number of Runs for a 2^k Full Factorial
Number of Factors    Number of Runs
2                    4
3                    8
4                    16
5                    32
6                    64
7                    128

Full factorial designs not recommended for 5 or more factors: As shown by the above table, when the number of factors is 5 or greater, a full factorial design requires a large number of runs and is not very efficient. As recommended in the Design Guideline Table, a fractional factorial design or a Plackett-Burman design is a better choice for 5 or more factors.

5.3.3.3.1. Two-level full factorial designs

Description

Graphical representation of a two-level design with 3 factors: Consider the two-level, full factorial design for three factors, namely the 2^3 design. This implies eight runs (not counting replications or center point runs). Graphically, we can represent the 2^3 design by the cube shown in Figure 3.1. The arrows show the direction of increase of the factors. The numbers `1' through `8' at the corners of the design box reference the `Standard Order' of runs (see Figure 3.1).

FIGURE 3.1 A 2^3 two-level, full factorial design; factors X1, X2, X3


The design matrix: In tabular form, this design is given by:

TABLE 3.3 A 2^3 two-level, full factorial design table showing runs in `Standard Order'
run   X1   X2   X3
 1    -1   -1   -1
 2    +1   -1   -1
 3    -1   +1   -1
 4    +1   +1   -1
 5    -1   -1   +1
 6    +1   -1   +1
 7    -1   +1   +1
 8    +1   +1   +1

The left-most column of Table 3.3, numbers 1 through 8, specifies a (non-randomized) run order called the `Standard Order.' These numbers are also shown in Figure 3.1. For example, run 1 is made at the `low' setting of all three factors.

Standard Order for a 2^k Level Factorial Design

Rule for writing a 2^k full factorial in "standard order": We can readily generalize the 2^3 standard order matrix to a 2-level full factorial with k factors. The first (X1) column starts with -1 and alternates in sign for all 2^k runs. The second (X2) column starts with -1 repeated twice, then alternates with 2 in a row of the opposite sign until all 2^k places are filled. The third (X3) column starts with -1 repeated 4 times, then 4 repeats of +1's and so on. In general, the i-th column (Xi) starts with 2^(i-1) repeats of -1 followed by 2^(i-1) repeats of +1.
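The rule above translates directly into a short program. The Python sketch below is an illustration (not part of the Handbook): it generates the 2^k design matrix in standard order, with column i built from alternating blocks of 2^(i-1) values of -1 and +1.

# Sketch: a 2-level full factorial in standard (Yates) order.
def standard_order(k):
    runs = 2 ** k
    design = []
    for run in range(runs):
        # column i alternates blocks of 2**i low values and 2**i high values
        row = [(-1 if (run // 2 ** i) % 2 == 0 else +1) for i in range(k)]
        design.append(row)
    return design

for row in standard_order(3):
    print(row)
# [-1, -1, -1], [1, -1, -1], [-1, 1, -1], ..., [1, 1, 1]  (matches Table 3.3)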
Example of a 2^3 Experiment

Analysis matrix for the 3-factor complete factorial: An engineering experiment called for running three factors, namely Pressure (factor X1), Table speed (factor X2) and Down force (factor X3), each at a `high' and `low' setting, on a production tool to determine which had the greatest effect on product uniformity. Two replications were run at each setting. A (full factorial) 2^3 design with 2 replications calls for 8*2=16 runs.

TABLE 3.4 Model or Analysis Matrix for a 2^3 Experiment
                          Model Matrix                          Response Variables
 I   X1   X2   X1*X2   X3   X1*X3   X2*X3   X1*X2*X3             Rep 1   Rep 2
+1   -1   -1    +1     -1    +1      +1       -1                   -3      -1
+1   +1   -1    -1     -1    -1      +1       +1                    0      -1
+1   -1   +1    -1     -1    +1      -1       +1                   -1       0
+1   +1   +1    +1     -1    -1      -1       -1                   +2      +3
+1   -1   -1    +1     +1    -1      -1       +1                   -1       0
+1   +1   -1    -1     +1    +1      -1       -1                   +2      +1
+1   -1   +1    -1     +1    -1      +1       -1                   +1      +1
+1   +1   +1    +1     +1    +1      +1       +1                   +6      +5

The block with the 1's and -1's is called the Model Matrix or the Analysis Matrix. The table formed by the columns X1, X2 and X3 is called the Design Table or Design Matrix.

Orthogonality Properties of Analysis Matrices for 2-Factor Experiments

Eliminate correlation between estimates of main effects and interactions: When all factors have been coded so that the high value is "1" and the low value is "-1", the design matrix for any full (or suitably chosen fractional) factorial experiment has columns that are all pairwise orthogonal and all the columns (except the "I" column) sum to 0.

The orthogonality property is important because it eliminates correlation between the estimates of the main effects and interactions.



Graphical We want to try various combinations of these settings so as to establish


representation the best way to run the polisher. There are eight different ways of
of the factor combining high and low settings of Speed, Feed, and Depth. These eight
5. Process Improvement level settings are shown at the corners of the following diagram.
5.3. Choosing an experimental design
FIGURE 3.2 A 23 Two-level, Full Factorial Design; Factors X1, X2,
5.3.3. How do you select an experimental design?
5.3.3.3. Full factorial designs
X3. (The arrows show the direction of increase of the factors.)

5.3.3.3.2. Full factorial example


A Full Factorial Design Example

An example of The following is an example of a full factorial design with 3 factors that
a full factorial also illustrates replication, randomization, and added center points.
design with 3
factors Suppose that we wish to improve the yield of a polishing operation. The
three inputs (factors) that are considered important to the operation are
Speed (X1), Feed (X2), and Depth (X3). We want to ascertain the relative
importance of each of these factors on Yield (Y).
Speed, Feed and Depth can all be varied continuously along their
respective scales, from a low to a high setting. Yield is observed to vary
smoothly when progressive changes are made to the inputs. This leads us
to believe that the ultimate response surface for Y will be smooth.

Table of factor level settings:

TABLE 3.5 High (+1), Low (-1), and Standard (0) Settings for a Polishing Operation
        Low (-1)   Standard (0)   High (+1)   Units
Speed   16         20             24          rpm
Feed    0.001      0.003          0.005       cm/sec
Depth   0.01       0.015          0.02        cm

Factor Combinations

2^3 implies 8 runs: Note that if we have k factors, each run at two levels, there will be 2^k different combinations of the levels. In the present case, k = 3 and 2^3 = 8.

Full Model: Running the full complement of all possible factor combinations means that we can estimate all the main and interaction effects. There are three main effects, three two-factor interactions, and a three-factor interaction, all of which appear in the full model as follows:

   Y = β0 + β1*X1 + β2*X2 + β3*X3 + β12*X1*X2 + β13*X1*X3 + β23*X2*X3 + β123*X1*X2*X3 + experimental error

A full factorial design allows us to estimate all eight `beta' coefficients.

Standard order




Coded The numbering of the corners of the box in the last figure refers to a Factor settings We now have constructed a design table for a two-level full factorial in
variables in standard way of writing down the settings of an experiment called in standard three factors, replicated twice.
standard order `standard order'. We see standard order displayed in the following tabular order with
representation of the eight-cornered box. Note that the factor settings have replication TABLE 3.7 The 23 Full Factorial Replicated
been coded, replacing the low setting by -1 and the high setting by 1. Twice and Presented in Standard Order
Speed, X1 Feed, X2 Depth, X3
Factor settings TABLE 3.6 A 23 Two-level, Full Factorial Design 1 16, -1 .001, -1 .01, -1
in tabular Table Showing Runs in `Standard Order' 2 24, +1 .001, -1 .01, -1
form X1 X2 X3 3 16, -1 .005, +1 .01, -1
1 -1 -1 -1 4 24, +1 .005, +1 .01, -1
2 +1 -1 -1 5 16, -1 .001, -1 .02, +1
3 -1 +1 -1 6 24, +1 .001, -1 .02, +1
4 +1 +1 -1 7 16, -1 .005, +1 .02, +1
5 -1 -1 +1 8 24, +1 .005, +1 .02, +1
6 +1 -1 +1 9 16, -1 .001, -1 .01, -1
7 -1 +1 +1 10 24, +1 .001, -1 .01, -1
8 +1 +1 +1 11 16, -1 .005, +1 .01, -1
12 24, +1 .005, +1 .01, -1
Replication 13 16, -1 .001, -1 .02, +1
14 24, +1 .001, -1 .02, +1
Replication Running the entire design more than once makes for easier data analysis 15 16, -1 .005, +1 .02, +1
provides because, for each run (i.e., `corner of the design box') we obtain an 16 24, +1 .005, +1 .02, +1
information on average value of the response as well as some idea about the dispersion
variability (variability, consistency) of the response at that setting.
Randomization
Homogeneity One of the usual analysis assumptions is that the response dispersion is
of variance uniform across the experimental space. The technical term is No If we now ran the design as is, in the order shown, we would have two
`homogeneity of variance'. Replication allows us to check this assumption randomization deficiencies, namely:
and possibly find the setting combinations that give inconsistent yields, and no center 1. no randomization, and
allowing us to avoid that area of the factor space. points
2. no center points.


Randomization The more freely one can randomize experimental runs, the more insurance Table showing This design would be improved by adding at least 3 centerpoint runs
provides one has against extraneous factors possibly affecting the results, and design matrix placed at the beginning, middle and end of the experiment. The final
protection hence perhaps wasting our experimental time and effort. For example, with design matrix is shown below:
against consider the `Depth' column: the settings of Depth, in standard order, randomization
extraneous follow a `four low, four high, four low, four high' pattern. and center TABLE 3.9 The 23 Full Factorial Replicated
factors points Twice with Random Run Order Indicated and
affecting the Suppose now that four settings are run in the day and four at night, and Center Point Runs Added
results that (unknown to the experimenter) ambient temperature in the polishing Random Standard
shop affects Yield. We would run the experiment over two days and two Order Order X1 X2 X3
nights and conclude that Depth influenced Yield, when in fact ambient
temperature was the significant influence. So the moral is: Randomize 1 0 0 0
experimental runs as much as possible. 2 5 -1 -1 +1
3 15 -1 +1 +1
Table of factor Here's the design matrix again with the rows randomized (using the 4 9 -1 -1 -1
settings in RAND function of EXCEL). The old standard order column is also shown 5 7 -1 +1 +1
randomized for comparison and for re-sorting, if desired, after the runs are in. 6 3 -1 +1 -1
order
TABLE 3.8 The 23 Full Factorial Replicated 7 12 +1 +1 -1
Twice with Random Run Order Indicated 8 6 +1 -1 +1
Random Standard 9 0 0 0
Order Order X1 X2 X3 10 4 +1 +1 -1
1 5 -1 -1 +1 11 2 +1 -1 -1
2 15 -1 +1 +1 12 13 -1 -1 +1
3 9 -1 -1 -1 13 8 +1 +1 +1
4 7 -1 +1 +1 14 16 +1 +1 +1
5 3 -1 +1 -1 15 1 -1 -1 -1
6 12 +1 +1 -1 16 14 +1 -1 +1
7 6 +1 -1 +1 17 11 -1 +1 -1
8 4 +1 +1 -1 18 10 +1 -1 -1
9 2 +1 -1 -1 19 0 0 0
10 13 -1 -1 +1
11 8 +1 +1 +1
12 16 +1 +1 +1
13 1 -1 -1 -1
14 14 +1 -1 +1
15 11 -1 +1 -1
16 10 +1 -1 -1
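The same construction (two replications of the eight corner runs, a randomized run order, and center point runs at the beginning, middle and end) can be sketched in code. The Python below is an illustration only, not the Handbook's spreadsheet-based procedure, and the random order it produces will differ from Tables 3.8 and 3.9 on every execution.

# Sketch: a replicated 2^3 design in coded units with randomized run
# order and center points at the start, middle and end of the plan.
import itertools
import random

base = [list(run) for run in itertools.product([-1, +1], repeat=3)]
runs = base * 2                      # two replications of each corner
random.shuffle(runs)                 # randomized run order

center = [0, 0, 0]
half = len(runs) // 2
plan = [center] + runs[:half] + [center] + runs[half:] + [center]

for i, settings in enumerate(plan, start=1):
    print(i, settings)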


Graphical FIGURE 3.3 Blocking Scheme for a 23 Using Alternate Corners


representation
of blocking
5. Process Improvement scheme
5.3. Choosing an experimental design
5.3.3. How do you select an experimental design?
5.3.3.3. Full factorial designs

5.3.3.3.3. Blocking of full factorial designs


Eliminate the We often need to eliminate the influence of extraneous factors when
influence of running an experiment. We do this by "blocking".
extraneous
factors by Previously, blocking was introduced when randomized block designs
"blocking" were discussed. There we were concerned with one factor in the
presence of one of more nuisance factors. In this section we look at a
general approach that enables us to divide 2-level factorial
experiments into blocks.
For example, assume we anticipate predictable shifts will occur while
an experiment is being run. This might happen when one has to
change to a new batch of raw materials halfway through the Three-factor This works because we are in fact assigning the `estimation' of the
experiment. The effect of the change in raw materials is well known, interaction (unwanted) blocking effect to the three-factor interaction, and because
and we want to eliminate its influence on the subsequent data analysis. confounded of the special property of two-level designs called orthogonality. That
with the block is, the three-factor interaction is "confounded" with the block effect as
Blocking in a In this case, we need to divide our experiment into two halves (2 effect will be seen shortly.
23 factorial blocks), one with the first raw material batch and the other with the
design new batch. The division has to balance out the effect of the materials Orthogonality Orthogonality guarantees that we can always estimate the effect of one
change in such a way as to eliminate its influence on the analysis, and factor or interaction clear of any influence due to any other factor or
we do this by blocking. interaction. Orthogonality is a very desirable property in DOE and this
is a major reason why two-level factorials are so popular and
Example Example: An eight-run 23 full factorial has to be blocked into two successful.
groups of four runs each. Consider the design `box' for the 23 full
factorial. Blocking can be achieved by assigning the first block to the
dark-shaded corners and the second block to the open circle corners.


Table Formally, consider the 23 design table with the three-factor interaction
showing column added.
blocking
scheme TABLE 3.10 Two Blocks for a 23 Design
SPEED FEED DEPTH BLOCK
X1 X2 X3 X1*X2*X3
-1 -1 -1 -1 I
+1 -1 -1 +1 II
-1 +1 -1 +1 II
+1 +1 -1 -1 I
-1 -1 +1 +1 II
+1 -1 +1 -1 I
-1 +1 +1 -1 I
+1 +1 +1 +1 II

Block by assigning the "Block effect" to a high-order interaction: Rows that have a `-1' in the three-factor interaction column are assigned to `Block I' (rows 1, 4, 6, 7), while the other rows are assigned to `Block II' (rows 2, 3, 5, 8). Note that the Block I rows are the open circle corners of the design `box' above; Block II are the dark-shaded corners.
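This assignment rule is mechanical enough to express in a few lines of code. The Python sketch below is an illustration only: it forms the X1*X2*X3 column for the eight runs and splits them into the two blocks of Table 3.10.

# Sketch: assign each run of the 2^3 design to a block using the sign
# of the three-factor interaction column, as in Table 3.10.
import itertools

blocks = {"I": [], "II": []}
for x1, x2, x3 in itertools.product([-1, +1], repeat=3):
    interaction = x1 * x2 * x3
    block = "I" if interaction == -1 else "II"
    blocks[block].append((x1, x2, x3))

for name, runs in blocks.items():
    print("Block", name, runs)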

Most DOE The general rule for blocking is: use one or a combination of
software will high-order interaction columns to construct blocks. This gives us a
do blocking formal way of blocking complex designs. Apart from simple cases in
for you which you can design your own blocks, your statistical/DOE software
will do the blocking if asked, but you do need to understand the
principle behind it.

Block effects The price you pay for blocking by using high-order interaction
are columns is that you can no longer distinguish the high-order
confounded interaction(s) from the blocking effect - they have been `confounded,'
with higher- or `aliased.' In fact, the blocking effect is now the sum of the blocking
order effect and the high-order interaction effect. This is fine as long as our
interactions assumption about negligible high-order interactions holds true, which
it usually does.

Center points Within a block, center point runs are assigned as if the block were a
within a block separate experiment - which in a sense it is. Randomization takes place
within a block as it would for any non-blocked DOE.


5. Process Improvement
5.3. Choosing an experimental design
5.3.3. How do you select an experimental design?

5.3.3.4. Fractional factorial designs


Full factorial The ASQC (1983) Glossary & Tables for Statistical Quality Control
experiments defines fractional factorial design in the following way: "A factorial
can require experiment in which only an adequately chosen fraction of the
many runs treatment combinations required for the complete factorial experiment
is selected to be run."

A carefully chosen fraction of the runs may be all that is necessary: Even if the number of factors, k, in a design is small, the 2^k runs specified for a full factorial can quickly become very large. For example, 2^6 = 64 runs are needed for a two-level, full factorial design with six factors. To this design we need to add a good number of centerpoint runs and we can thus quickly run up a very large resource requirement for runs with only a modest number of factors.

Later The solution to this problem is to use only a fraction of the runs
sections will specified by the full factorial design. Which runs to make and which to
show how to leave out is the subject of interest here. In general, we pick a fraction
choose the such as ½, ¼, etc. of the runs called for by the full factorial. We use
"right" various strategies that ensure an appropriate choice of runs. The
fraction for following sections will show you how to choose an appropriate fraction
2-level of a full factorial design to suit your purpose at hand. Properly chosen
designs - fractional factorial designs for 2-level experiments have the desirable
these are properties of being both balanced and orthogonal.
both
balanced and
orthogonal

2-Level Note: We will be emphasizing fractions of two-level designs only. This


fractional is because two-level fractional designs are, in engineering at least, by
factorial far the most popular fractional designs. Fractional factorials where
designs some factors have three levels will be covered briefly in Section
emphasized 5.3.3.10.

5.3.3.4.1. A 2^(3-1) design (half of a 2^3)

We can run a fraction of a full factorial experiment and still be able to estimate main effects: Consider the two-level, full factorial design for three factors, namely the 2^3 design. This implies eight runs (not counting replications or center points). Graphically, as shown earlier, we can represent the 2^3 design by the following cube:

FIGURE 3.4 A 2^3 Full Factorial Design; Factors X1, X2, X3. (The arrows show the direction of increase of the factors. Numbers `1' through `8' at the corners of the design cube reference the `Standard Order' of runs)

Tabular representation of the design: In tabular form, this design (also showing eight observations `yj', j = 1,...,8) is given by

TABLE 3.11 A 2^3 Two-level, Full Factorial Design Table Showing Runs in `Standard Order,' Plus Observations (yj)
     X1   X2   X3     Y
1    -1   -1   -1     y1 = 33
2    +1   -1   -1     y2 = 63
3    -1   +1   -1     y3 = 41
4    +1   +1   -1     y4 = 57
5    -1   -1   +1     y5 = 57
6    +1   -1   +1     y6 = 51
7    -1   +1   +1     y7 = 59
8    +1   +1   +1     y8 = 53

Responses in standard order: The right-most column of the table lists `y1' through `y8' to indicate the responses measured for the experimental runs when listed in standard order. For example, `y1' is the response (i.e., output) observed when the three factors were all run at their `low' setting. The numbers entered in the "y" column will be used to illustrate calculations of effects.

Computing X1 main effect: From the entries in the table we are able to compute all `effects' such as main effects, first-order `interaction' effects, etc. For example, to compute the main effect estimate `c1' of factor X1, we compute the average response at all runs with X1 at the `high' setting, namely (1/4)(y2 + y4 + y6 + y8), minus the average response of all runs with X1 set at `low,' namely (1/4)(y1 + y3 + y5 + y7). That is,

   c1 = (1/4)(y2 + y4 + y6 + y8) - (1/4)(y1 + y3 + y5 + y7) or
   c1 = (1/4)(63 + 57 + 51 + 53) - (1/4)(33 + 41 + 57 + 59) = 8.5
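As an arithmetic check, the same estimate can be computed directly from Table 3.11. The short Python sketch below is an illustration (not part of the Handbook): it reproduces c1 = 8.5 from the eight responses, and the same helper gives the corresponding estimates for X2 and X3.

# Sketch: main effect estimates from the eight responses of Table 3.11.
# The effect of a factor is the average response at its 'high' runs
# minus the average response at its 'low' runs.
x1 = [-1, +1, -1, +1, -1, +1, -1, +1]
x2 = [-1, -1, +1, +1, -1, -1, +1, +1]
x3 = [-1, -1, -1, -1, +1, +1, +1, +1]
y  = [33, 63, 41, 57, 57, 51, 59, 53]

def main_effect(column, response):
    hi = [r for c, r in zip(column, response) if c == +1]
    lo = [r for c, r in zip(column, response) if c == -1]
    return sum(hi) / len(hi) - sum(lo) / len(lo)

print("c1 =", main_effect(x1, y))   # 8.5, as computed in the text
print("c2 =", main_effect(x2, y))
print("c3 =", main_effect(x3, y))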

Can we estimate the X1 main effect with four runs? Suppose, however, that we only have enough resources to do four runs. Is it still possible to estimate the main effect for X1? Or any other main effect? The answer is yes, and there are even different choices of the four runs that will accomplish this.


5.3.3.4.1. A 23-1 design (half of a 23) 5.3.3.4.1. A 23-1 design (half of a 23)

Example of computing the main effects using only four runs
For example, suppose we select only the four light (unshaded) corners
of the design cube. Using these four runs (1, 4, 6 and 7), we can still
compute c1 as follows:
    c1 = (1/2)(y4 + y6) - (1/2)(y1 + y7) or
    c1 = (1/2)(57 + 51) - (1/2)(33 + 59) = 8.
Similarly, we would compute c2, the effect due to X2, as
    c2 = (1/2)(y4 + y7) - (1/2)(y1 + y6) or
    c2 = (1/2)(57 + 59) - (1/2)(33 + 51) = 16.
Finally, the computation of c3 for the effect due to X3 would be
    c3 = (1/2)(y6 + y7) - (1/2)(y1 + y4) or
    c3 = (1/2)(51 + 59) - (1/2)(33 + 57) = 10.

Alternative runs for computing main effects
We could also have used the four dark (shaded) corners of the design
cube for our runs and obtained similar, but slightly different,
estimates for the main effects. In either case, we would have used half
the number of runs that the full factorial requires. The half fraction we
used is a new design written as 2^(3-1). Note that 2^(3-1) = 2^3/2 = 2^2 = 4,
which is the number of runs in this half-fraction design. In the next
section, a general method for choosing fractions that "work" will be
discussed.

Example of how fractional factorial experiments often arise in industry
Example: An engineering experiment calls for running three factors,
namely Pressure, Table speed, and Down force, each at a `high' and a
`low' setting, on a production tool to determine which has the greatest
effect on product uniformity. Interaction effects are considered
negligible, but uniformity measurement error requires that at least two
separate runs (replications) be made at each process setting. In
addition, several `standard setting' runs (centerpoint runs) need to be
made at regular intervals during the experiment to monitor for process
drift. As experimental time and material are limited, no more than 15
runs can be planned.

A full factorial 2^3 design, replicated twice, calls for 8x2 = 16 runs,
even without centerpoint runs, so this is not an option. However, a 2^(3-1)
design replicated twice requires only 4x2 = 8 runs, and then we would
have 15-8 = 7 spare runs: 3 to 5 of these spare runs can be used for
centerpoint runs and the rest saved for backup in case something goes
wrong with any run. As long as we are confident that the interactions
are negligibly small (compared to the main effects), and as long as
complete replication is required, then the above replicated 2^(3-1)
fractional factorial design (with center points) is a very reasonable
choice.

On the other hand, if interactions are potentially large (and if the
replication required could be set aside), then the usual 2^3 full factorial
design (with center points) would serve as a good design.
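As a check on the half-fraction arithmetic, the following Python sketch (our own
illustration, not handbook code) recomputes c1, c2, and c3 from runs 1, 4, 6, and 7 only.

    # Minimal sketch: main effects estimated from only four runs (1, 4, 6, 7),
    # the light corners of the cube.  Data taken from Table 3.11; names are ours.
    X = [(-1, -1, -1), (+1, +1, -1), (+1, -1, +1), (-1, +1, +1)]   # runs 1, 4, 6, 7
    y = [33, 57, 51, 59]                                            # y1, y4, y6, y7

    def main_effect(factor):
        high = [resp for row, resp in zip(X, y) if row[factor] == +1]
        low  = [resp for row, resp in zip(X, y) if row[factor] == -1]
        return sum(high) / len(high) - sum(low) / len(low)

    print([main_effect(f) for f in range(3)])   # [8.0, 16.0, 10.0], matching c1, c2, c3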


5.3.3.4.2. Constructing the 2^(3-1) half-fraction design

Construction of a 2^(3-1) half-fraction design by starting with a 2^2 full factorial design
First note that, mathematically, 2^(3-1) = 2^2. This gives us the first step,
which is to start with a regular 2^2 full factorial design. That is, we start
with the following design table.

    TABLE 3.12 A Standard Order
    2^2 Full Factorial Design Table
         X1   X2
    1    -1   -1
    2    +1   -1
    3    -1   +1
    4    +1   +1

Assign the third factor to the interaction column of a 2^2 design
This design has four runs, the right number for a half-fraction of a 2^3,
but there is no column for factor X3. We need to add a third column to
take care of this, and we do it by adding the X1*X2 interaction column.
This column is, as you will recall from full factorial designs,
constructed by multiplying the row entry for X1 with that of X2 to
obtain the row entry for X1*X2.

    TABLE 3.13 A 2^2 Design Table
    Augmented with the X1*X2
    Interaction Column `X1*X2'
         X1   X2   X1*X2
    1    -1   -1    +1
    2    +1   -1    -1
    3    -1   +1    -1
    4    +1   +1    +1

Design table with X3 set to X1*X2
We may now substitute `X3' in place of `X1*X2' in this table.

    TABLE 3.14 A 2^(3-1) Design Table
    with Column X3 set to X1*X2
         X1   X2   X3
    1    -1   -1   +1
    2    +1   -1   -1
    3    -1   +1   -1
    4    +1   +1   +1

Design table with X3 set to -X1*X2
Note that the rows of Table 3.14 give the dark-shaded corners of the
design in Figure 3.4. If we had set X3 = -X1*X2 as the rule for
generating the third column of our 2^(3-1) design, we would have obtained:

    TABLE 3.15 A 2^(3-1) Design Table
    with Column X3 set to -X1*X2
         X1   X2   X3
    1    -1   -1   -1
    2    +1   -1   +1
    3    -1   +1   +1
    4    +1   +1   -1

Main effect estimates from a fractional factorial not as good as from a full factorial
This design gives the light-shaded corners of the box of Figure 3.4. Both
2^(3-1) designs that we have generated are equally good, and both save half
the number of runs over the original 2^3 full factorial design. If c1, c2,
and c3 are our estimates of the main effects for the factors X1, X2, X3
(i.e., the difference in the response due to going from "low" to "high"
for an effect), then the precision of the estimates c1, c2, and c3 is not
quite as good as for the full 8-run factorial because we only have four
observations to construct the averages instead of eight; this is one price
we have to pay for using fewer runs.

Example
Example: For the `Pressure (P), Table speed (T), and Down force (D)'
design situation of the previous example, here's a replicated 2^(3-1) in
randomized run order, with five centerpoint runs (`000') interspersed
among the runs. This design table was constructed using the technique
discussed above, with D = P*T.

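The construction rule above is easy to mechanize. The sketch below (ours, not from the
handbook; the helper name `half_fraction` is hypothetical) builds the tables for both
X3 = X1*X2 and X3 = -X1*X2 starting from the 2^2 full factorial.

    # Minimal sketch: construct a 2^(3-1) half fraction from a 2^2 full factorial
    # by setting X3 = X1*X2 (Table 3.14) or X3 = -X1*X2 (Table 3.15).
    def half_fraction(sign=+1):
        base = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]   # 2^2 full factorial, standard order
        return [(x1, x2, sign * x1 * x2) for x1, x2 in base]

    print(half_fraction(+1))   # [(-1, -1, 1), (1, -1, -1), (-1, 1, -1), (1, 1, 1)]  = Table 3.14
    print(half_fraction(-1))   # [(-1, -1, -1), (1, -1, 1), (-1, 1, 1), (1, 1, -1)]  = Table 3.15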


Design table for the example
    TABLE 3.16 A 2^(3-1) Design Replicated Twice,
    with Five Centerpoint Runs Added
                                      Center
          Pattern    P    T    D      Point
    1      000       0    0    0       1
    2      +--      +1   -1   -1       0
    3      -+-      -1   +1   -1       0
    4      000       0    0    0       1
    5      +++      +1   +1   +1       0
    6      --+      -1   -1   +1       0
    7      000       0    0    0       1
    8      +--      +1   -1   -1       0
    9      --+      -1   -1   +1       0
    10     000       0    0    0       1
    11     +++      +1   +1   +1       0
    12     -+-      -1   +1   -1       0
    13     000       0    0    0       1

5.3.3.4.3. Confounding (also called aliasing)

Confounding means we have lost the ability to estimate some effects and/or interactions
One price we pay for using the design table column X1*X2 to obtain
column X3 in Table 3.14 is, clearly, our inability to obtain an estimate of
the interaction effect for X1*X2 (i.e., c12) that is separate from an estimate
of the main effect for X3. In other words, we have confounded the main
effect estimate for factor X3 (i.e., c3) with the estimate of the interaction
effect for X1 and X2 (i.e., with c12). The whole issue of confounding is
fundamental to the construction of fractional factorial designs, and we will
spend time discussing it below.

Sparsity of effects assumption
In using the 2^(3-1) design, we also assume that c12 is small compared to c3;
this is called a `sparsity of effects' assumption. Our computation of c3 is in
fact a computation of c3 + c12. If the desired effects are only confounded
with non-significant interactions, then we are OK.

A Notation and Method for Generating Confounding or Aliasing

A short way of writing factor column multiplication
A short way of writing `X3 = X1*X2' (understanding that we are talking
about multiplying columns of the design table together) is: `3 = 12'
(similarly 3 = -12 refers to X3 = -X1*X2). Note that `12' refers to column
multiplication of the kind we are using to construct the fractional design,
and any column multiplied by itself gives the identity column of all 1's.

Next we multiply both sides of 3 = 12 by 3 and obtain 33 = 123, or I = 123,
since 33 = I (a column of all 1's). Playing around with this "algebra", we
see that 2I = 2123, or 2 = 2123, or 2 = 1223, or 2 = 13 (since 2I = 2, 22 = I,
and 1I3 = 13). Similarly, 1 = 23.

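This column "algebra" is just symmetric-difference arithmetic on sets of factor labels:
multiplying two words cancels any factor that appears twice. A small Python sketch (our
own notation; `mult` and `alias_of` are hypothetical helper names) makes the rule explicit.

    # Minimal sketch of the word algebra: a word is a set of factor labels,
    # the identity I is the empty word, and multiplication cancels repeated factors.
    def mult(word_a, word_b):
        """Multiply two words, e.g. mult('3', '12') -> '123'; repeated factors cancel."""
        return ''.join(sorted(set(word_a) ^ set(word_b)))   # symmetric difference

    def alias_of(effect, defining_word):
        """Effect aliased with `effect` under the defining relation I = defining_word."""
        return mult(effect, defining_word)

    print(mult('3', '3'))           # ''   -> the identity column I
    print(alias_of('3', '123'))     # '12' : 3 = 12
    print(alias_of('2', '123'))     # '13' : 2 = 13
    print(alias_of('1', '123'))     # '23' : 1 = 23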


Definition of "design generator" or "generating relation" and "defining relation"
I = 123 is called a design generator or a generating relation for this
2^(3-1) design (the dark-shaded corners of Figure 3.4). Since there is only one
design generator for this design, it is also the defining relation for the
design. Equally, I = -123 is the design generator (and defining relation) for
the light-shaded corners of Figure 3.4. We call I = 123 the defining relation
for the 2^(3-1) design because with it we can generate (by "multiplication") the
complete confounding pattern for the design. That is, given I = 123, we can
generate the set {1 = 23, 2 = 13, 3 = 12, I = 123}, which is the complete set of
aliases, as they are called, for this 2^(3-1) fractional factorial design. With
I = 123, we can easily generate all the columns of the half-fraction design
2^(3-1).

Principal fraction
Note: We can replace any design generator by its negative counterpart and
have an equivalent, but different, fractional design. The fraction generated
by positive design generators is sometimes called the principal fraction.

All main effects of the 2^(3-1) design confounded with two-factor interactions
The confounding pattern described by 1 = 23, 2 = 13, and 3 = 12 tells us that
all the main effects of the 2^(3-1) design are confounded with two-factor
interactions. That is the price we pay for using this fractional design. Other
fractional designs have different confounding patterns; for example, in the
typical quarter-fraction of a 2^6 design, i.e., in a 2^(6-2) design, main effects are
confounded with three-factor interactions (e.g., 5 = 123) and so on. In the
case of 5 = 123, we can also readily see that 15 = 23 (etc.), which alerts us to
the fact that certain two-factor interactions of a 2^(6-2) are confounded with
other two-factor interactions.

A useful summary diagram for a fractional factorial design
Summary: A convenient summary diagram of the discussion so far about
the 2^(3-1) design is as follows:

    FIGURE 3.5 Essential Elements of a 2^(3-1) Design

The next section will add one more item to the above box, and then we will
be able to select the right two-level fractional factorial design for a wide
range of experimental tasks.



5.3.3.4.4. Fractional factorial design specifications and design resolution

Generating relation and diagram for the 2^(8-3) fractional factorial design
We considered the 2^(3-1) design in the previous section and saw that its
generator written in "I = ..." form is {I = +123}. Next we look at a
one-eighth fraction of a 2^8 design, namely the 2^(8-3) fractional factorial
design. Using a diagram similar to Figure 3.5, we have the following:

    FIGURE 3.6 Specifications for a 2^(8-3) Design

2^(8-3) design has 32 runs
Figure 3.6 tells us that a 2^(8-3) design has 32 runs, not including
centerpoint runs, and eight factors. There are three generators since this
is a 1/8 = 2^(-3) fraction (in general, a 2^(k-p) fractional factorial needs p
generators which define the settings for p additional factor columns to
be added to the 2^(k-p) full factorial design columns - see the following
detailed description for the 2^(8-3) design).

How to Construct a Fractional Factorial Design From the Specification

Rule for constructing a fractional factorial design
In order to construct the design, we do the following:
    1. Write down a full factorial design in standard order for k-p
       factors (8-3 = 5 factors for the example above). In the
       specification above we start with a 2^5 full factorial design. Such a
       design has 2^5 = 32 rows.
    2. Add a sixth column to the design table for factor 6, using 6 = 345
       (or 6 = -345) to manufacture it (i.e., create the new column by
       multiplying the indicated old columns together).
    3. Do likewise for factor 7 and for factor 8, using the appropriate
       design generators given in Figure 3.6.
    4. The resultant design matrix gives the 32 trial runs for an 8-factor
       fractional factorial design. (When actually running the
       experiment, we would of course randomize the run order.)

Design generators
We note further that the design generators, written in `I = ...' form, for
the principal 2^(8-3) fractional factorial design are:
    { I = +3456; I = +12457; I = +12358 }.
These design generators result from multiplying the "6 = 345" generator
by "6" to obtain "I = 3456" and so on for the other two generators.

"Defining relation" for a fractional factorial design
The total collection of design generators for a factorial design, including
all new generators that can be formed as products of these generators,
is called a defining relation. There are seven "words", or strings of
numbers, in the defining relation for the 2^(8-3) design, starting with the
original three generators and adding all the new "words" that can be
formed by multiplying together any two or three of these original three
words. These seven turn out to be I = 3456 = 12457 = 12358 = 12367 =
12468 = 3478 = 5678. In general, there will be (2^p - 1) words in the
defining relation for a 2^(k-p) fractional factorial.

Definition of "Resolution"
The length of the shortest word in the defining relation is called the
resolution of the design. Resolution describes the degree to which
estimated main effects are aliased (or confounded) with estimated
two-factor interactions, three-factor interactions, etc.

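The four-step rule above is straightforward to script. The Python sketch below (ours; the
function name `fractional_factorial` is hypothetical) writes down the 2^5 full factorial
in standard order and manufactures columns 6, 7, and 8 from the generators of Figure 3.6.

    # Minimal sketch: build the 32-run 2^(8-3) design from the generators
    # 6 = 345, 7 = 1245, 8 = 1235.
    from itertools import product

    def fractional_factorial(base_factors, generators):
        """Full factorial in the first `base_factors` factors (standard order, factor 1
        varying fastest), plus one manufactured column per generator.
        `generators` maps a new factor number to the string of base columns it equals."""
        design = []
        for levels in product([-1, +1], repeat=base_factors):
            row = list(levels[::-1])                 # reverse so factor 1 varies fastest
            for parents in generators.values():
                col = 1
                for f in parents:                    # multiply the indicated old columns
                    col *= row[int(f) - 1]
                row.append(col)
            design.append(row)
        return design

    design = fractional_factorial(5, {6: "345", 7: "1245", 8: "1235"})
    print(len(design))     # 32 trial runs for the 8-factor fractional factorial
    print(design[0])       # first run; in practice the run order would be randomized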


Notation for resolution (Roman numerals)
The length of the shortest word in the defining relation for the 2^(8-3)
design is four. This is written in Roman numeral script, and subscripted,
as 2^(8-3)_IV. Note that the 2^(3-1) design has only one word, "I = 123" (or "I =
-123"), in its defining relation since there is only one design generator,
and so this fractional factorial design has resolution three; that is, we
may write 2^(3-1)_III.

Diagram for a 2^(8-3) design showing resolution
Now Figure 3.6 may be completed by writing it as:

    FIGURE 3.7 Specifications for a 2^(8-3), Showing Resolution IV

Resolution and confounding
The design resolution tells us how badly the design is confounded.
Previously, in the 2^(3-1) design, we saw that the main effects were
confounded with two-factor interactions. However, main effects were
not confounded with other main effects. So, at worst, we have 3 = 12, or
2 = 13, etc., but we do not have 1 = 2, etc. In fact, a resolution II design
would be pretty useless for any purpose whatsoever!

Similarly, in a resolution IV design, main effects are confounded with at
worst three-factor interactions. We can see, in Figure 3.7, that 6 = 345.
We also see that 36 = 45, 34 = 56, etc. (i.e., some two-factor interactions
are confounded with certain other two-factor interactions); but we
never see anything like 2 = 13, or 5 = 34 (i.e., main effects confounded
with two-factor interactions).

The complete first-order interaction confounding for the given 2^(8-3) design
The complete confounding pattern, for confounding of up to two-factor
interactions, arising from the design given in Figure 3.7 is
    34 = 56 = 78
    35 = 46
    36 = 45
    37 = 48
    38 = 47
    57 = 68
    58 = 67
All of these relations can be easily verified by multiplying the indicated
two-factor interactions by the generators. For example, to verify that
38 = 47, multiply both sides of 8 = 1235 by 3 to get 38 = 125. Then,
multiply 7 = 1245 by 4 to get 47 = 125. From that it follows that 38 = 47.

One or two factors suspected of possibly having significant first-order interactions can be assigned in such a way as to avoid having them aliased
For this fractional factorial design, 15 two-factor interactions are
aliased (confounded) in pairs or in a group of three. The remaining 28 -
15 = 13 two-factor interactions are only aliased with higher-order
interactions (which are generally assumed to be negligible). This is
verified by noting that factors "1" and "2" never appear in a length-4
word in the defining relation. So, all 13 interactions involving "1" and
"2" are clear of aliasing with any other two-factor interaction.

If one or two factors are suspected of possibly having significant
first-order interactions, they can be assigned in such a way as to avoid
having them aliased.

Higher resolution designs have less severe confounding, but require more runs
A resolution IV design is "better" than a resolution III design because
we have a less severe confounding pattern in the `IV' than in the `III'
situation; higher-order interactions are less likely to be significant than
low-order interactions.

A higher-resolution design for the same number of factors will,
however, require more runs and so it is `worse' than a lower order
design in that sense.

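The seven-word defining relation and the two-factor alias pattern listed above can be
generated mechanically with the same symmetric-difference "word algebra" used earlier.
The sketch below (our illustration) multiplies the three generator words in all
combinations and then reports the resolution and the two-factor aliases.

    # Minimal sketch: defining relation, resolution, and two-factor aliasing
    # for the 2^(8-3) design with generators I = 3456 = 12457 = 12358.
    from itertools import combinations

    def mult(a, b):                        # word multiplication: repeated factors cancel
        return ''.join(sorted(set(a) ^ set(b)))

    generators = ['3456', '12457', '12358']

    words = set()                          # the 2^3 - 1 = 7 words of the defining relation
    for r in (1, 2, 3):
        for combo in combinations(generators, r):
            w = ''
            for g in combo:
                w = mult(w, g)
            words.add(w)

    print(sorted(words, key=len))                       # seven words, shortest has length 4
    print('resolution =', min(len(w) for w in words))   # 4, i.e. resolution IV

    # Two-factor interactions aliased with other two-factor interactions:
    for pair in (''.join(p) for p in combinations('12345678', 2)):
        partners = sorted(mult(pair, w) for w in words if len(mult(pair, w)) == 2)
        if partners:
            print(pair, '=', ' = '.join(partners))      # reproduces 34 = 56 = 78, 35 = 46, ...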


Resolution V designs for 8 factors
Similarly, with a resolution V design, main effects would be
confounded with four-factor (and possibly higher-order) interactions,
and two-factor interactions would be confounded with certain
three-factor interactions. To obtain a resolution V design for 8 factors
requires more runs than the 2^(8-3) design. One option, if estimating all
main effects and two-factor interactions is a requirement, is a 64-run
2^(8-2) resolution V design. However, a 48-run alternative (John's 3/4
fractional factorial) is also available.

There are many choices of fractional factorial designs - some may have the same number of runs and resolution, but different aliasing patterns
Note: There are other 2^(8-3) fractional designs that can be derived
starting with different choices of design generators for the "6", "7" and
"8" factor columns. However, they are either equivalent (in terms of the
number of words of length four) to the fraction with
generators 6 = 345, 7 = 1245, 8 = 1235 (obtained by relabeling the
factors), or they are inferior to the fraction given because their defining
relation contains more words of length four (and therefore more
confounded two-factor interactions). For example, the design with
generators 6 = 12345, 7 = 135, and 8 = 245 has five length-four words
in the defining relation (the defining relation is I = 123456 = 1357 =
2458 = 2467 = 1368 = 123478 = 5678). As a result, this design would
confound more two-factor interactions (23 out of 28 possible two-factor
interactions are confounded, leaving only "12", "14", "23", "27" and
"34" as estimable two-factor interactions).

Diagram of an alternative way for generating the 2^(8-3) design
As an example of an equivalent "best" fractional factorial design,
obtained by "relabeling", consider the design specified in Figure 3.8.

    FIGURE 3.8 Another Way of Generating the 2^(8-3) Design

This design is equivalent to the design specified in Figure 3.7 after
relabeling the factors as follows: 1 becomes 5, 2 becomes 8, 3 becomes
1, 4 becomes 2, 5 becomes 3, 6 remains 6, 7 becomes 4 and 8 becomes
7.

Minimum aberration
A table given later in this chapter gives a collection of useful fractional
factorial designs that, for a given k and p, maximize the possible
resolution and minimize the number of short words in the defining
relation (which minimizes two-factor aliasing). The term for this is
"minimum aberration".

Design Resolution Summary

Commonly used design Resolutions
The meaning of the most prevalent resolution levels is as follows:

Resolution III Designs
Main effects are confounded (aliased) with two-factor interactions.

Resolution IV Designs
No main effects are aliased with two-factor interactions, but two-factor
interactions are aliased with each other.

Resolution V Designs
No main effect or two-factor interaction is aliased with any other main
effect or two-factor interaction, but two-factor interactions are aliased
with three-factor interactions.




5.3.3.4.5. Use of fractional factorial designs


Use low-resolution designs for screening among main effects and use higher-resolution designs when interaction effects and response surfaces need to be investigated
The basic purpose of a fractional factorial design is to
economically investigate cause-and-effect relationships of
significance in a given experimental setting. This does not differ in
essence from the purpose of any experimental design. However,
because we are able to choose fractions of a full design, and hence
be more economical, we also have to be aware that different
factorial designs serve different purposes.

Broadly speaking, with designs of resolution three, and sometimes
four, we seek to screen out the few important main effects from the
many less important others. For this reason, these designs are often
termed main effects designs, or screening designs.

On the other hand, designs of resolution five, and higher, are used
for focusing on more than just main effects in an experimental
situation. These designs allow us to estimate interaction effects and
such designs are easily augmented to complete a second-order
design - a design that permits estimation of a full second-order
(quadratic) model.

Different purposes for screening/RSM designs
Within the screening/RSM strategy of design, there are a number
of functional purposes for which designs are used. For example, an
experiment might be designed to determine how to make a product
better or a process more robust against the influence of external
and non-controllable influences such as the weather. Experiments
might be designed to troubleshoot a process, to determine
bottlenecks, or to specify which component(s) of a product are
most in need of improvement. Experiments might also be designed
to optimize yield, or to minimize defect levels, or to move a
process away from an unstable operating zone. All these aims and
purposes can be achieved using fractional factorial designs and
their appropriate design enhancements.

http://www.itl.nist.gov/div898/handbook/pri/section3/pri3344.htm (7 of 7) [11/14/2003 5:53:14 PM] http://www.itl.nist.gov/div898/handbook/pri/section3/pri3345.htm (1 of 2) [11/14/2003 5:53:14 PM]



5.3.3.4.6. Screening designs


Screening designs are an efficient way to identify significant main effects
The term `Screening Design' refers to an experimental plan that is
intended to find the few significant factors from a list of many
potential ones. Alternatively, we refer to a design as a screening
design if its primary purpose is to identify significant main effects,
rather than interaction effects, the latter being assumed an order of
magnitude less important.

Use screening designs when you have many factors to consider
Even when the experimental goal is to eventually fit a response
surface model (an RSM analysis), the first experiment should be a
screening design when there are many factors to consider.

Screening designs are usually resolution III or IV
Screening designs are typically of resolution III. The reason is that
resolution III designs permit one to explore the effects of many
factors with an efficient number of runs.

Sometimes designs of resolution IV are also used for screening
designs. In these designs, main effects are confounded with, at
worst, three-factor interactions. This is better from the confounding
viewpoint, but the designs require more runs than a resolution III
design.

Plackett-Burman designs
Another common family of screening designs is the
Plackett-Burman set of designs, so named after its inventors. These
designs are of resolution III and will be described later.

Economical plans for determining significant main effects
In short, screening designs are economical experimental plans that
focus on determining the relative significance of many main
effects.




5.3.3.4.7. Summary tables of useful fractional factorial designs

Useful fractional factorial designs for up to 11 factors are summarized here
There are very useful summaries of two-level fractional factorial designs
for up to 11 factors, originally published in the book Statistics for
Experimenters by G.E.P. Box, W.G. Hunter, and J.S. Hunter (New
York, John Wiley & Sons, 1978), and also given in the book Design and
Analysis of Experiments, 5th edition, by Douglas C. Montgomery (New
York, John Wiley & Sons, 2000).

Generator column notation can use either numbers or letters for the factor columns
They differ in the notation for the design generators. Box, Hunter, and
Hunter use numbers (as we did in our earlier discussion) and
Montgomery uses capital letters according to the following scheme:
factor 1 corresponds to A, factor 2 to B, factor 3 to C, and so on.

Notice the absence of the letter I. This is usually reserved for the
intercept column that is identically 1. As an example of the letter
notation, note that the design generator "6 = 12345" is equivalent to "F =
ABCDE".




Details of the design generators, the defining relation, the confounding structure, and the design matrix
TABLE 3.17 catalogs these useful fractional factorial designs using the
notation previously described in FIGURE 3.7.

Clicking on the specification for a given design provides details
(courtesy of Dataplot files) of the design generators, the defining
relation, the confounding structure (as far as main effects and two-factor
interactions are concerned), and the design matrix. The notation used
follows our previous labeling of factors with numbers, not letters.

Click on the design specification in the table below and a text file with details about the design can be viewed or saved
    TABLE 3.17 Summary of Useful Fractional Factorial Designs
    Number of Factors, k    Design Specification    Number of Runs N
           3                   2^(3-1)_III                  4
           4                   2^(4-1)_IV                   8
           5                   2^(5-1)_V                   16
           5                   2^(5-2)_III                  8
           6                   2^(6-1)_VI                  32
           6                   2^(6-2)_IV                  16
           6                   2^(6-3)_III                  8
           7                   2^(7-1)_VII                 64
           7                   2^(7-2)_IV                  32
           7                   2^(7-3)_IV                  16
           7                   2^(7-4)_III                  8
           8                   2^(8-1)_VIII               128
           8                   2^(8-2)_V                   64
           8                   2^(8-3)_IV                  32
           8                   2^(8-4)_IV                  16
           9                   2^(9-2)_VI                 128
           9                   2^(9-3)_IV                  64
           9                   2^(9-4)_IV                  32
           9                   2^(9-5)_III                 16
          10                   2^(10-3)_V                 128
          10                   2^(10-4)_IV                 64
          10                   2^(10-5)_IV                 32
          10                   2^(10-6)_III                16
          11                   2^(11-4)_V                 128
          11                   2^(11-5)_IV                 64
          11                   2^(11-6)_IV                 32
          11                   2^(11-7)_III                16
          15                   2^(15-11)_III               16
          31                   2^(31-26)_III               32

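For quick reference, part of Table 3.17 can be kept in code as a simple lookup. The
sketch below (ours; `CATALOG` holds only a subset of the table's entries and
`smallest_design` is a hypothetical helper) returns the smallest catalogued design
that meets a requested resolution.

    # Minimal sketch: a partial lookup of Table 3.17 entries (k, design, resolution, runs).
    CATALOG = [
        (5, "2^(5-1)", 5, 16), (5, "2^(5-2)", 3, 8),
        (6, "2^(6-1)", 6, 32), (6, "2^(6-2)", 4, 16), (6, "2^(6-3)", 3, 8),
        (7, "2^(7-2)", 4, 32), (7, "2^(7-3)", 4, 16), (7, "2^(7-4)", 3, 8),
        (8, "2^(8-2)", 5, 64), (8, "2^(8-3)", 4, 32), (8, "2^(8-4)", 4, 16),
    ]

    def smallest_design(k, min_resolution):
        candidates = [c for c in CATALOG if c[0] == k and c[2] >= min_resolution]
        return min(candidates, key=lambda c: c[3], default=None)

    print(smallest_design(8, 4))   # (8, '2^(8-4)', 4, 16): 16 runs suffice for resolution IV
    print(smallest_design(8, 5))   # (8, '2^(8-2)', 5, 64): resolution V needs 64 runs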


5.3.3.5. Plackett-Burman designs

Saturated Main Effect designs
PB designs also exist for 20-run, 24-run, and 28-run (and higher) designs. With a 20-run
design you can run a screening experiment for up to 19 factors, up to 23 factors in a
24-run design, and up to 27 factors in a 28-run design. These Resolution III designs are
known as Saturated Main Effect designs because all degrees of freedom are utilized to
estimate main effects. The designs for 20 and 24 runs are shown below.
20-Run Plackett-Burman design
    TABLE 3.19 A 20-Run Plackett-Burman Design
    X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19
    1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1
2 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1
Plackett- In 1946, R.L. Plackett and J.P. Burman published their now famous paper "The Design
3 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1
Burman of Optimal Multifactorial Experiments" in Biometrika (vol. 33). This paper described
designs the construction of very economical designs with the run number a multiple of four 4 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1
(rather than a power of 2). Plackett-Burman designs are very efficient screening designs 5 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1
when only main effects are of interest. 6 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1
7 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1
These Plackett-Burman (PB) designs are used for screening experiments because, in a PB 8 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1
designs design, main effects are, in general, heavily confounded with two-factor interactions.
9 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1
have run The PB design in 12 runs, for example, may be used for an experiment containing up to
numbers 11 factors. 10 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1
that are a 11 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1 +1
multiple of 12 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1 -1
4 13 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1 +1
14 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1 +1
12-Run TABLE 3.18 Plackett-Burman Design in 12 Runs for up to 11 Factors
15 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1 +1
Plackett- Pattern X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11
Burnam 16 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1 +1
1 +++++++++++ +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 +1 17 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1 -1
design
2 -+-+++---+- -1 +1 -1 +1 +1 +1 -1 -1 -1 +1 -1 18 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1 -1
3 --+-+++---+ -1 -1 +1 -1 +1 +1 +1 -1 -1 -1 +1
19 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1 +1
4 +--+-+++--- +1 -1 -1 +1 -1 +1 +1 +1 -1 -1 -1
20 +1 -1 -1 +1 +1 +1 +1 -1 +1 -1 +1 -1 -1 -1 -1 +1 +1 -1 -1
5 -+--+-+++-- -1 +1 -1 -1 +1 -1 +1 +1 +1 -1 -1
6 --+--+-+++- -1 -1 +1 -1 -1 +1 -1 +1 +1 +1 -1 24-Run TABLE 3.20 A 24-Run Plackett-Burman Design
7 ---+--+-+++ -1 -1 -1 +1 -1 -1 +1 -1 +1 +1 +1 Plackett- X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12 X13 X14 X15 X16 X17 X18 X19 X20 X21 X22 X23
8 +---+--+-++ +1 -1 -1 -1 +1 -1 -1 +1 -1 +1 +1 Burnam 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

9 ++---+--+-+ +1 +1 -1 -1 -1 +1 -1 -1 +1 -1 +1 design 2 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1
3 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1
10 +++---+--+- +1 +1 +1 -1 -1 -1 +1 -1 -1 +1 -1 4 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1
11 -+++---+--+ -1 +1 +1 +1 -1 -1 -1 +1 -1 -1 +1 5 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1
6 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1
12 +-+++---+-- +1 -1 +1 +1 +1 -1 -1 -1 +1 -1 -1
7 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1
8 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1
9 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1
10 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1
11 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1
12 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1 1
13 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1 -1




14 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1 -1
15 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1 1
16 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1 1
17 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1 -1
18 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1 1
19 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1 -1
20 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1 1
21 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1 1
22 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1 1
23 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1 1
24 1 1 1 1 -1 1 -1 1 1 -1 -1 1 1 -1 -1 1 -1 1 -1 -1 -1 -1 -1
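The 12-run design in Table 3.18 has a simple cyclic structure: row 1 is all +1, and each
later row is the previous row's +/- pattern shifted one position to the right. The Python
sketch below (ours, not handbook code, and not a general Plackett-Burman algorithm)
regenerates that table from this observation.

    # Minimal sketch: regenerate the 12-run Plackett-Burman design of Table 3.18.
    # Row 1 is all +1; rows 2-12 are successive right cyclic shifts of one pattern.
    def pb12():
        pattern = "-+-+++---+-"                 # the +/- pattern of row 2 in Table 3.18
        rows = [[+1] * 11]                      # row 1: all factors at +1
        p = pattern
        for _ in range(11):
            rows.append([+1 if c == "+" else -1 for c in p])
            p = p[-1] + p[:-1]                  # cyclic shift right by one position
        return rows

    design = pb12()
    print(len(design), len(design[0]))          # 12 runs, 11 factor columns
    print(design[1])                            # row 2: -1 +1 -1 +1 +1 +1 -1 -1 -1 +1 -1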

No defining relation
These designs do not have a defining relation since interactions are not identically equal
to main effects. With the 2^(k-p)_III designs, a main effect column Xi is either orthogonal to
XiXj or identical to plus or minus XiXj. For Plackett-Burman designs, the two-factor
interaction column XiXj is correlated with every Xk (for k not equal to i or j).

Economical for detecting large main effects
However, these designs are very useful for economically detecting large main effects,
assuming all interactions are negligible when compared with the few important main
effects.

5.3.3.6. Response surface designs

Response surface models may involve just main effects and interactions or they may also have quadratic and possibly cubic terms to account for curvature
Earlier, we described the response surface method (RSM) objective. Under
some circumstances, a model involving only main effects and interactions
may be appropriate to describe a response surface when
    1. Analysis of the results revealed no evidence of "pure quadratic"
       curvature in the response of interest (i.e., the response at the center
       approximately equals the average of the responses at the factorial
       runs).
    2. The design matrix originally used included the limits of the factor
       settings available to run the process.

Equations for quadratic and cubic models
In other circumstances, a complete description of the process behavior might
require a quadratic or cubic model. Written out for three factors X1, X2, X3,
the full models are:

    Quadratic:
        y = b0 + b1*X1 + b2*X2 + b3*X3
              + b12*X1*X2 + b13*X1*X3 + b23*X2*X3
              + b11*X1^2 + b22*X2^2 + b33*X3^2

    Cubic:
        the quadratic model above plus the third-order terms
        b123*X1*X2*X3,
        b112*X1^2*X2, b113*X1^2*X3, b122*X1*X2^2, b133*X1*X3^2,
        b223*X2^2*X3, b233*X2*X3^2,
        b111*X1^3, b222*X2^3, b333*X3^3

These are the full models, with all possible terms; rarely would all of the
terms be needed in an application.




Quadratic models almost always sufficient for industrial applications
If the experimenter has defined factor limits appropriately and/or taken
advantage of all the tools available in multiple regression analysis
(transformations of responses and factors, for example), then finding an
industrial process that requires a third-order model is highly unusual.
Therefore, we will only focus on designs that are useful for fitting quadratic
models. As we will see, these designs often provide lack of fit detection that
will help determine when a higher-order model is needed.

General quadratic surface types
Figures 3.9 to 3.12 identify the general quadratic surface types that an
investigator might encounter.

    FIGURE 3.9 A Response Surface "Peak"
    FIGURE 3.10 A Response Surface "Hillside"
    FIGURE 3.11 A Response Surface "Rising Ridge"
    FIGURE 3.12 A Response Surface "Saddle"

Possible behaviors of responses as functions of factor settings
Figures 3.13 through 3.15 illustrate possible behaviors of responses as
functions of factor settings. In each case, assume the value of the response
increases from the bottom of the figure to the top and that the factor settings
increase from left to right.

    FIGURE 3.13 Linear Function
    FIGURE 3.14 Quadratic Function
    FIGURE 3.15 Cubic Function

A two-level experiment with center points can detect, but not fit, quadratic effects
If a response behaves as in Figure 3.13, the design matrix to quantify that
behavior need only contain factors with two levels -- low and high. This
model is a basic assumption of simple two-level factorial and fractional
factorial designs. If a response behaves as in Figure 3.14, the minimum
number of levels required for a factor to quantify that behavior is three. One
might logically assume that adding center points to a two-level design would
satisfy that requirement, but the arrangement of the treatments in such a
matrix confounds all quadratic effects with each other. While a two-level
design with center points cannot estimate individual pure quadratic effects, it
can detect them effectively.

Three-level factorial design
A solution to creating a design matrix that permits the estimation of simple
curvature as shown in Figure 3.14 would be to use a three-level factorial
design. Table 3.21 explores that possibility.

Four-level factorial design
Finally, in more complex cases such as illustrated in Figure 3.15, the design
matrix must contain at least four levels of each factor to characterize the
behavior of the response adequately.

Factor Levels for Higher-Order Designs




3-level factorial designs can fit quadratic models but they require many runs when there are more than 4 factors
    TABLE 3.21 Three-level Factorial Designs
    Number of Factors    Treatment Combinations (3^k Factorial)    Number of Coefficients (Quadratic Empirical Model)
           2                         9                                      6
           3                        27                                     10
           4                        81                                     15
           5                       243                                     21
           6                       729                                     28

Fractional factorial designs created to avoid such a large number of runs
Two-level factorial designs quickly become too large for practical application
as the number of factors investigated increases. This problem was the
motivation for creating `fractional factorial' designs. Table 3.21 shows that
the number of runs required for a 3^k factorial becomes unacceptable even
more quickly than for 2^k designs. The last column in Table 3.21 shows the
number of terms present in a quadratic model for each case.

Number of runs large even for modest number of factors
With only a modest number of factors, the number of runs is very large, even
an order of magnitude greater than the number of parameters to be estimated
when k isn't small. For example, the absolute minimum number of runs
required to estimate all the terms present in a four-factor quadratic model is
15: the intercept term, 4 main effects, 6 two-factor interactions, and 4
quadratic terms.

The corresponding 3^k design for k = 4 requires 81 runs.

Complex alias structure and lack of rotatability for 3-level fractional factorial designs
Considering a fractional factorial at three levels is a logical step, given the
success of fractional designs when applied to two-level designs.
Unfortunately, the alias structure for the three-level fractional factorial
designs is considerably more complex and harder to define than in the
two-level case.

Additionally, the three-level factorial designs suffer a major flaw in their lack
of `rotatability.'

Rotatability of Designs

"Rotatability" is a desirable property not present in 3-level factorial designs
In a rotatable design, the variance of the predicted values of y is a function of
the distance of a point from the center of the design and is not a function of
the direction the point lies from the center. Before a study begins, little or no
knowledge may exist about the region that contains the optimum response.
Therefore, the experimental design matrix should not bias an investigation in
any direction.

Contours of variance of predicted values are concentric circles
In a rotatable design, the contours associated with the variance of the
predicted values are concentric circles. Figures 3.16 and 3.17 (adapted from
Box and Draper, `Empirical Model Building and Response Surfaces,' page
485) illustrate a three-dimensional plot and contour plot, respectively, of the
`information function' associated with a 3^2 design.

Information function
The information function is
    I(x) = 1 / V(x)
with V denoting the variance of the predicted value y-hat(x).

Each figure clearly shows that the information content of the design is not
only a function of the distance from the center of the design space, but also a
function of direction.

Graphs of the information function for a rotatable quadratic design
Figures 3.18 and 3.19 are the corresponding graphs of the information
function for a rotatable quadratic design. In each of these figures, the value of
the information function depends only on the distance of a point from the
center of the space.




FIGURE 3.16 Three-Dimensional Illustration for the Information Function of a 3^2 Design
FIGURE 3.17 Contour Map of the Information Function for a 3^2 Design

FIGURE 3.18 Three-Dimensional Illustration of the Information Function for a
Rotatable Quadratic Design for Two Factors
FIGURE 3.19 Contour Map of the Information Function for a Rotatable Quadratic
Design for Two Factors

Classical Quadratic Designs

Central composite and Box-Behnken designs
Introduced during the 1950's, classical quadratic designs fall into two broad
categories: Box-Wilson central composite designs and Box-Behnken designs.
The next sections describe these design classes and their properties.

5.3.3.6.1. Central Composite Designs (CCD)

Box-Wilson Central Composite Designs

CCD designs start with a factorial or fractional factorial design (with center points) and add "star" points to estimate curvature
A Box-Wilson Central Composite Design, commonly called `a central
composite design,' contains an imbedded factorial or fractional
factorial design with center points that is augmented with a group of
`star points' that allow estimation of curvature. If the distance from the
center of the design space to a factorial point is ±1 unit for each factor,
the distance from the center of the design space to a star point is ±α
with |α| > 1. The precise value of α depends on certain properties
desired for the design and on the number of factors involved.

Similarly, the number of centerpoint runs the design is to contain also
depends on certain properties required for the design.

Diagram of central composite design generation for two factors
    FIGURE 3.20 Generation of a Central Composite Design for Two Factors




In this design the star points are


A CCD design A central composite design always contains twice as many star points at the center of each face of the
with k factors as there are factors in the design. The star points represent new factorial space, so = ± 1. This
has 2k star extreme values (low and high) for each factor in the design. Table 3.22 variety requires 3 levels of each
Face Centered CCF
points summarizes the properties of the three varieties of central composite factor. Augmenting an existing
designs. Figure 3.21 illustrates the relationships among these varieties. factorial or resolution V design
with appropriate star points can
Description of TABLE 3.22 Central Composite Designs also produce this design.
3 types of Central Composite
CCD designs, Terminology Comments Pictorial
Design Type
which depend representation
CCC designs are the original
on where the of where the
form of the central composite
star points star points
design. The star points are at
are placed are placed for
some distance from the center
based on the properties desired the 3 types of
for the design and the number of CCD designs
factors in the design. The star
points establish new extremes for
the low and high settings for all
Circumscribed CCC
factors. Figure 5 illustrates a
CCC design. These designs have
circular, spherical, or
hyperspherical symmetry and
require 5 levels for each factor.
Augmenting an existing factorial
or resolution V fractional
factorial design with star points
can produce this design.
For those situations in which the
limits specified for factor settings
are truly limits, the CCI design
uses the factor settings as the star
points and creates a factorial or
fractional factorial design within
Inscribed CCI those limits (in other words, a
CCI design is a scaled down
CCC design with each factor
level of the CCC design divided
by to generate the CCI design).
FIGURE 3.21 Comparison of the Three Types of Central
This design also requires 5 levels
Composite Designs
of each factor.




Comparison of the 3 central composite designs
The diagrams in Figure 3.21 illustrate the three types of central
composite designs for two factors. Note that the CCC explores the
largest process space and the CCI explores the smallest process space.
Both the CCC and CCI are rotatable designs, but the CCF is not. In the
CCC design, the design points describe a circle circumscribed about
the factorial square. For three factors, the CCC design points describe
a sphere around the factorial cube.

Determining α in Central Composite Designs

The value of α is chosen to maintain rotatability
To maintain rotatability, the value of α depends on the number of
experimental runs in the factorial portion of the central composite
design:
    α = [number of factorial runs]^(1/4)
If the factorial is a full factorial, then
    α = [2^k]^(1/4)
However, the factorial portion can also be a fractional factorial design
of resolution V.

Table 3.23 illustrates some typical values of α as a function of the
number of factors.

Values of α depending on the number of factors in the factorial part of the design
    TABLE 3.23 Determining α for Rotatability
    Number of Factors    Factorial Portion    Scaled Value for α Relative to ±1
         2                   2^2                  2^(2/4) = 1.414
         3                   2^3                  2^(3/4) = 1.682
         4                   2^4                  2^(4/4) = 2.000
         5                   2^(5-1)              2^(4/4) = 2.000
         5                   2^5                  2^(5/4) = 2.378
         6                   2^(6-1)              2^(5/4) = 2.378
         6                   2^6                  2^(6/4) = 2.828

Orthogonal blocking
The value of α also depends on whether or not the design is
orthogonally blocked. That is, the question is whether or not the
design is divided into blocks such that the block effects do not affect
the estimates of the coefficients in the 2nd order model.

Example of both rotatability and orthogonal blocking for two factors
Under some circumstances, the value of α allows simultaneous
rotatability and orthogonality. One such example for k = 2 is shown
below:

    BLOCK    X1        X2
      1      -1        -1
      1       1        -1
      1      -1         1
      1       1         1
      1       0         0
      1       0         0
      2      -1.414     0
      2       1.414     0
      2       0        -1.414
      2       0         1.414
      2       0         0
      2       0         0

Additional central composite designs
Examples of other central composite designs will be given after
Box-Behnken designs are described.
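A small Python sketch (ours; `ccd_alpha` and `ccc_points` are hypothetical names)
reproduces the α values of Table 3.23 and lists the factorial, star, and center points of a
circumscribed (CCC) design with a full factorial portion.

    # Minimal sketch: rotatable alpha and the points of a circumscribed central
    # composite design (CCC).
    from itertools import product

    def ccd_alpha(n_factorial_runs):
        """alpha = (number of factorial runs)^(1/4), as in Table 3.23."""
        return n_factorial_runs ** 0.25

    def ccc_points(k, n_center=1):
        alpha = ccd_alpha(2 ** k)                 # assumes a full factorial portion
        factorial = [list(p) for p in product([-1, +1], repeat=k)]
        star = []
        for i in range(k):                        # 2k star points, one pair per factor
            for s in (-alpha, +alpha):
                pt = [0.0] * k
                pt[i] = s
                star.append(pt)
        center = [[0.0] * k for _ in range(n_center)]
        return factorial + star + center

    print(round(ccd_alpha(2 ** 2), 3))        # 1.414 for two factors
    print(round(ccd_alpha(2 ** 3), 3))        # 1.682 for three factors
    print(len(ccc_points(2, n_center=5)))     # 4 factorial + 4 star + 5 center = 13 runs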




Geometry of the design
The geometry of this design suggests a sphere within the process space
such that the surface of the sphere protrudes through each face with the
surface of the sphere tangential to the midpoint of each edge of the
space.

Examples of Box-Behnken designs are given on the next page.

5.3.3.6.2. Box-Behnken designs


An alternate choice for fitting quadratic models that requires 3 levels of each factor and is rotatable (or "nearly" rotatable)
The Box-Behnken design is an independent quadratic design in that it
does not contain an embedded factorial or fractional factorial design. In
this design the treatment combinations are at the midpoints of edges of
the process space and at the center. These designs are rotatable (or near
rotatable) and require 3 levels of each factor. The designs have limited
capability for orthogonal blocking compared to the central composite
designs.

Figure 3.22 illustrates a Box-Behnken design for three factors.

Box-Behnken design for 3 factors
    FIGURE 3.22 A Box-Behnken Design for Three Factors
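For three factors, the edge-midpoint description above translates directly into code. The
sketch below (our illustration; `box_behnken_3` is a hypothetical helper, and the three
center points correspond to the 15-run version shown later in Tables 3.24 and 3.28) lists
the design points.

    # Minimal sketch: the 3-factor Box-Behnken design -- treatment combinations at the
    # midpoints of the cube's edges (each pair of factors at +/-1, the third at 0)
    # plus center points.  With 3 center points this gives 15 runs.
    from itertools import combinations, product

    def box_behnken_3(n_center=3):
        runs = []
        for i, j in combinations(range(3), 2):        # factor pairs: (0,1), (0,2), (1,2)
            for a, b in product([-1, +1], repeat=2):
                pt = [0, 0, 0]
                pt[i], pt[j] = a, b
                runs.append(pt)
        runs += [[0, 0, 0]] * n_center
        return runs

    design = box_behnken_3()
    print(len(design))     # 12 edge midpoints + 3 center points = 15 runs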




1 0 0 -1.682 1 0 0 -1 3 0 0 0
1 0 0 1.682 1 0 0 +1
6 0 0 0 6 0 0 0
Total Runs = 20 Total Runs = 20 Total Runs = 15
5.3.3. How do you select an experimental design? Factor Table 3.25 illustrates the factor settings required for a central composite
5.3.3.6. Response surface designs settings for circumscribed (CCC) design and for a central composite inscribed (CCI) design
CCC and (standard order), assuming three factors, each with low and high settings of 10
CCI three and 20, respectively. Because the CCC design generates new extremes for all
5.3.3.6.3. Comparisons of response surface factor factors, the investigator must inspect any worksheet generated for such a design
designs to make certain that the factor settings called for are reasonable.
designs
In Table 3.25, treatments 1 to 8 in each case are the factorial points in the design;
treatments 9 to 14 are the star points; and 15 to 20 are the system-recommended
Choosing a Response Surface Design
center points. Notice in the CCC design how the low and high values of each
factor have been extended to create the star points. In the CCI design, the
Various Table 3.24 contrasts the structures of four common quadratic designs one might specified low and high values become the star points, and the system computes
CCD designs use when investigating three factors. The table combines CCC and CCI designs appropriate settings for the factorial part of the design inside those boundaries.
and because they are structurally identical.
Box-Behnken TABLE 3.25 Factor Settings for CCC and CCI Designs for Three
designs are For three factors, the Box-Behnken design offers some advantage in requiring a Factors
compared fewer number of runs. For 4 or more factors, this advantage disappears.
Central Composite Central Composite
and their Circumscribed CCC Inscribed CCI
properties Sequence Sequence
discussed Number X1 X2 X3 Number X1 X2 X3
1 10 10 10 1 12 12 12
Structural TABLE 3.24 Structural Comparisons of CCC (CCI), CCF, and
comparisons Box-Behnken Designs for Three Factors 2 20 10 10 2 18 12 12
of CCC 3 10 20 10 3 12 18 12
CCC (CCI) CCF Box-Behnken
(CCI), CCF, 4 20 20 10 4 18 18 12
Rep X1 X2 X3 Rep X1 X2 X3 Rep X1 X2 X3
and 5 10 10 20 5 12 12 18
Box-Behnken 1 -1 -1 -1 1 -1 -1 -1 1 -1 -1 0
6 20 10 20 6 18 12 18
designs for 1 +1 -1 -1 1 +1 -1 -1 1 +1 -1 0
three factors 7 10 20 20 7 12 12 18
1 -1 +1 -1 1 -1 +1 -1 1 -1 +1 0
8 20 20 20 8 18 18 18
1 +1 +1 -1 1 +1 +1 -1 1 +1 +1 0
1 -1 -1 +1 1 -1 -1 +1 1 -1 0 -1 9 6.6 15 15 * 9 10 15 15
10 23.4 15 15 * 10 20 15 15
1 +1 -1 +1 1 +1 -1 +1 1 +1 0 -1
11 15 6.6 15 * 11 15 10 15
1 -1 +1 +1 1 -1 +1 +1 1 -1 0 +1
12 15 23.4 15 * 12 15 20 15
1 +1 +1 +1 1 +1 +1 +1 1 +1 0 +1
13 15 15 6.6 * 13 15 15 10
1 -1.682 0 0 1 -1 0 0 1 0 -1 -1
14 15 15 23.4 * 14 15 15 20
1 1.682 0 0 1 +1 0 0 1 0 +1 -1
15 15 15 15 15 15 15 15
1 0 -1.682 0 1 0 -1 0 1 0 -1 +1
16 15 15 15 16 15 15 15
1 0 1.682 0 1 0 +1 0 1 0 +1 +1
17 15 15 15 17 15 15 15




18 15 15 15 18 15 15 15
19 15 15 15 19 15 15 15 Properties of Table 3.27 summarizes properties of the classical quadratic designs. Use this table
20 15 15 15 20 15 15 15 classical for broad guidelines when attempting to choose from among available designs.
response
surface TABLE 3.27 Summary of Properties of Classical Response Surface Designs
* are star points designs Design Type Comment
CCC designs provide high quality predictions over the entire
Factor Table 3.26 illustrates the factor settings for the corresponding central composite design space, but require factor settings outside the range of the
settings for face-centered (CCF) and Box-Behnken designs. Note that each of these designs factors in the factorial part. Note: When the possibility of running
CCF and provides three levels for each factor and that the Box-Behnken design requires a CCC design is recognized before starting a factorial experiment,
CCC
Box-Behnken fewer runs in the three-factor case. factor spacings can be reduced to ensure that ± for each coded
three factor factor corresponds to feasible (reasonable) levels.
designs TABLE 3.26 Factor Settings for CCF and Box-Behnken Designs for
Three Factors Requires 5 levels for each factor.
Central Composite Box-Behnken CCI designs use only points within the factor ranges originally
Face-Centered CCC specified, but do not provide the same high quality prediction
Sequence Sequence CCI over the entire space compared to the CCC.
Number X1 X2 X3 Number X1 X2 X3
Requires 5 levels of each factor.
1 10 10 10 1 10 10 10
CCF designs provide relatively high quality predictions over the
2 20 10 10 2 20 10 15 entire design space and do not require using points outside the
3 10 20 10 3 10 20 15 CCF original factor range. However, they give poor precision for
4 20 20 10 4 20 20 15 estimating pure quadratic coefficients.
5 10 10 20 5 10 15 10 Requires 3 levels for each factor.
6 20 10 20 6 20 15 10 These designs require fewer treatment combinations than a
7 10 20 20 7 10 15 20 central composite design in cases involving 3 or 4 factors.
8 20 20 20 8 20 15 20
The Box-Behnken design is rotatable (or nearly so) but it contains
9 10 15 15 * 9 15 10 10
regions of poor prediction quality like the CCI. Its "missing
10 20 15 15 * 10 15 20 10 Box-Behnken
corners" may be useful when the experimenter should avoid
11 15 10 15 * 11 15 10 20 combined factor extremes. This property prevents a potential loss
12 15 20 15 * 12 15 20 20 of data in those cases.
13 15 15 10 * 13 15 15 15 Requires 3 levels for each factor.
14 15 15 20 * 14 15 15 15
15 15 15 15 15 15 15 15
16 15 15 15
17 15 15 15
18 15 15 15
19 15 15 15
20 15 15 15

* are star points for the CCC




Number of runs required by central composite and Box-Behnken designs

Table 3.28 compares the number of runs required for a given number of factors
for various Central Composite and Box-Behnken designs.

TABLE 3.28 Number of Runs Required by Central Composite and Box-Behnken Designs

Number of Factors    Central Composite                                   Box-Behnken
        2            13 (5 center points)                                     -
        3            20 (6 centerpoint runs)                                 15
        4            30 (6 centerpoint runs)                                 27
        5            33 (fractional factorial) or 52 (full factorial)        46
        6            54 (fractional factorial) or 91 (full factorial)        54
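The central composite counts in Table 3.28 are the factorial (or fractional
factorial) core, plus 2k star points, plus the stated number of center runs. A
minimal sketch of that arithmetic in Python; the helper ccd_runs is ours, and the
center-point counts are taken from the table rather than derived:

def ccd_runs(k, n_center, p=0):
    """Runs in a central composite design: 2^(k-p) core + 2k star points + centers."""
    return 2 ** (k - p) + 2 * k + n_center

print(ccd_runs(2, n_center=5))       # 13, as in Table 3.28
print(ccd_runs(3, n_center=6))       # 20
print(ccd_runs(4, n_center=6))       # 30
print(ccd_runs(5, n_center=7, p=1))  # 33 -- the 5-factor fractional entry implies 7 center runs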

Desirable Features for Response Surface Designs

A summary of desirable properties for response surface designs

G. E. P. Box and N. R. Draper in "Empirical Model Building and Response
Surfaces," John Wiley and Sons, New York, 1987, page 477, identify desirable
properties for a response surface design:
   ● Satisfactory distribution of information across the experimental region.
     - rotatability
   ● Fitted values are as close as possible to observed values.
     - minimize residuals or error of prediction
   ● Good lack of fit detection.
   ● Internal estimate of error.
   ● Constant variance check.
   ● Transformations can be estimated.
   ● Suitability for blocking.
   ● Sequential construction of higher order designs from simpler designs.
   ● Minimum number of treatment combinations.
   ● Good graphical analysis through simple data patterns.
   ● Good behavior when errors in settings of input variables occur.


5. Process Improvement
5.3. Choosing an experimental design
5.3.3. How do you select an experimental design?
5.3.3.6. Response surface designs

5.3.3.6.4. Blocking a response surface design

How can we block a response surface design?

When augmenting a resolution V design to a CCC design by adding star points, it
may be desirable to block the design

If an investigator has run either a 2^k full factorial or a 2^(k-p) fractional
factorial design of at least resolution V, augmentation of that design to a
central composite design (either CCC or CCF) is easily accomplished by adding an
additional set (block) of star and centerpoint runs. If the factorial experiment
indicated (via the t test) curvature, this composite augmentation is the best
follow-up option (follow-up options for other situations will be discussed later).

An orthogonal blocked response surface design has advantages

An important point to take into account when choosing a response surface design
is the possibility of running the design in blocks. Blocked designs are better
designs if the design allows the estimation of individual and interaction factor
effects independently of the block effects. This condition is called orthogonal
blocking. Blocks are assumed to have no impact on the nature and shape of the
response surface.

CCF designs cannot be orthogonally blocked

The CCF design does not allow orthogonal blocking and the Box-Behnken designs
offer blocking only in limited circumstances, whereas the CCC does permit
orthogonal blocking.
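One way to see what orthogonal blocking buys you: for each factor column and each
two-factor interaction column, the centered block contrast has zero dot product
with that column, so block effects cannot bias those factor estimates. A minimal
sketch of that check in Python, using the two-factor blocked central composite
design tabulated below (Table 3.29); the variable names are ours:

import numpy as np

# Illustrative check of orthogonal blocking for the two-factor CCD of Table 3.29:
# block 1 = factorial points + 3 centers, block 2 = axial points (+/-1.414214) + 3 centers.
a = 1.414214
block1 = [(-1, -1), (-1, +1), (+1, -1), (+1, +1), (0, 0), (0, 0), (0, 0)]
block2 = [(-a, 0), (+a, 0), (0, -a), (0, +a), (0, 0), (0, 0), (0, 0)]

X = np.array(block1 + block2, dtype=float)
z = np.array([-1.0] * len(block1) + [+1.0] * len(block2))
z = z - z.mean()                       # centered block contrast

columns = {"X1": X[:, 0], "X2": X[:, 1], "X1*X2": X[:, 0] * X[:, 1]}
for name, col in columns.items():
    print(name, "orthogonal to blocks:", bool(np.isclose(z @ col, 0.0)))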




Axial and factorial blocks

In general, when two blocks are required there should be an axial block and a
factorial block. For three blocks, the factorial block is divided into two blocks
and the axial block is not split. The blocking of the factorial design points
should result in orthogonality between blocks and individual factors and between
blocks and the two factor interactions.

The following Central Composite design in two factors is broken into two blocks.

Table of CCD design with 2 factors and 2 blocks

TABLE 3.29 CCD: 2 Factors, 2 Blocks
Pattern   Block      X1           X2           Comment
  --       +1        -1           -1           Full Factorial
  -+       +1        -1           +1           Full Factorial
  +-       +1        +1           -1           Full Factorial
  ++       +1        +1           +1           Full Factorial
  00       +1         0            0           Center-Full Factorial
  00       +1         0            0           Center-Full Factorial
  00       +1         0            0           Center-Full Factorial
  -0       +2        -1.414214     0           Axial
  +0       +2        +1.414214     0           Axial
  0-       +2         0           -1.414214    Axial
  0+       +2         0           +1.414214    Axial
  00       +2         0            0           Center-Axial
  00       +2         0            0           Center-Axial
  00       +2         0            0           Center-Axial

Note that the first block includes the full factorial points and three centerpoint
replicates. The second block includes the axial points and another three
centerpoint replicates. Naturally these two blocks should be run as two separate
random sequences.

The following three examples show blocking structure for various designs.

Table of CCD design with 3 factors and 3 blocks

TABLE 3.30 CCD: 3 Factors 3 Blocks, Sorted by Block
Pattern   Block      X1           X2           X3           Comment
  ---       1        -1           -1           -1           Full Factorial
  -++       1        -1           +1           +1           Full Factorial
  +-+       1        +1           -1           +1           Full Factorial
  ++-       1        +1           +1           -1           Full Factorial
  000       1         0            0            0           Center-Full Factorial
  000       1         0            0            0           Center-Full Factorial
  --+       2        -1           -1           +1           Full Factorial
  -+-       2        -1           +1           -1           Full Factorial
  +--       2        +1           -1           -1           Full Factorial
  +++       2        +1           +1           +1           Full Factorial
  000       2         0            0            0           Center-Full Factorial
  000       2         0            0            0           Center-Full Factorial
  -00       3        -1.681793     0            0           Axial
  +00       3        +1.681793     0            0           Axial
  0-0       3         0           -1.681793     0           Axial
  0+0       3         0           +1.681793     0           Axial
  00-       3         0            0           -1.681793    Axial
  00+       3         0            0           +1.681793    Axial
  000       3         0            0            0           Axial
  000       3         0            0            0           Axial

Table of CCD design with 4 factors and 3 blocks

TABLE 3.31 CCD: 4 Factors, 3 Blocks
Pattern   Block   X1   X2   X3   X4   Comment
  ---+      1     -1   -1   -1   +1   Full Factorial
  --+-      1     -1   -1   +1   -1   Full Factorial
  -+--      1     -1   +1   -1   -1   Full Factorial
  -+++      1     -1   +1   +1   +1   Full Factorial
  +---      1     +1   -1   -1   -1   Full Factorial
  +-++      1     +1   -1   +1   +1   Full Factorial
  ++-+      1     +1   +1   -1   +1   Full Factorial
  +++-      1     +1   +1   +1   -1   Full Factorial
  0000      1      0    0    0    0   Center-Full Factorial
  0000      1      0    0    0    0   Center-Full Factorial
  ----      2     -1   -1   -1   -1   Full Factorial
  --++      2     -1   -1   +1   +1   Full Factorial
  -+-+      2     -1   +1   -1   +1   Full Factorial
  -++-      2     -1   +1   +1   -1   Full Factorial
  +--+      2     +1   -1   -1   +1   Full Factorial
  +-+-      2     +1   -1   +1   -1   Full Factorial
  ++--      2     +1   +1   -1   -1   Full Factorial
  ++++      2     +1   +1   +1   +1   Full Factorial
  0000      2      0    0    0    0   Center-Full Factorial
  0000      2      0    0    0    0   Center-Full Factorial
  -000      3     -2    0    0    0   Axial
  +000      3     +2    0    0    0   Axial
  +000      3     +2    0    0    0   Axial
  0-00      3      0   -2    0    0   Axial
  0+00      3      0   +2    0    0   Axial
  00-0      3      0    0   -2    0   Axial




  00+0      3      0    0   +2    0   Axial
  000-      3      0    0    0   -2   Axial
  000+      3      0    0    0   +2   Axial
  0000      3      0    0    0    0   Center-Axial

Table of CCD design with 5 factors and 2 blocks

TABLE 3.32 CCD: 5 Factors, 2 Blocks
Pattern   Block   X1   X2   X3   X4   X5   Comment
  ----+     1     -1   -1   -1   -1   +1   Fractional Factorial
  ---+-     1     -1   -1   -1   +1   -1   Fractional Factorial
  --+--     1     -1   -1   +1   -1   -1   Fractional Factorial
  --+++     1     -1   -1   +1   +1   +1   Fractional Factorial
  -+---     1     -1   +1   -1   -1   -1   Fractional Factorial
  -+-++     1     -1   +1   -1   +1   +1   Fractional Factorial
  -++-+     1     -1   +1   +1   -1   +1   Fractional Factorial
  -+++-     1     -1   +1   +1   +1   -1   Fractional Factorial
  +----     1     +1   -1   -1   -1   -1   Fractional Factorial
  +--++     1     +1   -1   -1   +1   +1   Fractional Factorial
  +-+-+     1     +1   -1   +1   -1   +1   Fractional Factorial
  +-++-     1     +1   -1   +1   +1   -1   Fractional Factorial
  ++--+     1     +1   +1   -1   -1   +1   Fractional Factorial
  ++-+-     1     +1   +1   -1   +1   -1   Fractional Factorial
  +++--     1     +1   +1   +1   -1   -1   Fractional Factorial
  +++++     1     +1   +1   +1   +1   +1   Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  00000     1      0    0    0    0    0   Center-Fractional Factorial
  -0000     2     -2    0    0    0    0   Axial
  +0000     2     +2    0    0    0    0   Axial
  0-000     2      0   -2    0    0    0   Axial
  0+000     2      0   +2    0    0    0   Axial
  00-00     2      0    0   -2    0    0   Axial
  00+00     2      0    0   +2    0    0   Axial
  000-0     2      0    0    0   -2    0   Axial
  000+0     2      0    0    0   +2    0   Axial
  0000-     2      0    0    0    0   -2   Axial
  0000+     2      0    0    0    0   +2   Axial
  00000     2      0    0    0    0    0   Center-Axial
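The three-factor, three-block design of Table 3.30 can be generated directly:
split the 2^3 factorial into halves on the sign of X1*X2*X3, give each half its
own center points, and put the star points (with their center points) in a third
block. A minimal sketch in Python; the variable names are ours:

import numpy as np
from itertools import product

# Illustrative construction of the blocking structure of Table 3.30.
alpha = 1.681793
corners = np.array(list(product([-1, 1], repeat=3)), dtype=float)
signs = corners.prod(axis=1)                                 # sign of X1*X2*X3 per corner

block1 = np.vstack([corners[signs < 0], np.zeros((2, 3))])   # 4 corners + 2 centers
block2 = np.vstack([corners[signs > 0], np.zeros((2, 3))])   # 4 corners + 2 centers
star = np.vstack([-alpha * np.eye(3), alpha * np.eye(3)])    # 6 axial points
block3 = np.vstack([star, np.zeros((2, 3))])                 # axial block + 2 centers

for i, blk in enumerate((block1, block2, block3), start=1):
    print("block", i, "has", len(blk), "runs")               # 6, 6, 8 runs -> 20 in all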




5. Process Improvement
5.3. Choosing an experimental design
5.3.3. How do you select an experimental design?

5.3.3.7. Adding centerpoints

Center point, or `Control' Runs

Centerpoint runs provide a check for both process stability and possible curvature

As mentioned earlier in this section, we add centerpoint runs interspersed among
the experimental setting runs for two purposes:
   1. To provide a measure of process stability and inherent variability
   2. To check for curvature.

Centerpoint runs are not randomized

Centerpoint runs should begin and end the experiment, and should be dispersed as
evenly as possible throughout the design matrix. The centerpoint runs are not
randomized! There would be no reason to randomize them as they are there as
guardians against process instability and the best way to find instability is to
sample the process on a regular basis.

Rough rule of thumb is to add 3 to 5 center point runs to your design

With this in mind, we have to decide on how many centerpoint runs to do. This is
a tradeoff between the resources we have, the need for enough runs to see if
there is process instability, and the desire to get the experiment over with as
quickly as possible. As a rough guide, you should generally add approximately 3
to 5 centerpoint runs to a full or fractional factorial design.

Table of randomized, replicated 2^3 full factorial design with centerpoints

In the following Table we have added three centerpoint runs to the otherwise
randomized design matrix, making a total of nineteen runs.

TABLE 3.32 Randomized, Replicated 2^3 Full Factorial Design Matrix
with Centerpoint Control Runs Added

          Random Order     Standard Order     SPEED    FEED    DEPTH
   1      not applicable   not applicable       0        0       0
   2            1                 5            -1       -1       1
   3            2                15            -1        1       1
   4            3                 9            -1       -1      -1
   5            4                 7            -1        1       1
   6            5                 3            -1        1      -1
   7            6                12             1        1      -1
   8            7                 6             1       -1       1
   9            8                 4             1        1      -1
  10      not applicable   not applicable       0        0       0
  11            9                 2             1       -1      -1
  12           10                13            -1       -1       1
  13           11                 8             1        1       1
  14           12                16             1        1       1
  15           13                 1            -1       -1      -1
  16           14                14             1       -1       1
  17           15                11            -1        1      -1
  18           16                10             1       -1      -1
  19      not applicable   not applicable       0        0       0
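A minimal sketch of the centerpoint policy just described, applied to the
replicated 2^3 factorial above: randomize the sixteen factorial runs, then place
the three non-randomized center points at the start, the middle, and the end of
the run order, giving nineteen runs. The sketch is in Python; the run order will
differ from the table above because the randomization is arbitrary:

import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Two replicates of the 2^3 factorial, in a randomized run order.
corners = np.array(list(product([-1, 1], repeat=3)))
factorial_runs = np.vstack([corners, corners])
factorial_runs = factorial_runs[rng.permutation(len(factorial_runs))]

# Center points are NOT randomized: runs 1, 10, and 19 of the 19-run matrix.
center = np.zeros((1, 3), dtype=int)
design = np.vstack([center, factorial_runs[:8], center, factorial_runs[8:], center])

print(design.shape)        # (19, 3)
print(design[[0, 9, 18]])  # the three centerpoint rows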
Preparing a worksheet for operator of experiment

To prepare a worksheet for an operator to use when running the experiment, delete
the columns `RandOrd' and `Standard Order.' Add an additional column for the
output (Yield) on the right, and change all `-1', `0', and `1' to original factor
levels as follows.




Operator worksheet

TABLE 3.33 DOE Worksheet Ready to Run

Sequence
 Number    Speed    Feed     Depth    Yield
    1        20     0.003    0.015

Center Points in Response Surface Designs

Uniform precision

In an unblocked response surface design, the number of center points controls
other properties of the design matrix. The number of center
