
EGR 601

Advanced Engineering Experimentation

PROBABILITY AND STATISTICS

Dr. Ernur Karadoğan

1
Why Do Statistics Matter?
 Given a sufficiently precise instrument, no two measurements (taken at different times or locations) will be exactly the same.

 A statistical analysis of a set of measurements can extract more information (or better information) than consideration of a single measurement.

 Statistical analysis characterizes experimental data by determining parameters that specify the central tendency and the dispersion of the data.
2
Causes of Variation (Randomness) in Measured Data

 Variation in the underlying measured quantity (over time or location)

 Error in the measurement
  Bias (systematic) error: the measurement is consistently off from the true value (in a specific direction by a specific amount).
  Random (precision) error: a measurement differs from the true value in a random, fluctuating way. If the fluctuations are slow, they are referred to as drift; if they are rapid, they are called noise.

 Mistakes (unit conversion errors, incorrect experimental setup)
3
Statistical Measurement Theory
 A sample of data refers to a set of data obtained during repeated measurements of a variable under fixed operating conditions (measured variable = measurand).

 The sample is taken from the population, i.e., the set of all possible values of the measured variable.

 The goal is the estimation of the true mean value, 𝒙′, from the repeated measurements of the variable 𝒙: we have a sample of the variable 𝒙, obtained under controlled, fixed operating conditions, consisting of a finite number of data points.

4
Best Estimate
The range within which the true value will lie with P% probability:

$x' = \bar{x} \pm u_x \quad (P\%)$

 𝒙′ is the population mean, a.k.a. the true value

 𝒙̄ is the sample mean

 𝑢𝑥 is the uncertainty interval in the estimation of the true value at some probability level, 𝑷%.

 Finite-sized data sets can only give an estimate of the true value.
  Ex: measuring the diameters of a certain number of the manufactured bearings and estimating the batch's mean diameter.

5
Random vs. Systematic Error

 The uncertainty interval is based both on estimates of the random (precision) error and of the systematic (bias) error in the measurement of x.

 When systematic error is ignored, 𝑢𝑥 is the confidence interval.

 For now, we will estimate 𝒙′ and the random error in 𝒙 caused only by the variation in the data set (the effect of random error is called random uncertainty).

 Systematic errors don't alter the statistics of a measurement, as they are constant and independent of the number of repeated measurements.

6
Why Statistical Analysis?
 Statistical analysis characterizes experimental data by
determining parameters that specify the central tendency
and the dispersion (spread) of the data.

7
Common Terms

 Population - the collection of measurements/observations of interest

 Sample - a representative subset of the population

 Sample space - the set of all possible experimental outcomes. Example: there are six possible outcomes in rolling a fair die ({1, 2, 3, 4, 5, 6}).

8
Common Terms
 Random variable - the variables being measured in an experiment are considered random variables. The outcome of the measurement is not unique and is influenced by many uncontrollable factors (generally unavoidable).

 A random variable is affected by random chance.

9
Random Variables

Continuous (variable) data
 Data measured on an infinitely divisible scale or continuum
 No gaps between possible values
 Examples: tire pressure, cycle time, speed, length/height, response time

Discrete (attribute) data
 Measures attributes, qualitative conditions, counts
 Gaps between possible values
 Examples: # of defects per unit, # of forest fires, # of calls on hold per hour, # of employees, rolls of the dice
10
Population vs. Sample
 Population - A total set of all process results
 Sample - A subset of a population

                          Population     Sample
 Number of data points    N              n (or N)
 Mean                     𝜇, 𝒙′          𝒙̄
 Standard deviation       𝜎              S

11
Measures of Central Tendency
 Mean – also known as average; the sum of all values divided by the number of values

Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$

Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

 Median - the midpoint of the data
  Arrange the data from lowest to highest; the median is the middle data point
  50% of the data points will fall below the median and the other 50% will fall above the median

 Mode - the most frequent data point, or the value occurring most often
12
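A minimal MATLAB sketch of these three measures on a small hypothetical sample (the data values are illustrative only):

% Central tendency of a small hypothetical sample
x = [2.1 2.4 2.2 2.4 2.7 2.3 2.4];   % repeated measurements of a measurand

xbar = mean(x);     % mean: sum(x)/n
xmed = median(x);   % median: middle value of the sorted data
xmod = mode(x);     % mode: most frequent value

fprintf('mean = %.3f, median = %.3f, mode = %.3f\n', xbar, xmed, xmod);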
Measures of Dispersion (Variation or Spread)

 Range - total width of a distribution
  Range = Maximum Value − Minimum Value

 Variance (V) - measure of the spread in the data about the mean

Population variance: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2$, where $(x_i - \mu)$ is the deviation from the mean

Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

 Standard deviation is the most common measure of dispersion:

$\text{Standard Deviation} = \sqrt{\text{Variance}}$
13
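A minimal MATLAB sketch of these dispersion measures on the same kind of hypothetical sample (var and std use the 1/(n−1) sample normalization by default):

% Dispersion of a small hypothetical sample
x = [2.1 2.4 2.2 2.4 2.7 2.3 2.4];

r  = max(x) - min(x);   % range
S2 = var(x);            % sample variance, 1/(n-1) normalization
S  = std(x);            % sample standard deviation = sqrt(S2)
v2 = var(x, 1);         % population-style variance, 1/N normalization

fprintf('range = %.2f, S^2 = %.4f, S = %.4f\n', r, S2, S);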
Graphical Representation

[Figure: dot diagram of the measured data with the mean indicated]

14
Graphical Representation
 Histograms visually represent data centering, variability, and shape

 A histogram is a graphical tool used to depict the frequency of numerical data by categories (classes or bins)

Properties:
• All data will fall into a class or bin
• No data will overlap

[Figure: histogram of the sum from rolling two six-sided dice; x-axis: Sum of Dice (2 through 12), y-axis: Frequency]

15
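A minimal MATLAB sketch that reproduces this kind of histogram by simulation (the number of rolls is an assumption):

% Histogram of the sum of two fair six-sided dice
N = 100;                                 % number of rolls (assumed)
s = randi(6, N, 1) + randi(6, N, 1);     % sums, integer values 2..12

histogram(s, 'BinMethod', 'integers');   % one bin per possible sum
xlabel('Sum of Dice'); ylabel('Frequency');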
Histogram

[Figure: histogram with the minimum and maximum of the data indicated]

Number of bins (suggestion)*:
 (Small N): ~7
 (Large N): [formula not recoverable]

*Ensures at least one bin with at least 5 occurrences

16
Effect of data size

[Figure: histograms of the same measured variable for sample sizes N = 5, 50, 100, and 1000]

17
Effect of interval numbers
As the bin width tends to zero, the envelope of the histogram becomes a function which can be evaluated for any value of x (a continuous envelope).
18
Probability Density Function (𝑓(𝑥) or 𝑝(𝑥))
 A probability density function, f(x), defines the probability of occurrence of the random variable in an interval between xi and xi + dx:

$P(x_i \le x \le x_i + dx) = f(x_i)\,dx$

$P(a \le x \le b) = \int_a^b f(x)\,dx$

$P(-\infty \le x \le \infty) = 1$

$\mu = \int_{-\infty}^{\infty} x f(x)\,dx$

$\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$

[Figure: example density function f(x) plotted over 0 ≤ x ≤ 20]

19
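A minimal MATLAB sketch checking these properties numerically, using an assumed normal density with 𝜇 = 10 and 𝜎 = 2:

% Numerical checks of the PDF properties for an assumed normal f(x)
mu = 10; sigma = 2;
f = @(x) exp(-0.5*((x - mu)/sigma).^2) / (sigma*sqrt(2*pi));

total = integral(f, -Inf, Inf);                        % total area: 1
m     = integral(@(x) x.*f(x), -Inf, Inf);             % mean: 10
s2    = integral(@(x) (x - m).^2.*f(x), -Inf, Inf);    % variance: 4
Pab   = integral(f, 8, 12);                            % P(8 <= x <= 12): ~0.68

fprintf('area = %.4f, mean = %.4f, var = %.4f, P = %.4f\n', total, m, s2, Pab);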
[Slides 20-31: figures of example probability distributions, including the population mean, infant-mortality failures, and end-of-life wear-out failures]

* Experimentally determined histograms are used to infer guidance on the most likely model for p(x).
Gaussian (Normal) Distribution
 Frequently, a stable, controlled process will produce a histogram that resembles the bell-shaped curve known as the Gaussian (Normal) distribution.

[Figure: distribution of birth weight in 3,226 newborn babies (data from O'Cathain et al. 2002)]

Common applications:
 Exam scores
 Human body temperature
 Human birth weight
 Dimensional tolerances
 Employee performance

 Notation when x is normally distributed with mean 𝜇 and standard deviation 𝜎: x ~ N(𝜇, 𝜎)

33
Gaussian (Normal) Distribution

 Continuous data
 Typically 2 parameters
  Scale parameter = mean (𝜇)
  Shape parameter = standard deviation (𝜎)

 PDF: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$

 Maximum occurs at 𝑥 = 𝜇

34
Effect of Standard Deviation (Spread)

[Figure: normal curves with the same mean but different standard deviations]
35
Distributions and Probability
 Distributions can be linked to probability – making possible predictions
and evaluations of the likelihood of a particular occurrence

$P(x_1 \le x \le x_2) = \int_{x_1}^{x_2} f(x)\,dx = \int_{x_1}^{x_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$

36
Standard Normal (z) Distribution

 Problem: unlimited number of possible normal distributions (−∞ < 𝜇 < ∞, 𝜎 > 0)
 Solution: standardize the random variable to have mean 0 and standard deviation 1:

$x \sim N(\mu, \sigma) \;\Rightarrow\; z = \frac{x - \mu}{\sigma} \sim N(0, 1)$

• Probabilities of certain ranges of values and specific percentiles of interest can be obtained through the standard normal (z) distribution
37
Standard Normal (z) Distribution
Given p(𝑥), how can we predict the probability that any future
measurement will fall within some stated interval of 𝑥 values?

 The probability that 𝑥 will assume a value within the interval 𝑥′ ± 𝛿𝑥 is given by the area under p(𝑥) over that interval:

$z = \frac{x - x'}{\sigma}$   (standardized normal variate)

38
Standard Normal (z) Distribution

[Figure: standard normal density; the area under the curve between limits is tabulated]

39
40
Standard Normal Distribution
 Normal (𝜇 = 0, 𝜎 = 1)

 Standard(ized) normal variate: $z = \frac{x - \mu}{\sigma}$

 All normal distributions can be simply transformed to the standard normal distribution

 The probability that 𝑥 is between 𝑥1 and 𝑥2 is the same as the probability that z is between z1 and z2

 It also determines the probability (confidence level) that a measurement will fall within one or more standard deviations of the mean, e.g. 1σ, 2σ or 3σ (see the sketch below)
41
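A minimal sketch of that last bullet in MATLAB (normcdf, Statistics Toolbox):

% Probability that a measurement falls within k standard deviations
for k = 1:3
    P = normcdf(k) - normcdf(-k);   % P(-k <= z <= k)
    fprintf('P within %d sigma = %.4f\n', k, P);
end
% prints approximately 0.6827, 0.9545, and 0.9973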
Example
For $x = x' + 1.0\sigma$:

$z = \frac{(x' + 1.0\sigma) - x'}{\sigma} = 1.0$

(the corresponding probability is tabulated)

42
The probability that the ith measured value of x will have a value between x′ − δx and x′ + δx is given by the corresponding area under p(x).

43
Example - Heights of U.S. Adults
• Female and male adult heights are well approximated by normal distributions: xF ~ N(63.7, 2.5), xM ~ N(69.1, 2.6)

[Figure: weighted histograms of adult heights in inches. Female: Mean = 63.7, Std. Dev = 2.48, N = 99.23 (cases weighted by PCTF). Male: Mean = 69.1, Std. Dev = 2.61, N = 99.68 (cases weighted by PCTM)]

44   Source: Statistical Abstract of the U.S. (1992)


Example - Adult Female Heights
 What is the probability a randomly selected female is 5'10" (70 inches) or taller?
  Step 1 - x ~ N(63.7, 2.5)
  Step 2 - xL = 70.0, xU = ∞
  Step 3 - zL = (70.0 − 63.7)/2.5 = 2.52, zU = ∞
  Step 4 - P(x ≥ 70) = P(z ≥ 2.52) = 0.5 − 0.4941 = 0.0059 (≈ 1/170)

46
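The same steps in MATLAB (normcdf, Statistics Toolbox):

% P(x >= 70 in) for female heights, x ~ N(63.7, 2.5)
mu = 63.7; sigma = 2.5;
zL = (70.0 - mu)/sigma;     % = 2.52
P  = 1 - normcdf(zL);       % P(z >= 2.52) ~ 0.0059 (about 1 in 170)
fprintf('z = %.2f, P(x >= 70) = %.4f\n', zL, P);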
Example

P(z ≤ 1) = 0.5 + 0.3413 = 0.8413

P(z ≤ 2) = 0.5 + 0.4772 = 0.9772

47
48
Statistics of Finite Data Sets
 Estimate the true mean and true variance of a population from a small sample (inferential statistics).
  Ex: measuring some samples from a batch of manufactured goods

 Infinite statistics describe the population parameters, whereas finite statistics describe a small sample.

Sample mean value: $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$

Sample variance: $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2$

Sample standard deviation: $S_x = \sqrt{S_x^2}$
53
Student’s t Distribution
 Use the resulting statistics from the sample to characterize the statistics of the population.

 When data sets are finite, the z variable does not provide a reliable estimate of the probability.

 The Student's t variable is used for finite data sets (a normal distribution of x is assumed):

$x_i = \bar{x} \pm t_{\nu,P}\, S_x \quad (P\%)$

$t = \frac{\bar{x} - x'}{S_x/\sqrt{N}}$

where N = sample size, P% = probability, ν = degrees of freedom = N − 1, and Sx = sample standard deviation. t values are tabulated.
54
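A minimal MATLAB sketch of the t-based estimate on a hypothetical data set (tinv, Statistics Toolbox):

% Precision interval for a single future measurement (95%)
x = [5.2 4.9 5.1 5.3 4.8 5.0 5.2 4.7 5.1 5.0];   % hypothetical sample
N    = numel(x);
xbar = mean(x);
Sx   = std(x);
t    = tinv(1 - 0.05/2, N - 1);   % two-sided 95% t value, nu = N - 1

fprintf('x_i = %.3f +/- %.3f (95%%)\n', xbar, t*Sx);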
Precision Interval

$x_i = \bar{x} \pm t_{\nu,P}\, S_x \quad (P\%)$

 Precision interval: one should expect any measured value to fall within this interval with P% probability.

55

$x_i = \bar{x} \pm t_{\nu,P}\, S_x \quad (P\%)$
56
The Standard Error (Standard Deviation of the Means)
• With different samples, we would get different estimates of the sample mean and sample variance
• Each mean value would be normally distributed about some central value

[Figure: the normal distribution tendency of the sample means about a true value in the absence of systematic error]

57

The Standard Error (Standard Deviation of the Means)

[Figures (a)-(d): the width of the histogram of the means decreases as the size of the sample used to calculate the means increases - this is a consequence of averaging over statistical fluctuations]

58
59
Interval Estimation of Population Mean
• The variance of the distribution of mean values is estimated from a finite data set through the standard deviation of the means:

$S_{\bar{x}} = \frac{S_x}{\sqrt{n}}$

• The estimate of the true mean value based on the finite data set is

$\mu = \bar{x} \pm t_{\nu,P}\, S_{\bar{x}} \quad (P\%)$   (confidence interval)

Compare to the precision interval: $x_i = \bar{x} \pm t_{\nu,P}\, S_x \quad (P\%)$
60
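A minimal MATLAB sketch contrasting the two intervals on the same hypothetical sample as above:

% Confidence interval for the true mean vs. the precision interval
x = [5.2 4.9 5.1 5.3 4.8 5.0 5.2 4.7 5.1 5.0];   % hypothetical sample
N = numel(x); xbar = mean(x); Sx = std(x);
t = tinv(0.975, N - 1);          % two-sided 95%, nu = N - 1 (Statistics Toolbox)
Sxbar = Sx/sqrt(N);              % standard deviation of the means

fprintf('mu  = %.3f +/- %.3f (95%%)\n', xbar, t*Sxbar);  % confidence interval
fprintf('x_i = %.3f +/- %.3f (95%%)\n', xbar, t*Sx);     % precision interval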
Example

(a) Compute the sample statistics for this data set. (b) Estimate
the interval of values over which 95% of the measurements of x
should be expected to lie. (c) Estimate the true mean value of x
at 95% probability based on this finite data set.
62
Example

The interval of values in which 95% of the measurements of x should lie is:

There is a 95% probability that the value of the 21st data point would lie between 0.69 and 1.35.

The true mean value is estimated by the sample mean value:

So at 95% confidence, the true mean value lies between 0.94 and 1.10.

63
t-table
Pooled Statistics

64
Hypothesis Testing

A hypothesis is a statement that something is believed to be true.

 Null hypothesis: the difference between the measured behavior of a population and the assumed (hypothesized) behavior is small enough to be considered due to random variations (chance).

𝐻𝑜: 𝑥′ = 𝑥𝑜   (null hypothesis)

𝑥𝑜: population or target value

 US justice system: beyond reasonable doubt?
65
Hypothesis Testing

𝐻𝑜: 𝑥′ = 𝑥𝑜   (null hypothesis)

Alternative hypotheses:
𝐻𝑎: 𝑥′ ≠ 𝑥𝑜   (two-tailed test)
𝐻𝑎: 𝑥′ < 𝑥𝑜   (left-tailed test)
𝐻𝑎: 𝑥′ > 𝑥𝑜   (right-tailed test)

𝛼: significance value (level of significance)
66
Critical values for a hypothesis test at level of significance 𝛼

Population 𝜎 known? → z-test
 A sample is needed (measured). The test statistic is the z-variable.
 Evaluate the z-variable against critical values of z at a desired level of significance, 𝛼:
𝑃(𝑧) ≡ 1 − 𝛼
 Rule: do not reject the null hypothesis if the value of the test statistic lies within the defined "do not reject region".

[Figure: "do not reject" regions for 𝐻𝑎: 𝑥′ ≠ 𝑥𝑜 (two-tailed), 𝐻𝑎: 𝑥′ < 𝑥𝑜 (left-tailed), and 𝐻𝑎: 𝑥′ > 𝑥𝑜 (right-tailed)]

67

Population 𝜎 unknown? → t-test
 The test statistic is the t-variable.
 Evaluate the t-variable against critical values of t at a desired level of significance, 𝛼:
𝑃(𝑡) ≡ 1 − 𝛼
 Rule: do not reject the null hypothesis if the value of the test statistic lies within the defined "do not reject region".

[Figure: "do not reject" regions for the three alternative hypotheses]

68
Hypothesis testing steps

1. Establish the null hypothesis 𝐻𝑜: 𝑥′ = 𝑥𝑜, where 𝑥𝑜 is the population or target value.

2. Assign the level of significance, 𝛼. This will determine the critical values and acceptance regions.

3. Calculate the observed value of the test statistic (i.e., make the measurement(s) and calculate the corresponding z- or t-variable).

4. Compare the observed test statistic to the critical values. (A sketch of these steps as a z-test follows.)
70
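A minimal MATLAB sketch of the four steps as a two-tailed z-test (𝜎 assumed known; all numbers are hypothetical; norminv is from the Statistics Toolbox):

% Two-tailed z-test sketch (hypothetical numbers)
x0    = 2.0;                       % 1. null hypothesis H0: x' = x0
alpha = 0.05;                      % 2. level of significance
zc    = norminv(1 - alpha/2);      %    critical value (1.96 for alpha = 0.05)

xbar = 2.06; sigma = 0.10; n = 30; % 3. measured sample statistics (assumed)
z = (xbar - x0)/(sigma/sqrt(n));   %    observed test statistic

if abs(z) > zc                     % 4. compare to the critical values
    fprintf('z = %.2f > %.2f: reject H0\n', z, zc);
else
    fprintf('z = %.2f <= %.2f: do not reject H0\n', z, zc);
end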
Example: Null Hypothesis

𝐻𝑜: 𝑥′ = 𝑥𝑜 = 2.0 mm

𝐻𝑎: 𝑥′ ≠ 𝑥𝑜 = 2.0 mm   (two-tailed test)
71
Example: Acceptance Region(s)

Assuming the bearing population is normally distributed, each tail holds an area of 0.025, so the area between the mean and each critical value is 0.5 − 0.025 = 0.475. (Our tables are based on this area.)
72
Example: Critical Value

[Figure: standard normal curve with areas of 0.475 on either side of the mean and tail areas of 0.025; the critical values are read from the z-table]

73
Example: Observed Test Statistic

Observed test statistic > critical value, so we reject the null hypothesis (i.e., the difference between the means of the population and sample is larger than it would be by just random chance). Eliminate the entire batch due to manufacturing defect.
74
Example

75
Example: Null Hypothesis

𝐻𝑜: 𝑥′ = 𝑥𝑜 = 180.0 yards

𝐻𝑎: 𝑥′ ≠ 𝑥𝑜 = 180.0 yards   (two-tailed test)

[Figure: normal curve with a central area of 0.95 and tail areas of 0.025]
76
Example (cont'd)

𝐻𝑜: 𝑥′ = 𝑥𝑜 = 180.0 yards, with a central "do not reject" area of 0.95.

Do NOT reject the null hypothesis (i.e., the difference between the means of the population and sample is NOT larger than it would be by just random chance). The club meets the customer's needs.

77
Chi-Square Testing
 The test provides a measure of the discrepancy between the measured variation in a quantity (characterized by 𝑆𝑥) and the variation predicted by the assumed probability distribution function (characterized by 𝜎).

 It involves quantifying how well 𝑆𝑥 approximates 𝜎 by comparing the number of measurements observed within various ranges of values to the number that would be expected for the given distribution function.

[Figure: chi-squared density functions, whose shape depends on the degrees of freedom (DOF); the area to the left of the critical value is 1 − 𝜶 and the area 𝜶 is measured from the right]
78
Inference of Population Variance

𝜶 is measured from the right.

 For example, the 95% precision interval by which $s_x^2$ estimates $\sigma^2$ is given by*

$\frac{\nu S_x^2}{\chi^2_{\nu,\,0.025}} \le \sigma^2 \le \frac{\nu S_x^2}{\chi^2_{\nu,\,0.975}} \quad (95\%)$

79   *The interval is bounded by the 2.5% and 97.5% levels of significance (for 95% coverage).
Tabulated Chi-Squared Values
Values for $\chi^2_{\nu,\alpha}$ are tabulated as a function of the degrees of freedom.

81
Confidence interval for sample variance
 Example: A sample of 20 ball bearings is chosen and measured, with sample mean 0.32500 in and S.D. = 0.00010 in. Determine a 95% CI for the standard deviation of the production batch (population 𝜎 not known).

 ν = n − 1 = 19
 α/2 = 0.025, 1 − α/2 = 0.975
 χ²₁₉,₀.₀₂₅ = 32.9, χ²₁₉,₀.₉₇₅ = 8.91

19(0.00010²)/32.9 ≤ σ² ≤ 19(0.00010²)/8.91

0.000076 ≤ σ ≤ 0.00015 (95%)
82
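A minimal MATLAB sketch reproducing this interval with chi2inv (Statistics Toolbox):

% 95% CI for the population standard deviation of the bearing batch
n = 20; S = 0.00010; nu = n - 1;
chi2_lo = chi2inv(1 - 0.025, nu);   % 32.85; divides to give the lower bound
chi2_hi = chi2inv(0.025, nu);       %  8.91; divides to give the upper bound

sig_lo = sqrt(nu*S^2/chi2_lo);      % ~0.000076
sig_hi = sqrt(nu*S^2/chi2_hi);      % ~0.00015
fprintf('%.6f <= sigma <= %.6f (95%%)\n', sig_lo, sig_hi);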
Example

83
Goodness-of-fit Test
The chi-squared test provides a measure of the discrepancy between the
measured variation of a data set and the variation predicted by the
assumed density function.

87
Regression Analysis
We can use regression analysis to establish a functional relationship between the dependent variable and the independent variable.

[Figure: dependent variable (output) plotted against the independent variable (input); this discussion pertains directly to polynomial curve fits]

88

$D = \sum_{i=1}^{N} (y_i - y_{ci})^2$

89
90
Least Squares Regression
 There is some deviation of the data from the polynomial, yi − yci
 We can calculate the standard error of the fit:

$S_{yx} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - y_{ci})^2}{\nu}}, \qquad \nu = N - (m+1)$   (degrees of freedom)

(N = number of data points; m = polynomial order)

 Higher-order curves reduce Syx, but they are unlikely to represent the physics of the problem
 Syx indicates how closely a polynomial fits the data set
 The best fit is one that minimizes the error (the numerator) without overfitting the data

91


Confidence interval for regression
 The independent variable is a well-controlled value.
 We assume that the principal source of variation in the curve fit is due to the random error in the dependent (measured) variable.

$y_c \pm t_{\nu,P}\,\frac{S_{yx}}{\sqrt{N}}$   (P% confidence interval)
92
Least-Square Linear Fit

• Uses a straight line to fit the data:

$y_c = a_0 + a_1 x$

• Minimize the sum of squared errors:

$D = \sum_{i=1}^{N} (y_i - y_{ci})^2$
93
Least-Square Linear Fit

a0 
 i  i i  i  yi
x x y  x 2

; a1 
 x  y  N x y
i i i i

 x 
i
2
 N x
2
i  x   N  x
i
2 2
i

94
Example

95
Regression analysis
• To evaluate how well the relationship between the independent and dependent variables can be described by a linear relationship, we can determine the coefficient of determination, r2 (r is the correlation coefficient)
• It is indicative of how well the variance in y is accounted for by the fit

$r^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - y_{ci})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$

The numerator is the sum of squared data residuals: if it equals zero, the fit is perfect. If r2 is zero, there is no improvement over simply picking the mean.
96
 The coefficient of determination, r2 (r is the correlation coefficient), only indicates an association between the dependent and independent variables.

 r2 doesn't prove a cause-effect relationship.

 r2 doesn't estimate the random error in yC effectively (use Syx for this purpose).

 For ∓0.9 < r ≤ ∓1, a linear regression can be considered a reliable relation between x and y (dependent and independent variables).
98
Example

100
Example

$r^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - y_{ci})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$

Compute the coefficient of determination (r2) and the standard error of the fit for the data.

99% of the variance in y is accounted for by the fit, whereas only 1% is unaccounted for.

101
Example

 Estimate the random uncertainty associated with the fit (standard error of the fit)

102
t-table
Another Example

The following data represents the output (volts) of a


linear variable differential transformer (LVDT; an
electrical device used for measuring displacement)
for five length inputs

L 0.00 0.50 1.00 1.50 2.00 2.50


(cm)
V(v) 0.05 0.52 1.03 1.50 2.00 2.56

Determine the best linear fit of these data, and


calculate the standard error of the fit (Sxy) as well as
the coefficient of determination (r2).

103
Another Example
 xi      xi2     yi      xi*yi    yi2
 0.0     0.00    0.05    0.00     0.0025
 0.5     0.25    0.52    0.26     0.2704
 1.0     1.00    1.03    1.03     1.0609
 1.5     2.25    1.50    2.25     2.2500
 2.0     4.00    2.00    4.00     4.0000
 2.5     6.25    2.56    6.40     6.5536
 ∑ 7.5   13.75   7.66   13.94    14.137

104
Another Example

$a_0 = \frac{\sum x_i \sum x_i y_i - \sum x_i^2 \sum y_i}{\left(\sum x_i\right)^2 - N\sum x_i^2} = 0.0295$

$a_1 = \frac{\sum x_i \sum y_i - N\sum x_i y_i}{\left(\sum x_i\right)^2 - N\sum x_i^2} = 0.9977$

$y_c = 0.0295 + 0.9977x$

105
Another Example
Standard error of the fit (Syx) and the coefficient of determination (r2):

$S_{yx} = \sqrt{\frac{\sum_{i=1}^{N} (y_i - y_{ci})^2}{\nu}} = 0.0278, \qquad \nu = N - (m+1)$

$r^2 = 1 - \frac{\sum_{i=1}^{N} (y_i - y_{ci})^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2} = 0.999286, \qquad \bar{y} = \frac{1}{N}\sum_{i=1}^{N} y_i$

 r2 = 0.999286 indicates that 99.9% of the variance in y is accounted for by the linear fit. (A verification sketch follows.)
106
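A minimal MATLAB sketch verifying these results with polyfit:

% Verify the LVDT fit, Syx, and r^2 with polyfit
L = [0.00 0.50 1.00 1.50 2.00 2.50];          % input (cm)
V = [0.05 0.52 1.03 1.50 2.00 2.56];          % output (V)

p  = polyfit(L, V, 1);                        % p = [a1 a0] ~ [0.9977 0.0295]
Vc = polyval(p, L);                           % fitted values y_c

N = numel(V); m = 1; nu = N - (m + 1);        % degrees of freedom = 4
Syx = sqrt(sum((V - Vc).^2)/nu);              % standard error of the fit ~ 0.0278
r2  = 1 - sum((V - Vc).^2)/sum((V - mean(V)).^2);   % ~ 0.999286

fprintf('a0 = %.4f, a1 = %.4f, Syx = %.4f, r^2 = %.6f\n', p(2), p(1), Syx, r2);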
Another Example
• Now, can we determine the random uncertainty associated with the fit with a 95% confidence interval?

$y_c \pm t_{\nu,P}\,\frac{S_{yx}}{\sqrt{N}}$

degrees of freedom: ν = N − (m + 1) = 4; t₄,₉₅ = 2.77

$y_c = 0.0295 + 0.9977x \pm 2.77\,\frac{0.0278}{\sqrt{6}} = 0.0295 + 0.9977x \pm 0.0314 \ \text{V} \;(95\%)$

• Similarly, one can calculate the static sensitivity error and zero offset error
107
t-table
Data Outlier Detection

• Spurious* data: does not fit with the trend

• One must be very careful when determining that a data point is an outlier

• Requires extreme confidence in the sample variance

• Risk: excluding a "good" data point

• Affects the hypothesis testing stage

108   * not genuine, authentic, or true; not from the claimed, pretended, or proper source.
Data Outlier Detection: Chauvenet's Criterion
(assumes a normal distribution)

• The data point is a potential outlier if

$1 - 2P(z_0) < \frac{1}{2N}, \qquad z_0 = \frac{|x_i - \bar{x}|}{s_x}$

where 𝑷(𝒛𝟎) is the tabulated area under the standard normal curve between 0 and 𝒛𝟎.
109
Data Outlier Detection: Three-sigma Test
(assumes a normal distribution)
• For large data sets (n > 10), the three-sigma test identifies those data points that lie outside the range of 99.73% probability of occurrence as potential outliers.
110
xi  x
z0 
sx

i x(i) z P(z) 1-2P(z) 1/(2*N) Outlier? i x(i)


1 28 0.305 0.1198 0.7604 0.050 No 1 28
2 31 1.138 0.3724 0.2551 0.050 No 2 31
3 27 0.028 0.0112 0.9777 0.050 No 3 27
4 28 0.305 0.1198 0.7604 0.050 No 4 28
5 29 0.583 0.2201 0.5599 0.050 No 5 29
6 24 0.805 0.2896 0.4208 0.050 No 6 24
7 29 0.583 0.2201 0.5599 0.050 No 7 29
8 28 0.305 0.1198 0.7604 0.050 No 8 28
9 18 2.469 0.4932 0.0135 0.050 Yes 10 27
10 27 0.028 0.0112 0.9777 0.050 No
mean 27.889
mean 26.900 stdev 1.900
stdev 3.604

Statistics before Statistics after


= 27.889 ± 1.46 psi (95%)
111
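A minimal MATLAB sketch of Chauvenet's criterion applied to this data set (normcdf, Statistics Toolbox; note that P(z0) on the slide is the 0-to-z0 area, so 1 − 2P(z0) equals the two-sided tail probability computed below):

% Chauvenet's criterion on the slide's ten readings
x = [28 31 27 28 29 24 29 28 18 27];
N = numel(x);
z0 = abs(x - mean(x)) / std(x);      % standardized deviations
tail = 2*(1 - normcdf(z0));          % 1 - 2*P(z0): probability of a larger deviation
isOut = tail < 1/(2*N);              % flag potential outliers

fprintf('flagged value(s): %s\n', mat2str(x(isOut)));   % flags the value 18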
z-table t-table
Number of Measurements
How many measurements, N, are required to reduce the estimated value of the random error in the sample mean to an acceptable (desired) level?

With d the desired one-sided precision value of the confidence interval, the required number of measurements is estimated by

$N \approx \left(\frac{t_{\nu,P}\, S_x}{d}\right)^2 \quad (P\%)$
112
Number of Measurements

 Since the degrees of freedom in t depend on N, the estimate requires iteration.

 A shortcoming of this method is the need to estimate Sx, which could be based on experience or other knowledge of the population.

• Alternatively, make a preliminary number of measurements, N1, to obtain an estimate of the sample variance, S1, to be expected.
• Then use S1 to estimate the number of measurements required.
113
Number of Measurements

No preliminary measurements were made.

114 t-table
Another Example

 Preliminary measurements were made; we need to find how many more are needed.

With ν = N₁ − 1 = 20: t₂₀,₉₅ = 2.086, giving N ≈ 124 total measurements; 124 − 21 = 103 additional measurements are required.
115
t-table
Monte Carlo Simulation

 This iteration process continues updating the data set for R until the predicted standard deviation for R converges to an asymptotic value.
116
Estimating 𝜋: Monte Carlo Simulation
• The methods are named after the area of Monaco where a famous casino is located.
• Random numbers are a defining feature of Monte Carlo calculations, much as the laws of chance govern gambling games.

 Imagine we can generate a random number, r, constrained to the interval 0 < r < 1. The first two numbers chosen are used as the (x, y) coordinates of a point within the perimeter of the square.

 After N_TOT trials, let N_IN denote the number of points inside the circle.

117

Estimating 𝜋: Monte Carlo Simulation

$\frac{A_c}{A_s} = \frac{\pi}{4}$

 After N_TOT trials, let N_IN denote the number of points inside the circle; then

$\pi \approx 4\,\frac{N_{IN}}{N_{TOT}}$

𝜋 = 3.14159265359

118
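A minimal MATLAB sketch of this estimate (using the equivalent quarter-circle form of the same 𝜋/4 area ratio):

% Monte Carlo estimate of pi: fraction of random points in the unit
% square that fall inside the quarter circle of radius 1
Ntot  = 1e6;
xy    = rand(Ntot, 2);                  % (x, y) pairs, each in (0, 1)
Nin   = sum(sum(xy.^2, 2) <= 1);        % points with x^2 + y^2 <= 1
piEst = 4*Nin/Ntot;                     % Ac/As = pi/4

fprintf('pi estimate from %d points: %.5f\n', Ntot, piEst);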
Example

MS Excel

119
Sample Results (100K iterations)

120
clc; clear;
% Known distributions of the input(s)
% Resistance in Ohm (normal distribution)
R_mean = 1000; R_std = 100;
% Current in Ampere (uniform distribution)
I_min = 0.095; I_max = 0.105;
% Monte Carlo stopping criterion (tolerance)
TOL = input('Enter the tolerance for standard deviation check: ');
max_iter = input('Enter the maximum number of iterations: ');
% No input from the user? Use defaults.
if isempty(TOL), TOL = 1e-15; end
if isempty(max_iter), max_iter = 1e5; end

% Preallocate for speed (unused entries are truncated after the loop)
I = zeros(1, max_iter); R = zeros(1, max_iter);
MCS = zeros(1, max_iter); MCS_std = zeros(1, max_iter);

for i = 1:max_iter
    I(i) = I_min + rand()*(I_max - I_min);  % current sample
    R(i) = normrnd(R_mean, R_std);          % resistance sample (Statistics Toolbox)
    MCS(i) = I(i) * R(i);                   % voltage, V = I*R
    MCS_std(i) = std(MCS(1:i));             % running standard deviation
    % Stop when the predicted standard deviation has converged
    if i > 1 && abs(MCS_std(i) - MCS_std(i-1))/MCS_std(i-1) < TOL
        break;
    end
end
I = I(1:i); R = R(1:i); MCS = MCS(1:i);     % drop unused preallocated entries

% Total number of iterations
if i == max_iter
    fprintf(1, '\nUser-specified maximum number of iterations (%d) has been reached!\n', i);
else
    fprintf(1, '\nTotal number of iterations: %d\n', i);
end
fprintf(1, '\nTolerance value (TOL): %f\n', TOL);

figure();
subplot(3,1,1);
histogram(R); str = sprintf('Resistance (Ohm)\n Mean = %f Ohm; Stdev = %f Ohm', mean(R), std(R));
xlabel(str);
subplot(3,1,2);
histogram(I); str = sprintf('Current (A)\n Mean = %f A; Stdev = %f A', mean(I), std(I)); xlabel(str);
subplot(3,1,3);
histogram(MCS); str = sprintf('Voltage (V)\n Mean = %f V; Stdev = %f V', mean(MCS), std(MCS)); xlabel(str);
sgtitle(sprintf('Number of iterations: %d', i));   % overall figure title (R2018b+)
121
Another run… 100K iterations

122
Yet another run… 100K iterations

123
