Module_4_Class
Dr. P. Rajendra
CMRIT, Bengaluru.
Dr. P. Rajendra (Professor, Dept. of Maths) Module - 4 : Statistical Inference - II CMRIT, Bengaluru. 1 / 17
Sampling Variables: Predict the value of a variable in a population using
sample data. For example, estimating the average accounts receivable from
a sample of customer balances. With sampling, we derive:
\[ \text{Sample Mean} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i \]
This sample mean provides an estimate for the population mean µ, with
some uncertainty.
Central Limit Theorem (CLT):
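Given a population with mean µ and standard deviation σ, the mean X̄ of a
random sample of size n is, for large n, approximately normally distributed
with mean µ and standard deviation (standard error) σ/√n:
\[ \bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \]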
Example of CLT:
A sample of size n = 50 is taken from a population with:
Population mean µ = 100
Population standard deviation σ = 15
Solution:
\[ \text{Standard Error} = \frac{15}{\sqrt{50}} \approx 2.12 \]
The sample mean X̄ will approximately follow N(100, 2.12), where 2.12 is the
standard error.
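The example can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes a hypothetical uniform population chosen to have mean 100 and standard deviation 15, draws 2000 samples of size 50, and compares the spread of the sample means with σ/√n. The seed and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics

MU, SIGMA, N = 100.0, 15.0, 50   # values from the example above

# Theoretical standard error of the mean: sigma / sqrt(n)
standard_error = SIGMA / math.sqrt(N)

# Hypothetical non-normal population: a uniform distribution on
# [MU - h, MU + h] has standard deviation h / sqrt(3), so h = SIGMA * sqrt(3).
half_width = SIGMA * math.sqrt(3)

rng = random.Random(0)  # fixed seed so the run is repeatable
sample_means = [
    statistics.fmean(rng.uniform(MU - half_width, MU + half_width) for _ in range(N))
    for _ in range(2000)
]

print(f"theoretical SE          : {standard_error:.4f}")
print(f"mean of sample means    : {statistics.fmean(sample_means):.2f}")
print(f"std. dev. of sample means: {statistics.stdev(sample_means):.2f}")
```

Even though the assumed population is flat rather than bell-shaped, the simulated sample means should cluster around 100 with a spread close to the theoretical standard error.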
Confidence Interval (CI)
A confidence interval provides a range of values within which the true
population mean µ is likely to lie. The formula for the CI is
\[ \bar{X} \pm Z \cdot \frac{\sigma}{\sqrt{n}} \]
Confidence Level Z-value
99% 2.58
95% 1.96
90% 1.645
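The Z-values in the table are the two-sided critical values of the standard normal distribution. As a small sketch (the helper name `z_value` is an illustrative assumption), they can be reproduced with the standard library's `statistics.NormalDist`:

```python
from statistics import NormalDist

def z_value(confidence: float) -> float:
    """Two-sided critical value z such that P(-z < Z < z) = confidence."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.99, 0.95, 0.90):
    print(f"{level:.0%}: z = {z_value(level):.3f}")
```

Rounded to the precision used in the table, these agree with 2.58, 1.96 and 1.645.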
Example of CI for Unknown Mean: A sample of n = 36 students has a
mean score X̄ = 85 with a standard deviation s = 10. Find the 95%
confidence interval for the population mean.
Solution:
\[ \text{CI} = 85 \pm 1.96 \cdot \frac{10}{\sqrt{36}} = 85 \pm 1.96 \cdot 1.67 = 85 \pm 3.27 \]
95% CI = (81.73, 88.27)
Problem 1: State the Central Limit Theorem. Use the CLT to evaluate
P[50 < X̄ < 56], where X̄ represents the mean of a random sample of size
100 from an infinite population with mean µ = 53 and variance σ² = 400.
Solution:
Central Limit Theorem (CLT): Let X1, X2, ..., Xn be a random sample
of size n drawn from a population with mean µ and variance σ². As the
sample size n becomes large, the sampling distribution of the sample mean
X̄ approaches a normal distribution with mean µ and variance σ²/n,
regardless of the shape of the original population distribution.
Mathematically:
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \quad \text{for large } n. \]
We need to evaluate P[50 < X̄ < 56], where n = 100, µ = 53, σ² = 400
⇒ σ = 20. Using the CLT, the sampling distribution of the sample mean X̄ is:
\[ \bar{X} \sim N\left(\mu = 53,\ \frac{\sigma}{\sqrt{n}} = \frac{20}{\sqrt{100}} = 2\right). \]
Standardize the bounds using the z-score formula:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
For X̄ = 50:
\[ Z_1 = \frac{50 - 53}{2} = -1.5 \]
For X̄ = 56:
\[ Z_2 = \frac{56 - 53}{2} = 1.5 \]
Therefore:
\[ P(50 < \bar{X} < 56) = P(-1.5 < Z < 1.5) = 2 \times P(0 < Z < 1.5) = 2 \times 0.4332 = 0.8664 \]
The probability that the sample mean X̄ lies between 50 and 56 is approximately 0.8664.
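The table lookup can be cross-checked in code. This is a minimal sketch, assuming the normal approximation from the CLT; the function name `prob_mean_between` is hypothetical and not from the slides.

```python
import math
from statistics import NormalDist

def prob_mean_between(mu, sigma, n, low, high):
    """P(low < sample mean < high) under the CLT normal approximation."""
    standard_error = sigma / math.sqrt(n)
    sampling_dist = NormalDist(mu, standard_error)
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

# Problem 1: mu = 53, sigma = sqrt(400) = 20, n = 100
p = prob_mean_between(mu=53, sigma=20, n=100, low=50, high=56)
print(round(p, 4))
```

The same function applies to any problem of this type by changing the five arguments.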
Problem 2: An unknown distribution has a mean of 90 and a standard
deviation of 15. Samples of size n = 25 are drawn randomly from the
population. Find the probability that the sample mean is between 85 and
92.
Solution: Given:
Population mean: µ = 90
Population standard deviation: σ = 15
Sample size: n = 25
Using the Central Limit Theorem, the sampling distribution of the sample
mean X̄ is:
\[ \bar{X} \sim N\left(\mu = 90,\ \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{25}} = 3\right). \]
Standardize the bounds using the z-score formula:
\[ Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]
For X̄ = 85:
\[ Z_1 = \frac{85 - 90}{3} = -1.67 \]
For X̄ = 92:
\[ Z_2 = \frac{92 - 90}{3} = 0.67 \]
Therefore:
\[ P(85 < \bar{X} < 92) = P(-1.67 < Z < 0.67) = P(-1.67 < Z < 0) + P(0 < Z < 0.67) \]
\[ = 0.4525 + 0.2486 = 0.7011 \]
The probability that the sample mean X̄ lies between 85 and 92 is approximately 0.70.
Problem 3: An electrical firm manufactures light bulbs with a lifespan
that is approximately normally distributed with mean µ = 800 hours and
standard deviation σ = 40 hours. If a random sample of 16 bulbs is
selected, what is the probability that the sample mean lifespan will be less
than 775 hours?
Solution: To solve this, we use the Central Limit Theorem (CLT), which
states that the sampling distribution of the sample mean follows a normal
distribution with:
\[ \mu_{\bar{X}} = \mu = 800 \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
where σX̄ is the standard error of the mean and n = 16 is the sample size:
\[ \sigma_{\bar{X}} = \frac{40}{\sqrt{16}} = \frac{40}{4} = 10 \]
We need to find the probability that the sample mean X̄ is less than 775
hours. This corresponds to:
\[ P(\bar{X} < 775) = P\left(Z < \frac{775 - 800}{10}\right) = P(Z < -2.5) = 0.5 - 0.4938 = 0.0062 \]
Problem 4: A random sample of size 64 is taken from an infinite
population having mean 112 and variance 144. Using the Central Limit
Theorem, find the probability of getting the sample mean X̄ greater than
114.5.
Solution:
By the Central Limit Theorem, the distribution of the sample mean X̄
follows:
\[ \bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right) \]
Given µ = 112, σ² = 144, and n = 64, the standard error of the mean
is:
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{64}} = \frac{12}{8} = 1.5 \]
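Convert 114.5 to the corresponding Z-score:
\[ Z = \frac{114.5 - 112}{1.5} \approx 1.67 \]
so that P(X̄ > 114.5) = P(Z > 1.67) = 0.5 − 0.4525 = 0.0475.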
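The one-sided (tail) probabilities asked for in Problems 3 and 4 can be sketched as follows. The helper `sampling_distribution` is an illustrative name, and the code uses the unrounded Z-score, so the last digit may differ slightly from a table lookup at Z = 1.67.

```python
import math
from statistics import NormalDist

def sampling_distribution(mu, sigma, n):
    """Normal sampling distribution of the mean: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / math.sqrt(n))

# Problem 3: P(sample mean < 775) for mu = 800, sigma = 40, n = 16
bulbs = sampling_distribution(800, 40, 16)
p_less = bulbs.cdf(775)

# Problem 4: P(sample mean > 114.5) for mu = 112, variance = 144, n = 64
means = sampling_distribution(112, math.sqrt(144), 64)
p_greater = 1 - means.cdf(114.5)

print(f"Problem 3: P(mean < 775)   = {p_less:.4f}")
print(f"Problem 4: P(mean > 114.5) = {p_greater:.4f}")
```

A lower tail uses the cumulative probability directly, while an upper tail uses its complement.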
Problem 5:
The mean and standard deviation (SD) of the diameters of a sample
of 250 rivet heads manufactured by a company are given as:
Mean (µ) = 7.2642 mm, SD (σ) = 0.0058 mm.
Find the confidence limits for the mean diameter at the following
confidence levels: (i) 99% (ii) 98% (iii) 95% (iv) 90%
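A minimal sketch of how the four pairs of confidence limits in Problem 5 could be computed, assuming the large-sample formula X̄ ± Z · σ/√n with the sample SD used in place of σ. The function name `confidence_limits` is hypothetical, and the Z-values are taken from `NormalDist.inv_cdf` rather than a printed table.

```python
import math
from statistics import NormalDist

def confidence_limits(mean, sd, n, confidence):
    """Large-sample confidence limits: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Problem 5: n = 250 rivet heads, mean 7.2642 mm, SD 0.0058 mm
for level in (0.99, 0.98, 0.95, 0.90):
    lower, upper = confidence_limits(7.2642, 0.0058, 250, level)
    print(f"{level:.0%}: ({lower:.4f}, {upper:.4f}) mm")
```

A higher confidence level gives a wider interval because the Z-value grows while the standard error stays fixed.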
(1). Let the observed value of the mean X̄ of a random sample of size 20
from a normal distribution with mean µ and variance σ² = 80 be 81.2.
Find 90% and 95% confidence intervals for µ.
Topic - 2 : Small Samples - Student’s t-Test
Dr. P. Rajendra (Professor, Dept. of Maths)Topic - 2 : Small Samples - Student’s t-Test CMRIT, Bengaluru. 1 / 20
1. Sampling Distribution for Small Samples: For large samples, the
sampling distribution approaches a normal distribution, and the values of
the sample statistics are considered the best estimates of the population
parameters. For small samples, it is no longer possible to assume that
statistics computed from the sample are normally distributed. A different
technique is therefore used for small samples, which involves the concept
of ‘degrees of freedom’.
Degrees of Freedom: The degrees of freedom (d.f.) are particularly
important for small samples because small sample sizes tend to introduce
more variability and uncertainty in statistical estimates. The d.f. play a
key role in adjusting for this added uncertainty to make statistical tests
more reliable. The d.f. are defined as the number of independent values in a
set of observations. For example, if x1 + x2 + x3 = 15, knowing two
values determines the third. Thus, the degrees of freedom (d.f.) are 2.
When you compute the sample variance, one degree of freedom is used up
in calculating the sample mean, so only n − 1 deviations are free to vary.
Therefore, for a sample of size n, the degrees of freedom used in variance
calculations are d.f. = n − 1.
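The n − 1 rule can be seen on a tiny data set. The values below are hypothetical, chosen only so that x1 + x2 + x3 = 15 as in the example above; this is an illustration, not part of the original slides.

```python
import statistics

data = [3, 5, 7]                       # hypothetical values with sum 15
mean = statistics.fmean(data)
deviations = [x - mean for x in data]

# The deviations from the mean always sum to zero, so once n - 1 of them
# are known the last one is fixed: only n - 1 are free to vary.
print("sum of deviations:", sum(deviations))

n = len(data)
sample_variance = sum(d * d for d in deviations) / (n - 1)

# statistics.variance uses the same n - 1 divisor.
print("sample variance  :", sample_variance, statistics.variance(data))
```

Dividing by n instead (as `statistics.pvariance` does) would give a smaller value, which tends to underestimate the population variance for small samples.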
2. Student’s t-Test: The Student’s t-distribution is a probability
distribution used to estimate population parameters (like the mean) when
the sample size is small or the population standard deviation is unknown.
It plays a key role in hypothesis testing and confidence intervals for small
samples. For sample sizes n ≤ 30, the t-statistic is given by:
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} \]
where
\[ s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]
is the sample variance. The null hypothesis
H0 is accepted if the calculated |t| is less than the critical value at the
given level of significance. The t-distribution was developed by William
Sealy Gosset under the pseudonym Student.
Aspect    t-Distribution                    Normal Distribution
Shape     Bell-shaped, with heavier tails   Bell-shaped
Spread    Wider (more spread out)           Narrower
d.f.      Varies with n − 1                 Not dependent on d.f.
Usage     Small samples, unknown σ          Large samples, known σ
Table: Comparison of t-Distribution and Normal Distribution
Problem 1: A certain stimulus was administered to each of 12 patients,
and the following changes in blood pressure were recorded:
5, 2, 8, −1, 3, 0, 6, −2, 1, 5, 0, 4
Can it be concluded that the stimulus increases blood pressure?
Note: The critical value t0.05 for 11 degrees of freedom is 2.201.
Solution: The sample mean x̄ is
\[ \bar{x} = \frac{1}{n} \sum x = \frac{31}{12} = 2.5833 \]
The sample variance s² is
\[ s^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 \]
\[ s^2 = \frac{1}{11} \Big[ (5 - 2.58)^2 + (2 - 2.58)^2 + (8 - 2.58)^2 + (-1 - 2.58)^2 \]
\[ \quad + (3 - 2.58)^2 + (0 - 2.58)^2 + (6 - 2.58)^2 + (-2 - 2.58)^2 \]
\[ \quad + (1 - 2.58)^2 + (5 - 2.58)^2 + (0 - 2.58)^2 + (4 - 2.58)^2 \Big] \]
\[ s^2 = 9.538 \Rightarrow s = \sqrt{9.538} = 3.088 \]
Hypothesis Test:
H0 : The stimulus does not affect blood pressure, i.e., µ = 0.
H1 : The stimulus increases blood pressure, i.e., µ > 0.
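The t-statistic:
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{2.5833 - 0}{3.088 / \sqrt{12}} = \frac{2.5833}{0.8914} \approx 2.898 \approx 2.9 \]
Compare with the Critical Value: The critical value is t0.05,11 = 2.201
and the calculated value is t = 2.9.
Since t = 2.9 > 2.201, we reject the null hypothesis at the 5% level of
significance and conclude that the stimulus increases blood pressure.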
Figure: t-Distribution Curve with Critical Region
Problem 2: A random sample of 10 boys had the following I.Q scores:
70, 120, 110, 101, 88, 83, 95, 98, 107, 100.
Does this data support the assumption of a population mean I.Q. of 100
at a 5% level of significance? (Note: t0.05 = 2.262 for 9 d.f.)
Solution: I.Q. of 10 boys are
x : 70, 120, 110, 101, 88, 83, 95, 98, 107, 100
Sample Mean (x̄) is:
\[ \bar{x} = \frac{1}{n} \sum x = \frac{972}{10} = 97.2 \]
The Variance (s²) is:
\[ s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2 = \frac{1}{9} \times 1833.6 \approx 203.7333 \]
The Standard Deviation (s) is:
\[ s = \sqrt{s^2} \approx 14.2735 \]
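Given Population Mean (µ) = 100
The t-statistic:
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{97.2 - 100}{14.2735 / \sqrt{10}} \approx -0.6203 \]
Since |t| = 0.6203 < t0.05 = 2.262, the null hypothesis is accepted: the
data support a population mean I.Q. of 100 at the 5% level of significance.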
Figure: t-Distribution Curve with Critical Region
Problem 3: Ten individuals are chosen at random from a population, and
their heights in inches are found to be:
63, 63, 66, 67, 68, 69, 70, 70, 71, 71.
Test the hypothesis that the mean height of the universe is 66 inches at a
5% level of significance. (Note: t0.05 = 2.262 for 9 d.f.)
Solution: Given the Heights of the individuals (in inches):
x : 63, 63, 66, 67, 68, 69, 70, 70, 71, 71
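Sample Mean (x̄) is:
\[ \bar{x} = \frac{1}{n} \sum x = \frac{678}{10} = 67.8 \]
The Variance (s²) is:
\[ s^2 = \frac{1}{n-1} \sum (x - \bar{x})^2 \]
\[ s^2 = \frac{1}{9} \Big[ (63 - 67.8)^2 + (63 - 67.8)^2 + (66 - 67.8)^2 + (67 - 67.8)^2 + (68 - 67.8)^2 \]
\[ \quad + (69 - 67.8)^2 + (70 - 67.8)^2 + (70 - 67.8)^2 + (71 - 67.8)^2 + (71 - 67.8)^2 \Big] \]
\[ s^2 \approx 9.067 \Rightarrow s \approx 3.011 \]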
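Given Population Mean (µ) = 66
The t-statistic:
\[ t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{67.8 - 66}{3.011 / \sqrt{10}} \approx 1.8904 \approx 1.89 \]
Comparison with Critical Value: Since |t| = 1.89 < t0.05 = 2.262, the null
hypothesis that the mean height is 66 inches is accepted at the 5% level of
significance.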
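The one-sample t-tests in Problems 1 to 3 all follow the same steps, which can be sketched in a few lines. The function name `one_sample_t` and the dictionary layout are illustrative assumptions; the data and critical values are the ones quoted in the problem statements.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with s using the n - 1 divisor."""
    n = len(data)
    xbar = statistics.fmean(data)
    s = statistics.stdev(data)
    return (xbar - mu0) / (s / math.sqrt(n))

# (data, hypothesised mean, critical value quoted in the problem)
problems = {
    "Problem 1 (blood pressure)": ([5, 2, 8, -1, 3, 0, 6, -2, 1, 5, 0, 4], 0, 2.201),
    "Problem 2 (I.Q.)": ([70, 120, 110, 101, 88, 83, 95, 98, 107, 100], 100, 2.262),
    "Problem 3 (heights)": ([63, 63, 66, 67, 68, 69, 70, 70, 71, 71], 66, 2.262),
}

for name, (data, mu0, t_crit) in problems.items():
    t = one_sample_t(data, mu0)
    decision = "reject H0" if abs(t) > t_crit else "accept H0"
    print(f"{name}: t = {t:.4f}, critical = {t_crit}, {decision}")
```

Working from the unrounded mean and standard deviation, the computed statistics may differ from the hand calculations in the third or fourth decimal place.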
Figure: t-Distribution Curve with Critical Region
Problem 4: Two types of batteries are tested for their length of life, and
the following results are obtained:
Battery A: n1 = 10, x̄1 = 500 hrs, σ1² = 100
Battery B: n2 = 10, x̄2 = 560 hrs, σ2² = 121
Compute the Student’s t-statistic and test whether there is a significant
difference in the two means.
Solution: Given:
n1 = 10, x̄1 = 500 hrs, σ1² = 100
n2 = 10, x̄2 = 560 hrs, σ2² = 121
The pooled variance s² is given by:
\[ s^2 = \frac{n_1 \sigma_1^2 + n_2 \sigma_2^2}{n_1 + n_2 - 2} \]
\[ s^2 = \frac{10 \times 100 + 10 \times 121}{10 + 10 - 2} = \frac{1000 + 1210}{18} = 122.78 \]
Thus, the pooled standard deviation s is:
\[ s = \sqrt{122.78} \approx 11.0805 \]
The formula for the t-statistic is:
\[ t = \frac{\bar{x}_2 - \bar{x}_1}{s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \]
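A minimal sketch of evaluating the pooled t-statistic for the battery data from the summary figures above. The function name `pooled_t_from_summary` is hypothetical, and the critical value 2.101 (two-tailed, 5% level, 18 d.f.) is a standard table value that is not quoted in the slides, so treat it as an assumption.

```python
import math

def pooled_t_from_summary(n1, mean1, var1, n2, mean2, var2):
    """Pooled two-sample t from summary data.

    var1 and var2 are sample variances computed with divisor n, which is why
    they are multiplied by n1 and n2 in the pooled variance, as in the text.
    """
    pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2 - 2)
    s = math.sqrt(pooled_var)
    t = (mean2 - mean1) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

t, df = pooled_t_from_summary(10, 500, 100, 10, 560, 121)

T_CRIT_5PCT_18DF = 2.101  # assumed table value, not given in the slides
verdict = "significant" if abs(t) > T_CRIT_5PCT_18DF else "not significant"
print(f"t = {t:.3f} with {df} d.f.: difference is {verdict} at the 5% level")
```

With a 60-hour difference in means and a pooled standard deviation of about 11 hours, the statistic lies far beyond the critical value.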
Solution: Let the variables x and y represent the times for Horse A and
Horse B, respectively.
\[ \bar{x} = \frac{1}{n_1} \sum_{i=1}^{n_1} x_i = \frac{219}{7} \approx 31.30 \]
\[ \bar{y} = \frac{1}{n_2} \sum_{i=1}^{n_2} y_i = \frac{169}{6} \approx 28.20 \]
where n1 = 7 and n2 = 6.
\[ \sum (x_i - \bar{x})^2 = (28 - 31.3)^2 + (30 - 31.3)^2 + (32 - 31.3)^2 + (33 - 31.3)^2 + \cdots \]
\[ \sum (y_i - \bar{y})^2 = (29 - 28.20)^2 + (30 - 28.20)^2 + (30 - 28.20)^2 + \cdots \]
Given a significance level of 5% (α = 0.05), the critical value t0.05 from
the t-distribution is:
Since:
we reject the null hypothesis at 5% but not at 2%. This suggests that
there is a statistically significant difference in the performance of the two
horses at the 5% level.
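The conclusion for the two horses can be checked with a short sketch. The raw times are not listed in this part of the slides; the values below are an assumption, inferred from the sums 219 and 169 and from the squared deviations listed in the F-test solution for the same data. The critical values 2.201 (5%) and 2.718 (2%) for 11 d.f. are standard two-tailed table values.

```python
import math
import statistics

# Assumed raw data (inferred, not quoted in this section of the slides)
horse_a = [28, 30, 32, 33, 33, 29, 34]   # sum = 219, n1 = 7
horse_b = [29, 30, 30, 24, 27, 29]       # sum = 169, n2 = 6

def pooled_t(x, y):
    """Two-sample t with pooled variance from the sums of squared deviations."""
    n1, n2 = len(x), len(y)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    ss_x = sum((v - xbar) ** 2 for v in x)
    ss_y = sum((v - ybar) ** 2 for v in y)
    s = math.sqrt((ss_x + ss_y) / (n1 + n2 - 2))
    return (xbar - ybar) / (s * math.sqrt(1 / n1 + 1 / n2)), n1 + n2 - 2

t, df = pooled_t(horse_a, horse_b)
print(f"t = {t:.3f} with {df} d.f.")
for label, t_crit in (("5%", 2.201), ("2%", 2.718)):
    decision = "reject H0" if abs(t) > t_crit else "do not reject H0"
    print(f"{label} level: critical value {t_crit}, {decision}")
```

Under these assumed data the statistic falls between the two critical values, which matches the stated conclusion of rejecting at 5% but not at 2%.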
Assignment Questions
1 Ten individuals are chosen at random from the population, and their
heights (in inches) are:
Does the mean of these values differ significantly from the assumed
mean of 47.5? Hint: |t| = 1.84 < t0.05 = 2.31 for ν = 8.
Topic - 3 : Chi-Square Test and Goodness of Fit
Dr. P. Rajendra (Professor, Dept. of Maths)Topic - 3 : Chi-Square Test and Goodness of Fit CMRIT, Bengaluru. 1 / 16
Introduction to Chi-Square Test
where:
\[ \sum O_i = \sum E_i = N \quad \text{(total frequency)} \]
Goodness of Fit Test using χ²
The Chi-Square test helps in checking the goodness of fit for various
theoretical distributions, including Binomial, Poisson and Normal.
Decision Rule:
If the calculated value of χ² is less than the table value of χ² at a
specified level of significance, the hypothesis is accepted.
Otherwise, the hypothesis is rejected.
Conditions for Applying the Chi-Square Test: The following conditions
must be satisfied for the valid application of the Chi-Square test:
1 No theoretical (expected) frequency should be smaller than 5.
Problem 1: Four coins are tossed 100 times, and the following results
were observed:
No. of Heads   Frequency
0              5
1              29
2              36
3              25
4              5
Fit a binomial distribution to the data. Test the goodness of fit using the
χ²-test with χ²_{0.05,4} = 9.49.
Solution:
\[ P(X = x) = \binom{4}{x} (0.5)^x (0.5)^{4-x} \]
Using the formula:
\[ P(0) = \binom{4}{0} (0.5)^0 (0.5)^4 = 0.0625 \]
\[ P(1) = \binom{4}{1} (0.5)^1 (0.5)^3 = 0.25 \]
\[ P(2) = \binom{4}{2} (0.5)^2 (0.5)^2 = 0.375 \]
\[ P(3) = \binom{4}{3} (0.5)^3 (0.5)^1 = 0.25 \]
\[ P(4) = \binom{4}{4} (0.5)^4 (0.5)^0 = 0.0625 \]
Expected Frequencies:
\[ E_i = 100 \times P(X = i) \]
No. of Heads   Oi   Ei
0              5    6.25
1              29   25
2              36   37.5
3              25   25
4              5    6.25
Chi-Square Calculation:
\[ \chi^2 = \sum_{i=0}^{4} \frac{(O_i - E_i)^2}{E_i} \]
Test if the male and female births are equally probable at a 5% level of
significance.
∴ E(0) = 10, E(1) = 50, E(2) = 100, E(3) = 100, E(4) = 50, E(5) = 10
Observed and Expected Frequencies:
No. of Boys   Observed (Oi)   Expected (Ei)   Oi − Ei   (Oi − Ei)²/Ei
5             14              10              4         1.6
4             56              50              6         0.72
3             110             100             10        1
2             88              100             -12       1.44
1             40              50              -10       2
0             12              10              2         0.4
\[ \chi^2 = \sum_{i=1}^{6} \frac{(O_i - E_i)^2}{E_i} = 7.16 \]
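Both goodness-of-fit calculations above (four coins, and boys among five births) follow the same recipe: binomial expected frequencies, then the χ² sum. The sketch below is illustrative only; the function names are assumptions, the total of 320 for the births data is taken from the sum of the observed column, and the critical value 11.07 for 5 d.f. is a standard table value that is not quoted in the slides.

```python
from math import comb

def binomial_expected(total, n, p=0.5):
    """Expected frequencies total * C(n, k) * p^k * (1 - p)^(n - k), k = 0..n."""
    return [total * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

coins_observed = [5, 29, 36, 25, 5]            # 0..4 heads, 100 tosses
births_observed = [12, 40, 88, 110, 56, 14]    # 0..5 boys, 320 in total

coins_chi2 = chi_square(coins_observed, binomial_expected(100, 4))
births_chi2 = chi_square(births_observed, binomial_expected(320, 5))

# 9.49 is quoted in Problem 1; 11.07 (5 d.f., 5% level) is an assumed table value.
for name, value, critical in (("coins", coins_chi2, 9.49), ("births", births_chi2, 11.07)):
    decision = "accept" if value < critical else "reject"
    print(f"{name}: chi-square = {value:.2f}, critical = {critical}, {decision} the hypothesis")
```

In both cases the calculated χ² is below the table value, so the binomial model with p = 0.5 is not rejected under these assumptions.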
Dr. P. Rajendra (Professor, Dept. of Maths) Topic - 4: F-Test (Fisher’s Test) CMRIT, Bengaluru. 1 / 12
Introduction to F-Test (Fisher’s Test)
Let x̄ and ȳ be the sample means, and let the sample variances be
defined as:
\[ s_1^2 = \frac{1}{n_1 - 1} \sum_{i=1}^{n_1} (x_i - \bar{x})^2, \]
\[ s_2^2 = \frac{1}{n_2 - 1} \sum_{i=1}^{n_2} (y_i - \bar{y})^2. \]
\[ \nu_1 = n_1 - 1, \qquad \nu_2 = n_2 - 1. \]
Since S2² > S1², we place the larger variance in the numerator:
\[ F_0 = \frac{S_2^2}{S_1^2} = \frac{28.54}{13.33} = 2.14 \]
Problem 3: The I.Q.’s of 25 students from one college showed a variance
of 16, and those of an equal number from another college had a variance
of 8. Discuss whether there is any significant difference in variability of
intelligence.
Solution:
Let σ1² = 16 and σ2² = 8.
The F-statistic is calculated as:
\[ F = \frac{\sigma_1^2}{\sigma_2^2} = \frac{16}{8} = 2 \]
Solution:
H0 : The samples have been drawn from normal populations having
the same variance.
Means of Samples A and B:
\[ \bar{x} = \frac{219}{7} = 31.285, \qquad \bar{y} = \frac{169}{6} = 28.166 \]
\[ S_1^2 = \frac{1}{n_1 - 1} \sum (x_i - \bar{x})^2 = \frac{1}{6} \left[ (28 - 31.285)^2 + \cdots + (34 - 31.285)^2 \right] \]
\[ = \frac{1}{6} [10.791 + 1.651 + 0.511 + 2.941 + 2.941 + 5.221 + 7.371] = 5.238 \]
\[ S_2^2 = \frac{1}{n_2 - 1} \sum (y_i - \bar{y})^2 = \frac{1}{5} \left[ (29 - 28.166)^2 + \cdots + (29 - 28.166)^2 \right] \]
\[ = \frac{1}{5} [0.695 + 3.364 + 3.364 + 17.355 + 1.359 + 0.695] = 5.366 \]
\[ F = \frac{S_2^2}{S_1^2} = \frac{5.366}{5.238} = 1.025 \]
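Degrees of freedom: ν1 = n1 − 1 = 6 and ν2 = n2 − 1 = 5. Because S2² (5 d.f.)
is in the numerator and S1² (6 d.f.) is in the denominator, the tabulated
value is F0.05(5, 6) = 4.39.
Since the calculated F is less than the tabulated F, we accept H0.
Therefore, the samples have been drawn from normal populations with the
same variance.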
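The variance ratio above can be reproduced with a short sketch. The raw times are an assumption: only the first and last values of each sample appear explicitly in the solution, and the rest are inferred from the listed squared deviations and the sums 219 and 169. The variable names are illustrative.

```python
import statistics

# Assumed raw data, inferred from the squared deviations in the solution
sample_a = [28, 30, 32, 33, 33, 29, 34]   # n1 = 7
sample_b = [29, 30, 30, 24, 27, 29]       # n2 = 6

# statistics.variance uses the n - 1 divisor, matching S1^2 and S2^2 above.
var_a = statistics.variance(sample_a)
var_b = statistics.variance(sample_b)

# The larger variance goes in the numerator; its d.f. are listed first.
if var_b > var_a:
    f_stat, df_num, df_den = var_b / var_a, len(sample_b) - 1, len(sample_a) - 1
else:
    f_stat, df_num, df_den = var_a / var_b, len(sample_a) - 1, len(sample_b) - 1

print(f"S1^2 = {var_a:.3f}, S2^2 = {var_b:.3f}")
print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

The statistic is then compared with the tabulated F value for the numerator and denominator degrees of freedom printed by the sketch.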
Assignment Questions