PSCV Unit-Iii Digital Notes
$n = \left( \dfrac{z_{\alpha}\,\sigma}{E} \right)^2$ where $z_{\alpha}$ – Critical value of z at 𝛼 Level of significance
P − Population proportion
𝑄 − 1-P
𝐸 − Maximum Sampling error = p-P
Testing of Hypothesis :
A hypothesis is an assumption or supposition about a population. The decision-making procedure for whether to accept or reject that assumption is called hypothesis testing.
Def: Statistical Hypothesis : To arrive at a decision about the population on the basis of sample information, we make assumptions about the population parameters involved. Such an assumption is called a statistical hypothesis.
Null hypothesis: A definite statement about the population parameter. Usually a null
hypothesis is written as no difference , denoted by 𝐻0 .
Ex. 𝐻0 : 𝜇 = 𝜇0
Alternative hypothesis : A statement which contradicts the null hypothesis is called
alternative hypothesis. Usually an alternative hypothesis is written as some difference
, denoted by 𝐻1 .
Left one tailed test: If 𝐻1 has < sign , the critical region is taken in the left side of the
distribution.
Right one tailed test : If 𝐻1 has > sign , the critical region is taken on right side of the
distribution.
Errors of Sampling :
While drawing conclusions for population parameters on the basis of the sample results , we
have two types of errors.
Type I error : Reject 𝐻0 when it is true i.e, if the null hypothesis 𝐻0 is true but it is
rejected by test procedure .
Type II error : Accept 𝐻0 when it is false i.e, if the null hypothesis 𝐻0 is false but it is
accepted by test procedure.
                 𝐻0 is accepted      𝐻0 is rejected
𝐻0 is true       Correct Decision    Type I Error
𝐻0 is false      Type II Error       Correct Decision
Problems:
1.If the population is 3,6,9,15,27
a) List all possible samples of size 3 that can be taken without replacement
from finite population
b) Calculate the mean of each of the sampling distribution of means
c) Find the standard deviation of sampling distribution of means
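Sol: Mean of the population , $\mu = \dfrac{3+6+9+15+27}{5} = \dfrac{60}{5} = 12$
Standard deviation of the population , $\sigma = \sqrt{\dfrac{81+36+9+9+225}{5}} = \sqrt{\dfrac{360}{5}} = 8.4853$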
= 13.3
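The three parts of Problem 1 can be checked by brute-force enumeration. The sketch below is illustrative only: the variable names are assumptions, and it treats samples as unordered selections without replacement.

```python
from itertools import combinations
from math import sqrt

population = [3, 6, 9, 15, 27]
sample_size = 3

# a) every sample of size 3 drawn without replacement (order ignored)
samples = list(combinations(population, sample_size))

# b) mean of each sample, then the mean of the sampling distribution of means
sample_means = [sum(s) / sample_size for s in samples]
mean_of_means = sum(sample_means) / len(sample_means)

# c) standard deviation of the sampling distribution of means
variance_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
sd_of_means = sqrt(variance_of_means)

print(len(samples), mean_of_means, round(sd_of_means, 4))
```

The mean of the sample means agrees with the population mean 𝜇 = 12, and the standard deviation agrees with the finite-population formula $\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}$.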
2.A population consist of five numbers 2,3,6,8 and 11. Consider all possible samples of
size two which can be drawn with replacement from this population .Find
$\sigma^2 = \dfrac{16+9+0+4+25}{5} = 10.8 \quad \therefore \sigma = 3.29$
3.When a sample is taken from an infinite population , what happens to the standard
error of the mean if the sample size is decreased from 800 to 200
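Sol: The standard error of mean $= \dfrac{\sigma}{\sqrt{n}}$
Let $n = n_1 = 800$. Then $S.E_1 = \dfrac{\sigma}{\sqrt{800}} = \dfrac{\sigma}{20\sqrt{2}}$
Let $n = n_2 = 200$. Then $S.E_2 = \dfrac{\sigma}{\sqrt{200}} = \dfrac{\sigma}{10\sqrt{2}}$
Hence if the sample size is reduced from 800 to 200, the S.E. of the mean will be multiplied by 2.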
4.The variance of a population is 2 . The size of the sample collected from the
population is 169. What is the standard error of mean
Standard Error of mean $= \dfrac{\sigma}{\sqrt{n}} = \dfrac{\sqrt{2}}{\sqrt{169}} = \dfrac{1.41}{13} = 0.108$
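Problems 3 and 4 both rest on the formula S.E. = 𝜎/√n. A minimal sketch (the function name `standard_error` is an assumption for illustration):

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for a population S.D. sigma and sample size n."""
    return sigma / sqrt(n)

# Problem 4: variance 2, so sigma = sqrt(2), with n = 169
se_problem_4 = standard_error(sqrt(2), 169)

# Problem 3: the ratio of standard errors does not depend on sigma
ratio = standard_error(1.0, 200) / standard_error(1.0, 800)

print(round(se_problem_4, 4), round(ratio, 4))
```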
5.The mean height of students in a college is 155cms and standard deviation is 15 . What
is the probability that the mean height of 36 students is less than 157 cms.
Thus the probability that the mean height of 36 students is less than 157 = 0.7881
6.A random sample of size 100 is taken from a population with 𝝈 = 5.1 . Given that the
sample mean is $\bar{x}$ = 21.6 , construct 95% confidence limits for the population mean .
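The notes do not show the working for this problem. A sketch under the usual large-sample assumption, using the limits $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ with $z_{\alpha/2} = 1.96$ (the function name is hypothetical):

```python
from math import sqrt

def confidence_limits(xbar, sigma, n, z=1.96):
    """Large-sample confidence limits xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

lower, upper = confidence_limits(21.6, 5.1, 100)
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the limits come out at roughly 20.60 and 22.60.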
7.It is desired to estimate the mean time of continuous use until an answering machine
will first require service . If it can be assumed that 𝝈 = 60 days, how large a sample is
Sol: We have maximum error (E) = 10 days , 𝜎 = 60 days and $z_{\alpha/2} = 1.645$
$\therefore n = \left[ \dfrac{z_{\alpha/2}\,\sigma}{E} \right]^2 = \left[ \dfrac{1.645 \times 60}{10} \right]^2 = 97$
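The sample-size formula can be evaluated directly. This is a sketch with an assumed function name. The notes report 97; some texts round up to the next integer so that the error bound is guaranteed.

```python
from math import ceil

def sample_size_for_mean(z, sigma, max_error):
    """Sample size from n = (z * sigma / E)^2, before rounding."""
    return (z * sigma / max_error) ** 2

n_exact = sample_size_for_mean(1.645, 60, 10)
n_rounded_up = ceil(n_exact)  # conservative choice, an assumption not made in the notes
print(round(n_exact, 2), n_rounded_up)
```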
8.A random sample of size 64 is taken from a normal population with 𝝁 = 𝟓𝟏. 𝟒 and 𝝈 =
6.8.What is the probability that the mean of the sample will a) exceed 52.9 b) fall
between 50.5 and 52.3 c) be less than 50.6
$\therefore P(\bar{x} > 52.9) = P(z > 1.76)$
$z_2 = \dfrac{\bar{x}_2 - \mu}{\sigma/\sqrt{n}} = \dfrac{52.3 - 51.4}{0.85} = 1.06$
P(50.5 < 𝑥̅ < 52.3) = P(-1.06 < z < 1.06)
= P(-1.06 < z < 0) + P(0 < z < 1.06)
= P(0 < z < 1.06) + P(0 < z < 1.06)
= 2( 0.3554) = 0.7108
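The three probabilities in Problem 8 can be cross-checked without tables. The sketch below uses `statistics.NormalDist` from the Python standard library and works with unrounded z values, so the results may differ from the table-based answers in the third decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 51.4, 6.8, 64

# Sampling distribution of the sample mean: N(mu, sigma / sqrt(n))
sampling_dist = NormalDist(mu, sigma / sqrt(n))

p_exceeds = 1 - sampling_dist.cdf(52.9)                       # a) mean exceeds 52.9
p_between = sampling_dist.cdf(52.3) - sampling_dist.cdf(50.5)  # b) between 50.5 and 52.3
p_below = sampling_dist.cdf(50.6)                             # c) less than 50.6

print(round(p_exceeds, 4), round(p_between, 4), round(p_below, 4))
```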
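We know $z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - \dfrac{\sigma}{8}}{\dfrac{\sigma}{6}} = \dfrac{6\bar{x}}{\sigma} - \dfrac{3}{4}$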
If Z < 0.75, ̅𝑥 is negative
P(z < 0.75) = P( − ∞ < 𝑧 < 0.75 )
$= \int_{-\infty}^{0} \phi(z)\,dz + \int_{0}^{0.75} \phi(z)\,dz = 0.50 + 0.2734 = 0.7734$
10.The guaranteed average life of a certain type of electric bulbs is 1500 hrs with a S.D
of 120 hrs. It is decided to sample the output so as to ensure that 95% of bulbs do not fall
short of the guaranteed average by more than 2% . What will be the minimum sample
size ?
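$\therefore |z| = \left| \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} \right| = \left| \dfrac{1470 - 1500}{120/\sqrt{n}} \right| = \dfrac{\sqrt{n}}{4}$
From the given condition , the area under the normal probability curve to the left of $\dfrac{\sqrt{n}}{4}$ should be 0.95
$\therefore$ The area between 0 and $\dfrac{\sqrt{n}}{4}$ is 0.45
We do not need to consider the bulbs which have life above the guaranteed life .
$\therefore \dfrac{\sqrt{n}}{4} = 1.65$ i.e., $\sqrt{n} = 6.6$
$\therefore n = 44$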
12.In a study of automobile insurance a random sample of 80 body repair costs had a
mean of Rs 472.36 and a S.D of Rs 62.35. If $\bar{x}$ is used as a point estimator of the true
average repair cost , with what confidence can we assert that the maximum error
doesn’t exceed Rs 10 ?
13.If we can assert with 95% that the maximum error is 0.05 and P = 0.2 find the size of
the sample.
$\Rightarrow 0.05 = 1.96 \sqrt{\dfrac{0.2 \times 0.8}{n}}$
14.The mean and standard deviation of a population are 11,795 and 14,054 respectively .
What can one assert with 95 % confidence about the maximum error if $\bar{x}$ = 11,795 and
n = 50 ? Also construct a 95% confidence interval for the true mean .
$\therefore$ Confidence interval $= \left( \bar{x} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \, , \; \bar{x} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \right)$
= (11795-3899, 11795+3899)
= (7896, 15694)
15.Find 95% confidence limits for the mean of a normally distributed population from
which the following sample was taken 15, 17 , 10 ,18 ,16 ,9, 7, 11, 13 ,14.
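Sol: We have $\bar{x} = \dfrac{15+17+10+18+16+9+7+11+13+14}{10} = 13$
$S^2 = \sum \dfrac{(x_i - \bar{x})^2}{n-1}$
$= \dfrac{1}{9}\left[ (15-13)^2 + (17-13)^2 + (10-13)^2 + (18-13)^2 + (16-13)^2 + (9-13)^2 + (7-13)^2 + (11-13)^2 + (13-13)^2 + (14-13)^2 \right]$
$= \dfrac{120}{9} = \dfrac{40}{3}$
$z_{\alpha/2}\dfrac{s}{\sqrt{n}} = 1.96 \cdot \dfrac{\sqrt{40}}{\sqrt{10}\,\sqrt{3}} = 2.26$
$\therefore$ Confidence limits are $\bar{x} \pm z_{\alpha/2}\dfrac{s}{\sqrt{n}} = 13 \pm 2.26 = (10.74 , 15.26)$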
16.A random sample of 100 teachers in a large metropolitan area revealed a mean weekly
salary of Rs. 487 with a standard deviation of Rs. 48. With what degree of confidence can
we assert that the average weekly salary of all teachers in the metropolitan area is between 472
and 502 ?
$P(472 < \bar{x} < 502) = P(-3.125 < z < 3.125)$
= 2 ( 0.4991) = 0.9982
3.A random sample of size 100 is taken from a population with 𝜎 = 5.1 . Given that the
sample mean is 𝑥̅ = 21.6 Construct a 95% confidence limits for the population mean .
4.A normal population has a mean of 0.1 and standard deviation of 2.1 . Find the probability
that mean of a sample of size 900 will be negative .
5.A random sample of size 64 is taken from a normal population with 𝜇 = 51.4
and 𝜎 = 6.8.What is the probability that the mean of the sample will
a) exceed 52.9
b) fall between 50.5 and 52.3 c) be less than 50.6.
1. A manufacturer claimed that at least 95% of the equipment which he supplied to factory
conformed to specifications . An examination of a sample of 200 pieces of equipment
revealed that 180 were faulty .Test his claim at 5% an 1% LOS.
2. Write about i) critical region ii) one tailed and two tailed test
3. Define sample. Explain the different methods that are involved in selecting the sample.
4. Explain about i) Type I error ii) Type II error
5.a)Explain the five step procedure for testing of hypothesis
b) Explain about i) point estimation ii) interval estimation
OUTCOME
Draw statistical inference using samples of a given size which is
experimental data.
Large Samples: A random sample of size n > 30 is called a large sample.
i.e 𝐻0 : µ = 𝜇0
NOTE: Confidence limits for the mean of the population corresponding to the given sample.
$\mu = \bar{X} \pm z_{\alpha/2}\,(\text{S.E of } \bar{X})$ i.e,
$\mu = \bar{X} \pm z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$ (or) $\mu = \bar{X} \pm z_{\alpha/2}\dfrac{s}{\sqrt{n}}$
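Let $\bar{x}_1$ and $\bar{x}_2$ be the means of two random samples of sizes $n_1$ and $n_2$ drawn from two
populations having means $\mu_1$ and $\mu_2$ and SD’s $\sigma_1$ and $\sigma_2$
i) Null hypothesis: $H_0 : \mu_1 = \mu_2$
iv) Test Statistic : $Z_{cal} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \delta}{\text{S.E of } (\bar{x}_1 - \bar{x}_2)} = \dfrac{(\bar{x}_1 - \bar{x}_2) - \delta}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
Confidence limits: $\mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$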
i) Null hypothesis : 𝐻0 : 𝑃 = 𝑃0
ii) Alternative hypothesis : a) H1 : P≠ 𝑃0 (Two Tailed test )
b) H1 ∶ 𝑃 < 𝑃0 (Left one- tailed)
c) H1 ∶ P > 𝑃0 (Right one tailed)
iii) Test statistic : $Z_{cal} = \dfrac{p - P}{\sqrt{\dfrac{PQ}{n}}}$ where P is the population proportion and $Q = 1 - P$
Let p1 and p2 be the sample proportions in two large random samples of sizes n1 and n2
drawn from two populations having proportions P1 and P2
i) Null hypothesis : $H_0 : P_1 = P_2$
$Z_{cal} = \dfrac{p_1 - p_2}{\sqrt{\dfrac{P_1 Q_1}{n_1} + \dfrac{P_2 Q_2}{n_2}}}$ where $p_1 = \dfrac{x_1}{n_1}$ and $p_2 = \dfrac{x_2}{n_2}$
OR
CRITICAL VALUES OF Z
LOS α        1%            5%            10%
µ ≠ µ0       |Z| > 2.58    |Z| > 1.96    |Z| > 1.645
µ > µ0       Z > 2.33      Z > 1.645     Z > 1.28
µ < µ0       Z < -2.33     Z < -1.645    Z < -1.28
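The tabulated critical values can be reproduced from the inverse of the standard normal distribution function. A sketch using `statistics.NormalDist` from the standard library (the dictionary layout is an assumption for illustration):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def critical_values(alpha):
    """Critical z values for two-tailed, right-tailed and left-tailed tests."""
    return {
        "two_tailed": std_normal.inv_cdf(1 - alpha / 2),
        "right_tailed": std_normal.inv_cdf(1 - alpha),
        "left_tailed": std_normal.inv_cdf(alpha),
    }

table = {alpha: critical_values(alpha) for alpha in (0.01, 0.05, 0.10)}
for alpha, row in table.items():
    print(alpha, {name: round(value, 3) for name, value in row.items()})
```

The computed values agree with the table to the two or three decimals quoted there.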
$P_1 - P_2 = (p_1 - p_2) \pm z_{\alpha/2}\,(\text{S.E of } p_1 - p_2)$
Problems:
1. A sample of 64 students has a mean weight of 70 kgs . Can this be regarded as
a sample from a population with mean weight 56 kgs and standard
deviation 25 kgs ?
Sol : Given $\bar{x}$ = mean of the sample = 70 kgs
𝜇 = Mean of the population = 56 kgs
𝜎 = S.D of population = 25 kgs
Calculate the arithmetic mean and standard deviation of this distribution and
use these values to test his claim at 5% los.
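$d_i = \dfrac{x_i - A}{h}$
$\bar{x} = A + h\,\dfrac{\sum f_i d_i}{N} = 28 + \dfrac{5 \times 16}{100} = 28.8$
S.D : $S = h\sqrt{\dfrac{\sum f d^2}{N} - \left( \dfrac{\sum f d}{N} \right)^2} = 5\sqrt{\dfrac{164}{100} - \left( \dfrac{16}{100} \right)^2} = 6.35$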
5. An ambulance service claims that it takes on the average less than 10 minutes to
reach its destination in emergency calls . A sample of 36 calls has a mean of 11
minutes and the variance of 16 minutes .Test the claim at 0.05 los?
Sol : Given n = 36 , $\bar{x}$ = 11 , 𝜇 = 10 and 𝜎 = √16 = 4
i) Null Hypothesis 𝐻0 : 𝜇 = 10
ii) Alternative Hypothesis 𝐻1 : 𝜇 < 10 ( Left one –tailed test )
iii) Level of significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
iv) Test Statistic : $Z_{cal} = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{11 - 10}{4/\sqrt{36}} = \dfrac{6}{4} = 1.5$
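The notes stop at the test statistic here. A sketch of the remaining decision step, assuming the left-tailed alternative and the critical value 1.645 stated above (the function name is hypothetical):

```python
from math import sqrt

def z_statistic(xbar, mu0, sigma, n):
    """One-sample large-sample test statistic (xbar - mu0) / (sigma / sqrt(n))."""
    return (xbar - mu0) / (sigma / sqrt(n))

z_cal = z_statistic(11, 10, 4, 36)
z_alpha = 1.645

# Left-tailed test: H0 is rejected only when z_cal falls below -z_alpha
reject_h0 = z_cal < -z_alpha
print(round(z_cal, 2), reject_h0)
```

Since the calculated value is positive, it cannot fall in the left-tail critical region, so under these assumptions 𝐻0 is not rejected and the sample does not support the claim of an average below 10 minutes.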
6. The means of two large samples of sizes 1000 and 2000 members are 67.5 inches
and 68 inches respectively . Can the samples be regarded as drawn from the
same population of S.D 2.5 inches.
Sol: Let 𝜇1 and 𝜇2 be the means of the two populations
7. Samples of students were drawn from two universities and from their weights in
kilograms , mean and standard deviations are calculated and shown below.
Make a large sample test to test the significance of the difference between the
means.
Mean S .D Size of the sample
University A 55 10 400
University B 57 15 100
8. The average marks scored by 32 boys is 72 with a S.D of 8 . While that for 36
girls is 70 with a S.D of 6. Does this data indicate that the boys perform better
than girls at 5% los ?
Sol: Let 𝜇1 and 𝜇2 be the means of the two populations
Given $n_1 = 32$ , $n_2 = 36$ and $\bar{x}_1 = 72$ , $\bar{x}_2 = 70$
$\sigma_1 = 8$ and $\sigma_2 = 6$
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 > 𝜇2 ( Right One Tailed test)
iii) Level of significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
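iv) Test Statistic : $Z_{cal} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{72 - 70}{\sqrt{\dfrac{8^2}{32} + \dfrac{6^2}{36}}} = \dfrac{2}{\sqrt{2 + 1}} = 1.1547$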
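The two-sample statistic for Problem 8 can be recomputed as follows. This is a sketch: the function name and the `delta` default are assumptions, and the decision uses the right-tailed critical value 1.645 stated in step iii.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, sd1, sd2, n1, n2, delta=0.0):
    """Large-sample statistic ((xbar1 - xbar2) - delta) / sqrt(sd1^2/n1 + sd2^2/n2)."""
    return ((xbar1 - xbar2) - delta) / sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

z_cal = two_sample_z(72, 70, 8, 6, 32, 36)
z_alpha = 1.645

# Right-tailed test: reject H0 only when z_cal exceeds z_alpha
reject_h0 = z_cal > z_alpha
print(round(z_cal, 4), reject_h0)
```

Here the calculated value is below 1.645, so 𝐻0 is not rejected: the data do not indicate that boys perform better than girls at the 5% level.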
$= \dfrac{67.85 - 68.55}{\sqrt{\dfrac{6.5536}{6400} + \dfrac{6.35}{1600}}} = \dfrac{-0.7}{\sqrt{0.001 + 0.004}} = \dfrac{-0.7}{0.0707} = -9.9$
v) Conclusion: Since |Z cal | value > Zα value , we reject 𝐻0
Hence , we conclude that Australians are taller than Englishmen.
10. At a certain large university a sociologist speculates that male students spend
considerably more money on junk food than female students. To test her
hypothesis the sociologist randomly selects from records the names of 200
students . Of these , 125 are men and 75 are women . The mean of the average
amount spent on junk food per week by the men is Rs. 400 and S.D is 100. For
the women the sample mean is Rs. 450 and S.D is 150. Test the hypothesis at 5 %
los ?
Sol: Let 𝜇1 and 𝜇2 be the means of the two populations
Given 𝑛1 = 125 , 𝑛2 = 75 and ̅𝑥1 = Mean of men = 400 , ̅𝑥2 = Mean of women = 450
𝜎1 = 100 and 𝜎2 = 150
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 > 𝜇2 ( Right One Tailed test)
iii) Level of significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
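iv) Test Statistic : $Z_{cal} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{400 - 450}{\sqrt{\dfrac{100^2}{125} + \dfrac{150^2}{75}}} = \dfrac{-50}{\sqrt{80 + 300}} = \dfrac{-50}{\sqrt{380}} = \dfrac{-50}{19.49} = -2.5654$
v) Conclusion: Since Zcal value < Zα value , we accept 𝐻0
Hence , we conclude that the data do not show that male students spend more on junk food than female students.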
$= \dfrac{100}{\sqrt{400 + 540}} = \dfrac{100}{30.66} = 3.26$
v) Conclusion: Since Zcal value > Zα value , we reject 𝐻0
Hence , we conclude that there is a significant difference in the salaries of MBA
grades two cities.
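∴ 95% confidence interval is $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Taking the magnitude of the difference between the sample means , |20,150 – 20,250| = 100 ,
$= 100 \pm 1.96\sqrt{\dfrac{40000}{100} + \dfrac{32400}{60}} = 100 \pm 60.09$
= (39.91, 160.09)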
12. A die was thrown 9000 times and of these 3220 yielded a 3 or 4. Is this consistent
with the hypothesis that the die was unbiased?
Sol : Given n = 9000
P = Population of proportion of successes
$= P(\text{getting a 3 or 4}) = \dfrac{1}{6} + \dfrac{1}{6} = \dfrac{2}{6} = \dfrac{1}{3} = 0.3333$
Q = 1- P = 0.6667
p = Proportion of successes of getting 3 or 4 in 9000 throws $= \dfrac{3220}{9000} = 0.3578$
i) Null Hypothesis 𝐻0 : The die is unbiased
i.e., 𝐻0 : P = 0.3333
14. A manufacturer claimed that at least 95% of the equipment which he supplied to
a factory conformed to specifications . An experiment of a sample of 200 piece of
equipment revealed that 18 were faulty .Test the claim at 5% los ?
Sol : Given n = 200
Number of pieces conforming to specifications = 200-18 = 182
∴ p = Proportion of pieces conforming to specification $= \dfrac{182}{200} = 0.91$
P = Population proportion $= \dfrac{95}{100} = 0.95$
v) Conclusion: We reject 𝐻0
Hence , we conclude that the manufacturer’s claim is rejected.
15. Among 900 people in a state 90 are found to be chapatti eaters . Construct 99%
confidence interval for the true proportion and also test the hypothesis for single
proportion ?
Sol: Given x = 90 , n = 900
$\therefore p = \dfrac{x}{n} = \dfrac{90}{900} = \dfrac{1}{10} = 0.1$
And q = 1- p = 0.9
Now $\sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(0.1)(0.9)}{900}} = 0.01$
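With the standard error in hand, the 99% confidence interval for the true proportion is $p \pm z_{\alpha/2}\sqrt{pq/n}$. A sketch using the tabulated value 2.58 (the function name is hypothetical):

```python
from math import sqrt

def proportion_confidence_interval(x, n, z):
    """Large-sample interval p +/- z * sqrt(p * q / n) with p = x / n."""
    p = x / n
    standard_error = sqrt(p * (1 - p) / n)
    return p - z * standard_error, p + z * standard_error

lower, upper = proportion_confidence_interval(90, 900, 2.58)
print(round(lower, 4), round(upper, 4))
```

Under these assumptions the interval is approximately (0.0742, 0.1258).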
i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
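iv) Test Statistic : $Z_{cal} = \dfrac{p_1 - p_2}{\sqrt{pq\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
We have $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{200 + 40}{400 + 200} = \dfrac{240}{600} = \dfrac{2}{5}$
$q = 1 - p = \dfrac{3}{5}$
$Z_{cal} = \dfrac{0.5 - 0.2}{\sqrt{(0.4)(0.6)\left( \dfrac{1}{400} + \dfrac{1}{200} \right)}} = 7.07$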
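The pooled-proportion statistic used here can be written as a small helper. This is an illustrative sketch (the function name is an assumption), applied to the counts $x_1 = 200$, $n_1 = 400$, $x_2 = 40$, $n_2 = 200$ from the working above.

```python
from math import sqrt

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z = (p1 - p2) / sqrt(p * q * (1/n1 + 1/n2)) with pooled p = (x1 + x2) / (n1 + n2)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    q = 1 - p
    return (p1 - p2) / sqrt(p * q * (1 / n1 + 1 / n2))

z_cal = pooled_two_proportion_z(200, 400, 40, 200)
print(round(z_cal, 2))
```

The same helper applies to the later two-proportion problems by substituting their counts.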
17. A machine puts out 16 imperfect articles in a sample of 500 articles . After the
machine is overhauled it puts out 3 imperfect articles in a sample of 100 articles .
Has the machine improved ?
Sol : Let 𝑃1 and 𝑃2 be the proportions of imperfect articles in the population of
articles manufactured by the machine before and after overhauling , respectively.
Given 𝑛1 = Sample size before the machine overhauling = 500
𝑛2 = Sample size after the machine overhauling = 100
𝑥1 = Number of imperfect articles before overhauling = 16
𝑥2 = Number of imperfect articles after overhauling = 3
$\therefore p_1 = \dfrac{x_1}{n_1} = \dfrac{16}{500} = 0.032$ and $p_2 = \dfrac{x_2}{n_2} = \dfrac{3}{100} = 0.03$
i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 > 𝑃2 ( Right one Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.645 )
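iv) Test Statistic : $Z_{cal} = \dfrac{p_1 - p_2}{\sqrt{pq\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
We have $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{16 + 3}{500 + 100} = \dfrac{19}{600} = 0.032$
q = 1- p = 0.968
$Z_{cal} = \dfrac{0.032 - 0.03}{\sqrt{(0.032)(0.968)\left( \dfrac{1}{500} + \dfrac{1}{100} \right)}}$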
Sol: Let 𝑃1 and 𝑃2 be the proportions of defective units in the population of units inspected
in machine 1 and Machine 2 respectively.
i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
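iv) Test Statistic : $Z_{cal} = \dfrac{p_1 - p_2}{\sqrt{pq\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
We have $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \dfrac{x_1 + x_2}{n_1 + n_2} = \dfrac{17 + 22}{375 + 450} = \dfrac{39}{825} = 0.047$
q = 1- p = 1- 0.047 = 0.953
$Z_{cal} = \dfrac{0.045 - 0.049}{\sqrt{(0.047)(0.953)\left( \dfrac{1}{375} + \dfrac{1}{450} \right)}} = -0.267$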
v) Conclusion: Since |Zcal |value < Zα value , we accept 𝐻0
Hence we conclude that there is no significant difference in performance of
machines.
19. A cigarette manufacturing firm claims that its brand A line of cigarettes outsells
its brand B by 8% . It is found that 42 out of 200 smokers prefer brand A and 18
out of another sample of 100 smokers prefer brand B . Test whether the 8%
difference is a valid claim.
Sol: Given 𝑛1 = 200
𝑛2 = 100
𝑥1 = Number of smokers preferring brand A= 42
𝑥2 = Number of smokers preferring brand B = 18
$\therefore p_1 = \dfrac{x_1}{n_1} = \dfrac{42}{200} = 0.21$ and $p_2 = \dfrac{x_2}{n_2} = \dfrac{18}{100} = 0.18$
20. In a city A , 20% of a random sample of 900 schoolboys has a certain slight
physical defect . In another city B ,18.5% of a random sample of 1600 school
boys has the same defect . Is the difference between the proportions significant at
5% los?
Sol: Given 𝑛1 = 900
𝑛2 = 1600
𝑥1 = 20% of 900 = 180
𝑥2 = 18.5% of 1600 = 296
$\therefore p_1 = \dfrac{x_1}{n_1} = \dfrac{180}{900} = 0.2$ and $p_2 = \dfrac{x_2}{n_2} = \dfrac{296}{1600} = 0.185$
i) Null Hypothesis 𝐻0 : 𝑃1 = 𝑃2
ii) Alternative Hypothesis 𝐻1 : 𝑃1 ≠ 𝑃2 ( Two Tailed test)
iii) Level of Significance : 𝛼 = 0.05 (𝑍𝛼 = 1.96 )
iv) Test Statistic : $Z_{cal} = \dfrac{p_1 - p_2}{\sqrt{pq\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
Introduction: When the sample size n < 30, the sample is referred to as a small sample. In this
case the sampling distribution may in many cases not be normal, i.e., we are not justified in
estimating the population parameters as equal to the corresponding sample values.
Degrees of Freedom: The number of independent variates which make up the statistic is
known as the degrees of freedom (d.f) and is denoted by 𝜗.
For Example: If 𝑥1 + 𝑥2 + 𝑥3 = 50 and we assign any values to two of the variables (say
x1, x2), then the value of x3 is determined. Thus two of the variables can be chosen freely and
independently, and they fix the third.
$t = \dfrac{\bar{x} - \mu}{S/\sqrt{n}}$ is a random variable having the 𝑡 − distribution with 𝜗 = 𝑛 − 1 degrees of freedom.
Properties of 𝒕 − Distribution
1. The shape of 𝑡 −distribution is bell shaped, which is similar to that of normal
distribution and is symmetrical about the mean.
2. The mean of the standard normal distribution as well as the 𝑡 −distribution is zero, but
the variance of the 𝑡 −distribution depends upon the parameter 𝜗, which is called the
degrees of freedom.
3. The variance of the 𝑡 −distribution exceeds 1, but approaches 1 as 𝑛 → ∞.
If 𝜎 is unknown, then $t = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$ where
$S^2 = \sum \dfrac{(X_i - \bar{X})^2}{n - 1}$
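$t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$ -------(1) where
$\bar{x}_1 = \dfrac{\sum x_1}{n_1}$ , $\bar{x}_2 = \dfrac{\sum x_2}{n_2}$ and
$S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[ \sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2 \right]$
OR $S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[ n_1 s_1^2 + n_2 s_2^2 \right]$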
For Example: To test the effectiveness of a drug, some persons’ blood pressure is measured
before and after the intake of the drug. Here the individual person is the experimental unit
and the two populations are blood pressure “before” and “after” the drug is given.
The paired t-test is applied to n paired observations by taking the differences d1, d2, ..., dn of the
paired data. To test whether the differences di can be regarded as a random sample from a population with
mean 𝜇 = 0, we use
$t = \dfrac{\bar{d}}{s/\sqrt{n}}$ where $\bar{d} = \dfrac{1}{n}\sum d_i$ and $s^2 = \dfrac{1}{n-1}\sum (d_i - \bar{d})^2$
Problems:
1. A sample of 26 bulbs gives a mean life of 990 hours with a S.D of 20 hours. The
manufacturer claims that the mean life of bulbs is 1000 hours . Is the sample not
upto the standard?
Sol: Given n = 26
$\bar{x}$ = 990
𝜇 = 1000 and S.D i.e., s = 20
i) Null Hypothesis : 𝐻0 : 𝜇 = 1000
ii) Alternative Hypothesis: 𝐻1 : 𝜇 < 1000( Left one tailed test )
(Since it is given below standard)
iii) Level of significance : 𝛼 = 0.05
t tabulated value with 25 degrees of freedom for left tailed test is 1.708
iv) Test Statistic : $t_{cal} = \dfrac{\bar{x} - \mu}{s/\sqrt{n-1}} = \dfrac{990 - 1000}{20/\sqrt{25}} = -2.5$
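The statistic and the decision for this problem can be sketched as below. The function name is hypothetical, and the divisor √(n − 1) follows the convention used in the working above, where s is the sample S.D. as given.

```python
from math import sqrt

def t_statistic_from_sample_sd(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n - 1)), as used in the worked solution above."""
    return (xbar - mu0) / (s / sqrt(n - 1))

t_cal = t_statistic_from_sample_sd(990, 1000, 20, 26)
t_tab = 1.708  # tabulated value for 25 d.f. quoted in step iii

# Left-tailed test: reject H0 when t_cal is below -t_tab
reject_h0 = t_cal < -t_tab
print(t_cal, reject_h0)
```

Because the calculated value lies below −1.708, 𝐻0 is rejected under these assumptions, which suggests the sample is not up to the standard.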
𝜇 = 56 and $\sum (x_i - \bar{x})^2 = 150$
$\therefore S^2 = \dfrac{\sum (x_i - \bar{x})^2}{n - 1} = \dfrac{150}{15} = 10 \Rightarrow S = \sqrt{10}$
i) Null Hypothesis 𝐻0 : 𝜇 = 56
ii) Alternative Hypothesis 𝐻1 : 𝜇 ≠ 56 (Two tailed test )
iii) Level of significance : 𝛼 = 0.05
t tabulated value with 15 degrees of freedom for two tailed test is 2.13
iv) Test Statistic : $t_{cal} = \dfrac{\bar{x} - \mu}{S/\sqrt{n}} = \dfrac{53 - 56}{\sqrt{10}/\sqrt{16}} = -3.79$
3. A random sample of 10 boys had the following I.Q’s : 70, 120 ,110, 101,88,
83,95,98,107 and 100.
a) Do these data support the assumption of a population mean I.Q of 100?
b) Find a reasonable range in which most of the mean I.Q values of samples of
10 boys lie
Sol: Since the mean and S.D are not given, we have to determine them.
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{972}{10} = 97.2$
x        x − x̄      (x − x̄)²
70       -27.2      739.84
120      22.8       519.84
110      12.8       163.84
101      3.8        14.44
88       -9.2       84.64
83       -14.2      201.64
95       -2.2       4.84
98       0.8        0.64
107      9.8        96.04
100      2.8        7.84
∑x = 972            ∑(x − x̄)² = 1833.60
$S^2 = \dfrac{1}{n-1}\sum (x - \bar{x})^2 = \dfrac{1833.6}{9} = 203.73$
∴ S = √203.73 = 14.27
4. Samples of two types of electric bulbs were tested for length of life and following
data were obtained
                   Type 1          Type 2
Sample number      n1 = 8          n2 = 7
Sample mean        x̄1 = 1234       x̄2 = 1036
Sample S.D         s1 = 36         s2 = 40
Is the difference in the mean sufficient to warrant that type 1 is superior to type
2 regarding length of life .
Sol: i) Null Hypothesis 𝐻0 : The two types of electric bulbs are identical
i.e., 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2
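iii)Test Statistic : $t_{cal} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{S^2\left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$
Where $S^2 = \dfrac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \dfrac{1}{8 + 7 - 2}\left( 8(36)^2 + 7(40)^2 \right) = 1659.08$
$\therefore t = \dfrac{1234 - 1036}{\sqrt{1659.08\left( \dfrac{1}{8} + \dfrac{1}{7} \right)}} = 9.39$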
iv)Degrees of freedom = 8+7-2 =13 ,tabulated value of t for 13 d.f at 5% los is 2.16
v)Conclusion: Since |𝑡 𝑐𝑎𝑙 | 𝑣𝑎𝑙𝑢𝑒 > 𝑡𝛼 value , we reject 𝐻0
Hence we conclude that the two types 1 and 2 of electric bulbs are not identical .
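The pooled-variance computation above can be reproduced with a short helper. This is a sketch (the function name is an assumption) using the pooled estimate $S^2 = (n_1 s_1^2 + n_2 s_2^2)/(n_1 + n_2 - 2)$ from the notes.

```python
from math import sqrt

def pooled_t(xbar1, xbar2, s1, s2, n1, n2):
    """Two-sample t statistic with pooled variance (n1*s1^2 + n2*s2^2) / (n1 + n2 - 2)."""
    pooled_variance = (n1 * s1 ** 2 + n2 * s2 ** 2) / (n1 + n2 - 2)
    t = (xbar1 - xbar2) / sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return t, pooled_variance

t_cal, pooled_variance = pooled_t(1234, 1036, 36, 40, 8, 7)
t_tab = 2.16  # tabulated value for 13 d.f. at 5% los, as quoted above
print(round(pooled_variance, 2), round(t_cal, 2), abs(t_cal) > t_tab)
```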
x      x − x̄      (x − x̄)²      y      y − ȳ      (y − ȳ)²
28     -3.286     10.8          29     0.84       0.7056
30     -1.286     1.6538        30     1.84       3.3856
32     0.714      0.51          30     1.84       3.3856
33     1.714      2.94          24     -4.16      17.3056
33     1.714      2.94          27     -1.16      1.3456
29     -2.286     5.226         29     0.84       0.7056
34     2.714      7.366
∑x = 219    ∑(x − x̄)² = 31.4358    ∑y = 169    ∑(y − ȳ)² = 26.8336
Now $S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right]$
$= \dfrac{1}{11}\left[ 31.4358 + 26.8336 \right] = \dfrac{1}{11}(58.2694) = 5.30$
∴ S = √5.30 = 2.3
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 ≠ 𝜇2
iii) Test Statistic : $t_{cal} = \dfrac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} = \dfrac{31.286 - 28.16}{2.3\sqrt{\dfrac{1}{7} + \dfrac{1}{6}}} = 2.443$
∴ 𝑡𝑐𝑎𝑙 = 2.443
Hence we conclude that both horses do not have the same running capacity.
6. Ten soldiers participated in a shooting competition in the first week. After intensive
training they participated in the competition in the second week . Their scores
before and after training are given below :
Scores before    67   24   57   55   63   54   56   68   33   43
Scores after     70   38   58   58   56   67   68   75   42   38
Do the data indicate that the soldiers have been benefited by the training.
Sol: Given 𝑛1 = 10 , 𝑛2 = 10
We first compute the sample means and standard deviations
$\bar{x}$ = Mean of the first sample $= \dfrac{1}{10}(67 + 24 + 57 + 55 + 63 + 54 + 56 + 68 + 33 + 43) = \dfrac{1}{10}(520) = 52$
$\bar{y}$ = Mean of the second sample $= \dfrac{1}{10}(70 + 38 + 58 + 58 + 56 + 67 + 68 + 75 + 42 + 38) = \dfrac{1}{10}(570) = 57$
x      x − x̄     (x − x̄)²     y      y − ȳ     (y − ȳ)²
67     15        225          70     13        169
24     -28       784          38     -19       361
57     5         25           58     1         1
55     3         9            58     1         1
63     11        121          56     -1        1
54     2         4            67     10        100
56     4         16           68     11        121
68     16        256          75     18        324
33     -19       361          42     -15       225
43     -9        81           38     -19       361
∑x = 520    ∑(x − x̄)² = 1882    ∑y = 570    ∑(y − ȳ)² = 1664
Now $S^2 = \dfrac{1}{n_1 + n_2 - 2}\left[ \sum (x - \bar{x})^2 + \sum (y - \bar{y})^2 \right]$
$= \dfrac{1}{18}\left[ 1882 + 1664 \right] = \dfrac{1}{18}(3546) = 197$
∴ S = √197 = 14.0357
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 < 𝜇2 (Left one tailed test)
iii) Test Statistic : $t_{cal} = \dfrac{\bar{x}_1 - \bar{x}_2}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
Hence we conclude that the soldiers are not benefited by the training.
7. The blood pressure of 5 women before and after intake of a certain drug are given
below:
Before 110 120 123 132 125
After 120 118 125 136 121
Test whether there is significant change in blood pressure at 1% los?
Sol: Given n = 5
i) Null Hypothesis 𝐻0 : 𝜇1 = 𝜇2
ii) Alternative Hypothesis 𝐻1 : 𝜇1 < 𝜇2 (Left one tailed test)
iii) Test Statistic $t_{cal} = \dfrac{\bar{d}}{s/\sqrt{n}}$
where $\bar{d} = \dfrac{\sum d}{n}$ and $S^2 = \dfrac{1}{n-1}\sum (d - \bar{d})^2$
B.P before drug    B.P after drug    d = y − x    d − d̄    (d − d̄)²
110                120               10           8        64
120                118               -2           -4       16
123                125               2            0        0
132                136               4            2        4
125                121               -4           -6       36
                                     ∑d = 10               ∑(d − d̄)² = 120
$\therefore \bar{d} = \dfrac{10}{5} = 2$ and $S^2 = \dfrac{120}{4} = 30$
∴ S = 5.477
$t_{cal} = \dfrac{\bar{d}}{s/\sqrt{n}} = \dfrac{2}{5.477/\sqrt{5}} = 0.816$
Hence we conclude that there is no significant difference in Blood pressure after intake of a
certain drug.
where $\bar{d} = \dfrac{\sum d}{n}$ and $S^2 = \dfrac{1}{n-1}\sum (d - \bar{d})^2$
Before (x)    After (y)    d = x − y    d²
12            15           -3           9
14            16           -2           4
11            10           1            1
8             7            1            1
7             5            2            4
10            12           -2           4
3             10           -7           49
0             2            -2           4
5             3            2            4
6             8            -2           4
                           ∑d = −12     ∑d² = 84
$\bar{d} = \dfrac{-12}{10} = -1.2$
$S^2 = \dfrac{84 - (-1.2)^2 \times 10}{9} = 7.73$
∴ S = 2.78
$t_{cal} = \dfrac{\bar{d}}{s/\sqrt{n}} = \dfrac{-1.2}{2.78/\sqrt{10}} = -1.365$ and d.f = n-1 = 9
Hence we conclude that there is no significant difference in memory capacity after the
training program.
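The paired t computation for the memory-training data can be verified with the standard library. This sketch takes d = x − y as in the table above; the variable names are assumptions.

```python
from math import sqrt
from statistics import mean, stdev

before = [12, 14, 11, 8, 7, 10, 3, 0, 5, 6]
after = [15, 16, 10, 7, 5, 12, 10, 2, 3, 8]

# Paired differences d = x - y for each subject
differences = [x - y for x, y in zip(before, after)]

d_bar = mean(differences)
s = stdev(differences)  # sample S.D. with divisor n - 1
t_cal = d_bar / (s / sqrt(len(differences)))
degrees_of_freedom = len(differences) - 1

print(d_bar, round(s, 2), round(t_cal, 3), degrees_of_freedom)
```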
If the calculated value of 𝝌𝟐 is greater than the table value, the fit is considered to be poor.
iv) Degrees of freedom: at the specified level of significance, use n-1 d.f if the fitted
distribution is binomial, and n-2 d.f if the fitted distribution is Poisson.
v)Conclusion :If 𝝌𝟐 cal value < 𝝌𝟐 tab value , then we accept H0 , Otherwise reject H0 .
A is divided into two classes and B is divided into two classes. The various cell frequencies
can be expressed in the following table known as 2x2 contingency table.
a          b          a + b
c          d          c + d
a + c      b + d      N = a + b + c + d
The expected frequencies are given by
$E(a) = \dfrac{(a + c)(a + b)}{N}$
$E(b) = \dfrac{(b + d)(a + b)}{N}$
$E(c) = \dfrac{(a + c)(c + d)}{N}$
$E(d) = \dfrac{(b + d)(c + d)}{N}$
$\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E}$
The 𝝌𝟐 cal value is to be compared with the 𝝌𝟐 tab value at the 1% (5% or 10%) level of significance for
(r-1)(c-1) d.f, where r is the number of rows and c is the number of columns.
Note: In 𝝌𝟐 distribution for independence of attributes, we test if two attributes A and B are
independent or not.
iii) Test Statistic $\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E}$
where $E = \dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$
iv)At specified level of significance for (m-1) (n-1) d.f where m- no. of rows and n- no. of
columns
v)Conclusion : If 𝛘𝟐 cal value < 𝛘𝟐 tab value , then we accept H0 , Otherwise reject H0 .
1. Fit a Poisson distribution to the following data and test for its goodness of fit at 5%
los
x 0 1 2 3 4
f 419 352 154 56 19
Sol:
x      f       fx
0      419     0
1      352     352
2      154     308
3      56      168
4      19      76
       N = 1000    ∑fx = 904
Mean $\lambda = \dfrac{\sum fx}{N} = \dfrac{904}{1000} = 0.904$
Expected frequencies $= N \times p(x) = 1000 \times \dfrac{e^{-\lambda}\lambda^x}{x!}$
x                   0        1       2        3       4       Total
Expected f          406.2    366     165.4    49.8    12.6    1000
iii) $\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E}$
O       E        (O − E)²            (O − E)² / E
419     406.2    (419 − 406.2)²      (419 − 406.2)² / 406.2
352     366      (352 − 366)²        (352 − 366)² / 366
154     165.4    (154 − 165.4)²      (154 − 165.4)² / 165.4
3. A die is thrown 264 times with following results. Show that the die is biased [ Given
𝝌𝟐 𝟎.𝟎𝟓 = 11.07 for 5 d.f]
No. appeared on the die    1     2     3     4     5     6
Frequency                  40    32    28    58    54    52
iii) $\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E}$
The expected frequency of each of the numbers 1,2,3,4,5,6 is $\dfrac{264}{6} = 44$
Calculation of 𝝌𝟐 :
O      E      (O − E)²    (O − E)² / E
40     44     16          0.3636
32     44     144         3.2727
28     44     256         5.8181
58     44     196         4.4545
54     44     100         2.2727
52     44     64          1.4545
$\sum \dfrac{(O - E)^2}{E} = 17.6362$
𝝌𝟐 𝑐𝑎𝑙 = 17.6362
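The goodness-of-fit statistic for the die can be recomputed in a few lines. This is a sketch in which the variable names are assumptions and the tabulated value 11.07 is the one given in the problem statement.

```python
observed = [40, 32, 28, 58, 54, 52]

# Under H0 (unbiased die) every face has the same expected frequency
expected = sum(observed) / len(observed)

chi_sq = sum((o - expected) ** 2 / expected for o in observed)
chi_tab = 11.07  # given for 5 d.f. at the 5% level

die_is_biased = chi_sq > chi_tab
print(expected, round(chi_sq, 4), die_is_biased)
```

The unrounded sum is 776/44, about 17.636, which exceeds 11.07, so the hypothesis of an unbiased die is rejected.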
iii) $\chi^2_{cal} = \sum \dfrac{(O - E)^2}{E}$ where $E = \dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$
Expected frequencies:
90 × 100 / 200 = 45      90 × 100 / 200 = 45      90
100 × 110 / 200 = 55     100 × 110 / 200 = 55     110
100                      100                      200
Calculation of 𝝌𝟐 :
O      E      (O − E)²    (O − E)² / E
60     45     225         5
30     45     225         5
40     55     225         4.09
70     55     225         4.09
$\sum \dfrac{(O - E)^2}{E} = 18.18$
𝝌𝟐 𝑐𝑎𝑙 = 18.18
Hence we conclude that new and conventional treatment are not independent.
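The expected frequencies and the statistic for this 2x2 table follow mechanically from the row and column totals. A sketch, where the function name is hypothetical and the observed table is arranged row-wise as (60, 30) and (40, 70) to match the totals 90, 110 and 100, 100 used above:

```python
def chi_square_independence(table):
    """Chi-square statistic with E = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in column_totals] for r in row_totals]
    chi_sq = sum(
        (o - e) ** 2 / e
        for observed_row, expected_row in zip(table, expected)
        for o, e in zip(observed_row, expected_row)
    )
    return chi_sq, expected

chi_sq, expected = chi_square_independence([[60, 30], [40, 70]])
degrees_of_freedom = (2 - 1) * (2 - 1)
print(expected, round(chi_sq, 2), degrees_of_freedom)
```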
$F_{cal} = \dfrac{\text{Greater Variance}}{\text{Smaller Variance}}$
$F_{cal} = \dfrac{S_1^2}{S_2^2}$ or $\dfrac{S_2^2}{S_1^2}$
Where,
$S_1^2$ is the unbiased estimator of $\sigma_1^2$ and is calculated as: $S_1^2 = \dfrac{n_1 s_1^2}{n_1 - 1} = \dfrac{1}{n_1 - 1}\sum (x_1 - \bar{x}_1)^2$
$S_2^2$ is the unbiased estimator of $\sigma_2^2$ and is calculated as: $S_2^2 = \dfrac{n_2 s_2^2}{n_2 - 1} = \dfrac{1}{n_2 - 1}\sum (x_2 - \bar{x}_2)^2$
To test the hypothesis that the two population variances $\sigma_1^2$ and $\sigma_2^2$ are
equal
i) $H_0 : \sigma_1^2 = \sigma_2^2$
ii) $H_1 : \sigma_1^2 \neq \sigma_2^2$
iii) $F_{cal} = \dfrac{\text{Greater Variance}}{\text{Smaller Variance}}$
$F_{\alpha}(\vartheta_1 , \vartheta_2)$ is the value of F with 𝜗1 and 𝜗2 degrees of freedom such that the area under the
F – distribution to the right of 𝐹𝛼 is 𝛼.
1. In one sample of 8 observations from a normal population, the sum of the squares of
deviations of the sample values from the sample mean is 84.4 and in another sample
of 10 observations it was 102.6. Test at 𝟓% level whether the populations have the
same variance.
Sol: Let $\sigma_1^2$ and $\sigma_2^2$ be the variances of the two normal populations from which the
samples are drawn.
Here $n_1 = 8$, $n_2 = 10$
Also $\sum (x_i - \bar{x})^2 = 84.4$, $\sum (y_i - \bar{y})^2 = 102.6$
$S_1^2 = \dfrac{1}{n_1 - 1}\sum (x_i - \bar{x})^2 = \dfrac{84.4}{7} = 12.057$
and
$S_2^2 = \dfrac{1}{n_2 - 1}\sum (y_i - \bar{y})^2 = \dfrac{102.6}{9} = 11.4$
$F = \dfrac{S_1^2}{S_2^2} = \dfrac{12.057}{11.4} = 1.057$
and $v_2 = n_2 - 1 = 10 - 1 = 9$
Method II 27 33 42 35 32 34 38
Do the data show that the variances of time distribution from population from which
these samples are drawn do not differ significantly?
Sol: Let the Null Hypothesis be 𝐻0 : 𝜎1 2 = 𝜎2 2 where 𝜎1 2 and 𝜎2 2 are the variances of the
two populations from with the samples are drawn.
x      x − x̄     (x − x̄)²     y      y − ȳ     (y − ȳ)²
20     -2.3      5.29         27     -7.4      54.76
16     -6.3      39.69        33     -1.4      1.96
26     3.7       13.69        42     7.6       57.76
27     4.7       22.09        35     0.6       0.36
23     0.7       0.49         32     -2.4      5.76
22     -0.3      0.09         34     -0.4      0.16
                              38     3.6       12.96
134              81.34        241              133.72
$n_1 = 6$, $n_2 = 7$
$\therefore \bar{x} = \dfrac{\sum x}{n_1} = \dfrac{134}{6} = 22.3$, $\bar{y} = \dfrac{\sum y}{n_2} = \dfrac{241}{7} = 34.4$
$\sum (x_i - \bar{x})^2 = 81.34$, $\sum (y_i - \bar{y})^2 = 133.72$
$S_1^2 = \dfrac{1}{n_1 - 1}\sum (x_i - \bar{x})^2 = \dfrac{81.34}{5} = 16.268$
and
$S_2^2 = \dfrac{1}{n_2 - 1}\sum (y_i - \bar{y})^2 = \dfrac{133.72}{6} = 22.29$
Let 𝐻0 be true
$F = \dfrac{S_2^2}{S_1^2} = \dfrac{22.29}{16.268} = 1.3699 = 1.37$
Since calculated F < tabulated F , we accept the null hypothesis 𝐻0 at 5% los i.e., there is no
significant difference between the variances of the distribution by the workers.
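The variance ratio can be recomputed from the raw observations in the x and y columns of the table above. This sketch uses `statistics.variance` (divisor n − 1). The list names are assumptions, and the unrounded means give values that differ slightly from the hand calculation.

```python
from statistics import variance

x_values = [20, 16, 26, 27, 23, 22]
y_values = [27, 33, 42, 35, 32, 34, 38]

# Unbiased estimates of the two population variances
s1_squared = variance(x_values)
s2_squared = variance(y_values)

# F is the greater variance divided by the smaller variance
f_cal = max(s1_squared, s2_squared) / min(s1_squared, s2_squared)
print(round(s1_squared, 2), round(s2_squared, 2), round(f_cal, 2))
```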
2. The average income of 100 people of a city is Rs 210 with a standard deviation of Rs
10.For another sample of 150 people the average income is Rs 220 with a standard
deviation of Rs 12.Test the significant difference between two mean at 5% LOS.
3. A coin is tossed 960 times .Head turned up 184 times. Find whether the coin is unbiased.
4. Random samples of 600 men and 900 women in a locality were asked whether they would like to
have a bus stop near their residence. 350 men and 475 women were in favor of the
proposal. Test the significance of the difference between the two proportions at 5% LOS.
5. .A pair of dice are thrown 360 times and the frequency of each sum is indicated below:
Sum 2 3 4 5 6 7 8 9 10 11 12
Frequency 8 24 35 37 44 65 51 42 26 14 14
Would you say that the dice are fair on the basis of the chi-square test at 5% LOS
6. The following are the average weekly losses of worker hours due to accidents in 10
industrial plant before and after a certain safety programme was put into operation:
Before 45 73 46 124 33 57 83 34 26 17
After 36 60 44 119 35 51 77 29 24 11
Test whether the safety programme is effective in reducing the number of accidents at 5%
Can you conclude that the two samples are taken from the same population with SD as 4.
2. A sample of 500 products are examined from a factory and 5% found to be defective.
Another sample of 400 similar products are examined and 3% found to be defective.
Test the significance between the difference of two proportions at 5% LOS.
3. 20 people were attacked by a disease and only 18 survived .will you reject the hypothesis
that the survival rate of the attack by this disease is 85% in favor of the hypothesis that is
more at 5% LOS
4. Ten specimens of copper wires drawn from a large lot have the following breaking
strength(in kg) 518,572,570,568,572,578,572,569,548.Test whether the mean breaking
strengths of the lot may be taken to be 518 kg weight.
5. A survey of 320 families with 5 children each revealed the following distribution
No. of boys        5     4     3      2     1     0
No. of girls       0     1     2      3     4     5
No. of families    14    56    110    88    40    12
Is this result consistent with the hypothesis that male and female births are equally
probable?