Unit 1 SNM - New (Compatibility Mode) Solved Hypothesis Test PDF

(i) The document discusses hypothesis testing and various statistical distributions used for testing hypotheses about population means, variances and proportions. (ii) Key steps in hypothesis testing are outlined, including setting the null hypothesis, choosing the significance level, computing the test statistic and comparing it to critical values to determine whether to reject or fail to reject the null hypothesis. (iii) Common tests discussed include t-tests, F-tests, chi-square tests and tests for independence using contingency tables. Both small sample and large sample tests are covered.

Uploaded by

sanjeevlr

UNIT I TESTING OF HYPOTHESIS

Sampling distributions - Estimation of parameters - Statistical
hypothesis - Tests based on Normal, t, Chi-square and F distributions
for mean, variance and proportion - Contingency table
(test for independence) - Goodness of fit.

Prepared by
Dr. A.R. VIJAYALAKSHMI
Population
The group of individuals under study is called the population.
The population may be finite or infinite.
Sample and Sample Size
A finite subset of statistical individuals in a population is called a
sample. The number of individuals in a sample is called the sample
size (n).
Parameter and Statistic
The statistical constants of the population, namely the mean $\mu$ and the variance $\sigma^2$,
are usually referred to as parameters.
Statistical measures computed from the sample observations alone, i.e. the
mean $\bar{x}$ and the variance $s^2$, are usually referred to as statistics.
Various steps involved in testing of hypothesis
(i) Set up the null hypothesis $H_0$.

(ii) Choose an appropriate level of significance (either the 5% or the 1% level). This is to be decided before the sample is drawn.

(iii) Compute the test statistic under the null hypothesis:

$z = \frac{t - E(t)}{SE(t)}$

(iv) Compare the computed value of z in step (iii) with the significant value at the given level of significance:

If $|z| < 1.96$, $H_0$ may be accepted at the 5% level of significance.
If $|z| > 1.96$, $H_0$ may be rejected at the 5% level of significance.
If $|z| < 2.58$, $H_0$ may be accepted at the 1% level of significance.
If $|z| > 2.58$, $H_0$ may be rejected at the 1% level of significance.

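The decision rule in steps (iii) and (iv) can be sketched in Python as follows. This is only an illustration: the function names `z_statistic` and `decide` and the input numbers are assumptions, not part of the original notes.

```python
def z_statistic(t, expected_t, se_t):
    """Standardise a sample statistic: z = (t - E(t)) / SE(t)."""
    return (t - expected_t) / se_t


def decide(z, critical_value):
    """Two-tailed rule: reject H0 when |z| exceeds the critical value."""
    return "reject H0" if abs(z) > critical_value else "accept H0"


# Hypothetical inputs (not from the notes): statistic 52, E(t) = 50, SE(t) = 0.8
z = z_statistic(52.0, 50.0, 0.8)
at_5_percent = decide(z, 1.96)   # critical value at the 5% level
at_1_percent = decide(z, 2.58)   # critical value at the 1% level
print(round(z, 2), at_5_percent, at_1_percent)
```

The same statistic can lead to different decisions at the two levels, which is why the level has to be fixed before the sample is drawn.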
Sampling distribution
The probability distribution of a sample statistic is often
called the sampling distribution of the statistic.

Sampling Error

On examining a sample of a particular stuff we arrive at a
decision of purchasing or rejecting the stuff. The error
involved in such approximation is known as sampling error.

Test of significance
A very important aspect of sampling theory is the study of tests of
significance, which enable us to decide on the basis of the sample
results whether

(i) the deviation between the observed sample statistic and
the hypothetical parameter value is significant;

(ii) the deviation between two sample statistics is significant.

Null hypothesis:

It is a definite statement about the parameter, asserting that there is no difference.
It is denoted by $H_0$.

Alternative hypothesis:

The hypothesis complementary to the null hypothesis is called
the alternative hypothesis and is denoted by $H_1$.

Errors in sampling

Two types of errors are
Type I error: Reject $H_0$ when it is true
Type II error: Accept $H_0$ when it is false

Critical region

A region in the sample space which amounts to the rejection of $H_0$
is known as the critical region or region of rejection.

The regions which lead to the acceptance of $H_0$
give us a region called the acceptance region.

Small sample
When the size of the sample (n) is less than 30, the sample is
called a small sample.

Important tests for small samples:

(i) Student's t-test
(ii) F-test
(iii) $\chi^2$ test

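Student's t-test

$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}}$ with $\nu = n - 1$ degrees of freedom

where
$\bar{x} = \frac{1}{n} \sum_i x_i$ = sample mean
$\mu$ = population mean
$S^2 = \frac{1}{n - 1} \sum_i (x_i - \bar{x})^2$
n = sample size

Here s is the sample standard deviation computed with divisor n, $s^2 = \frac{1}{n} \sum_i (x_i - \bar{x})^2$, as in the worked problems. Since $s^2 = \frac{n - 1}{n} S^2$, the statistic can equally be written $t = \frac{\bar{x} - \mu}{S / \sqrt{n}}$.
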
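A short check that the two ways of writing the t statistic agree, assuming s uses divisor n and S uses divisor n - 1. The function name `t_two_ways` and the sample values are hypothetical and serve only as an illustration.

```python
import math


def t_two_ways(xs, mu):
    """Return (t using s with divisor n, t using S with divisor n - 1, d.f.)."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)   # sum of squared deviations
    s = math.sqrt(ss / n)                   # divisor n
    S = math.sqrt(ss / (n - 1))             # divisor n - 1
    t_small_s = (mean - mu) / (s / math.sqrt(n - 1))
    t_big_s = (mean - mu) / (S / math.sqrt(n))
    return t_small_s, t_big_s, n - 1


# Hypothetical sample, not taken from the notes
t1, t2, dof = t_two_ways([10, 12, 9, 11, 13], mu=10)
print(round(t1, 4), round(t2, 4), dof)
```

Both forms give the same value, so the choice between them only depends on which standard deviation is reported.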
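Test for single and difference of mean
(large sample)

The test statistics are

$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ (single mean), $z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ (difference of means)

When $\sigma$ is not known, the sample standard deviation s is used in its place:

$z = \frac{\bar{x} - \mu}{s / \sqrt{n}}$
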
Test of single and difference of mean
(small sample)

• The probability curve of the t-distribution is similar to the standard
normal curve: it is symmetric about t = 0, bell shaped and asymptotic
to the t-axis.

• For sufficiently large values of n, the t-distribution tends to the standard
normal distribution.

$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}}$, $t = \frac{\bar{x}_1 - \bar{x}_2}{s \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$, where $s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$

Test of single variance and equality of variance

• To test whether there is any significant difference between two
estimates of the population variance

• To test whether the two samples have come from the same population

$F = \frac{S_1^2}{S_2^2}$, where $S_1^2 = \frac{\sum (x - \bar{x})^2}{n_1 - 1}$ and $S_2^2 = \frac{\sum (y - \bar{y})^2}{n_2 - 1}$

with $\nu = (n_1 - 1, n_2 - 1)$ degrees of freedom

Chi-square test for goodness of fit

The chi-square statistic for goodness of fit is defined by

$\chi^2 = \sum \frac{(O - E)^2}{E}$

where O is the observed frequency and E is the expected frequency.

$\nu = n - 1$ if the data is given in series
$\nu = n - 1$ if the distribution is binomial
$\nu = n - 2$ if the distribution is Poisson
$\nu = n - 3$ if the distribution is normal

Independence of attributes

$\alpha = 0.05$

The test statistic for a $2 \times 2$ table with cell frequencies a, b, c, d is

$\chi^2 = \frac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$

with $\nu = (r - 1)(s - 1)$ degrees of freedom

Problems on large samples - single mean
The mean lifetime of a sample of 100 light bulbs produced by a
company is computed to be 1570 hours with a sample standard deviation
of 120 hours. If $\mu$ is the mean lifetime of all the bulbs produced by the company,
test the hypothesis $\mu = 1600$ hours against the alternative hypothesis
$\mu \neq 1600$ hours with $\alpha = 0.05$ and 0.01.
Solution: Given sample size n = 100, sample mean $\bar{x} = 1570$,
sample S.D. s = 120, population mean $\mu = 1600$

Null Hypothesis: $H_0: \mu = 1600$
i.e. there is no significant difference between the sample mean and the hypothetical
population mean

Alternative Hypothesis: $H_1: \mu \neq 1600$ (two-tailed alternative)

Level of significance $\alpha = 5\%$

The test statistic
$z = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{1570 - 1600}{120 / \sqrt{100}} = \frac{-30}{12} = -2.5$, so $|Z| = 2.5$

The critical value of z for a two-tailed test at the 5% level of
significance is $Z_\alpha = 1.96$

Conclusion:
Since $|Z| > Z_\alpha$,
we reject the null hypothesis and hence the mean lifetime of the bulbs
produced by the company may not be 1600 hours.

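The problem asks for both $\alpha = 0.05$ and $\alpha = 0.01$. A minimal sketch of the computation at both levels, using the two-tailed critical values 1.96 and 2.58 quoted earlier in the notes (the variable names are illustrative):

```python
import math

n, xbar, s, mu0 = 100, 1570.0, 120.0, 1600.0

# Large-sample statistic z = (xbar - mu0) / (s / sqrt(n))
z = (xbar - mu0) / (s / math.sqrt(n))

# Two-tailed critical values quoted in the notes
critical = {0.05: 1.96, 0.01: 2.58}
verdicts = {alpha: ("reject H0" if abs(z) > c else "accept H0")
            for alpha, c in critical.items()}
print(z, verdicts)
```

With |z| = 2.5 the null hypothesis is rejected at the 5% level but not at the 1% level, since 2.5 lies between 1.96 and 2.58.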
Random samples drawn from two countries gave the following data relating
to the heights of adult males. Is the difference between the standard deviations
significant?

                       Country A   Country B
Mean height (inches)   67.42       67.25
S.D. (inches)          2.58        2.50
Number in samples      1000        1200

Solution:

Given that $n_1 = 1000$, $s_1 = 2.58$ inches, $n_2 = 1200$, $s_2 = 2.50$ inches

Null Hypothesis: $H_0: \sigma_1 = \sigma_2$
i.e. there is no significant difference between the sample standard deviations

Alternative Hypothesis: $H_1: \sigma_1 \neq \sigma_2$ (two-tailed test)

The test statistic
$z = \frac{s_1 - s_2}{\sqrt{\frac{\sigma_1^2}{2 n_1} + \frac{\sigma_2^2}{2 n_2}}} \approx \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2 n_1} + \frac{s_2^2}{2 n_2}}} \sim N(0, 1)$ ($\sigma_1$ and $\sigma_2$ are not known)

$z = \frac{2.58 - 2.50}{\sqrt{\frac{(2.58)^2}{2 \times 1000} + \frac{(2.50)^2}{2 \times 1200}}} = \frac{0.08}{0.07702} = 1.0387$

Conclusion:
The critical value at the 5% LOS for a two-tailed test is z = 1.96.
Since |z| < 1.96, we accept the null hypothesis and hence conclude that
the sample standard deviations do not differ significantly.

1. A sample of 900 members has a mean of 3.4 cm and a standard deviation of
2.61 cm. Is the sample from a large population of mean 3.25 cm and
standard deviation 2.61 cm? (Test at the 5% level of significance.)

Solution:

Null Hypothesis:

Alternative Hypothesis:

Level of significance $\alpha = 5\%$

The test statistic

Conclusion:

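The solution steps for this exercise are left blank in the notes. One way the large-sample statistic $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ from the earlier slides could be applied is sketched below; treating the test as two-tailed with critical value 1.96 is an assumption, and the variable names are illustrative.

```python
import math

n, xbar = 900, 3.4
mu0, sigma = 3.25, 2.61

# Large-sample statistic with the population S.D. known
z = (xbar - mu0) / (sigma / math.sqrt(n))

# Assumed two-tailed test at the 5% level (critical value 1.96)
verdict = "reject H0" if abs(z) > 1.96 else "accept H0"
print(round(z, 3), verdict)
```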
Problems on difference of mean
The means of two large samples of 1000 and 2000 members are
67.5 inches and 68.0 inches respectively. Can the samples be
regarded as drawn from the same population of S.D. 2.5
inches?
Solution:
Sample I: $\bar{x}_1 = 67.5$, $n_1 = 1000$
Sample II: $\bar{x}_2 = 68.0$, $n_2 = 2000$
Population S.D. $\sigma = 2.5$
The two given samples are large samples.
Null Hypothesis: $H_0: \mu_1 = \mu_2$
Alternative Hypothesis: $H_1: \mu_1 \neq \mu_2$

The test statistic
$z = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{67.5 - 68}{2.5 \sqrt{\frac{1}{1000} + \frac{1}{2000}}} = -5.16$

|z| = 5.16

The table value of |z| at the 1% level = 2.58

Conclusion:
The calculated value of |z| is greater than the table value of z.

Therefore $H_0$ is rejected at the 1% level of significance and so the two samples
cannot be regarded as belonging to the same population.

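A small helper for the two-sample statistic with a common known standard deviation, evaluated on the figures of this problem. The function name `two_sample_z` is hypothetical.

```python
import math


def two_sample_z(mean1, n1, mean2, n2, sigma):
    """z for the difference of two means with a common known S.D. sigma."""
    return (mean1 - mean2) / (sigma * math.sqrt(1 / n1 + 1 / n2))


z = two_sample_z(67.5, 1000, 68.0, 2000, 2.5)
print(round(z, 3))  # about -5.164
```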
In a large city A, 20% of a random sample of 900 school boys had a slight
physical defect. In another large city B, 18.5% of a random sample of 1600
school boys had the same defect. Is the difference between the proportions
significant?

Solution: Given $n_1 = 900$, $n_2 = 1600$

$p_1 = \frac{20}{100} = 0.2$, $p_2 = \frac{18.5}{100} = 0.185$

$H_0$: The difference between the two proportions is not significant.

$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{900(0.2) + 1600(0.185)}{900 + 1600} = 0.1904$

Q = 1 - P = 0.8096. The test statistic

$z = \frac{p_1 - p_2}{\sqrt{PQ \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{0.2 - 0.185}{\sqrt{(0.1904)(0.8096) \left( \frac{1}{900} + \frac{1}{1600} \right)}} = 0.917$

Calculated value of |z| = 0.917 < table value = 1.96

We accept the null hypothesis $H_0$ at the 5% level of significance,
i.e. the difference between the two proportions is not significant.

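The pooled proportion and the test statistic for the two cities can be evaluated directly. The sketch below follows the formulas of the solution; the variable names are illustrative.

```python
import math

n1, p1 = 900, 0.20
n2, p2 = 1600, 0.185

P = (n1 * p1 + n2 * p2) / (n1 + n2)   # pooled proportion
Q = 1 - P
z = (p1 - p2) / math.sqrt(P * Q * (1 / n1 + 1 / n2))
print(round(P, 4), round(Q, 4), round(z, 3))
```

Direct evaluation gives |z| of about 0.917, well below 1.96.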
Problems on small sample (single mean)

Sandal powder is packed into packets by a machine. A random sample
of 12 packets is drawn and their weights are found to be (in kg) 0.49,
0.48, 0.47, 0.48, 0.49, 0.50, 0.51, 0.49, 0.48, 0.50, 0.51 and 0.48. Test if the
average weight of the packing can be taken as 0.5 kg.

Solution: Given n = 12, $\mu = 0.5$

x       x^2
0.49    0.2401
0.48    0.2304
0.47    0.2209
0.48    0.2304
0.49    0.2401
0.50    0.2500
0.51    0.2601
0.49    0.2401
0.48    0.2304
0.50    0.2500
0.51    0.2601
0.48    0.2304
Total:
5.88    2.8830

$\bar{x} = \frac{\sum x}{n} = \frac{5.88}{12} = 0.49$

$s^2 = \frac{\sum x^2}{n} - (\bar{x})^2 = \frac{2.8830}{12} - (0.49)^2 = 0.24025 - 0.2401 = 0.00015$, so s = 0.012

The parameter of interest is $\mu$.

Null Hypothesis: $H_0: \mu = 0.5$

Alternative hypothesis: $H_1: \mu \neq 0.5$

Level of significance $\alpha = 5\%$
Degrees of freedom: n - 1 = 12 - 1 = 11
The test statistic
$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{0.49 - 0.5}{0.012 / \sqrt{11}} = \frac{-0.01}{0.003618} = -2.76395$

|t| = 2.76395 > 2.20

We reject the null hypothesis $H_0$ at the 5% level of significance,
i.e. the average packing cannot be taken to be 0.5 kg.

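The same test can be run from the raw weights. In this sketch s is computed with divisor n, as in the solution; carrying the unrounded s gives t of about -2.708 rather than the -2.764 obtained with s rounded to 0.012, and the conclusion is unchanged. The variable names are illustrative.

```python
import math

weights = [0.49, 0.48, 0.47, 0.48, 0.49, 0.50,
           0.51, 0.49, 0.48, 0.50, 0.51, 0.48]
mu0 = 0.5

n = len(weights)
mean = sum(weights) / n
# Sample S.D. with divisor n, matching s^2 = sum(x^2)/n - mean^2
s = math.sqrt(sum((w - mean) ** 2 for w in weights) / n)
t = (mean - mu0) / (s / math.sqrt(n - 1))
print(round(mean, 2), round(s, 5), round(t, 3))
```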
A machinist is making engine parts with an axle diameter of 0.700 inch. A
random sample of 10 parts shows a mean diameter of 0.742 inch with a
standard deviation of 0.040 inch. Compute the statistic you would use to
test whether the work is meeting the specifications.
Solution:
Here we are given $\mu = 0.700$ inch, $\bar{x} = 0.742$ inch, s = 0.040 inch, n = 10
Null hypothesis $H_0: \mu = 0.700$ inch
i.e. the product is conforming to specifications
Alternative hypothesis $H_1: \mu \neq 0.700$ inch
The test statistic
$t = \frac{\bar{x} - \mu}{\sqrt{\frac{s^2}{n - 1}}} = \frac{0.742 - 0.700}{\sqrt{\frac{(0.040)^2}{10 - 1}}} = 3.15$

Here the test statistic follows Student's t-distribution with 10 - 1 = 9 degrees of freedom.

At the 5% level of significance, the table value is 2.26.

Hence the calculated value > table value.
So the null hypothesis is rejected and we conclude that there is a significant
difference and the product is not conforming to specifications.

Problems on small sample - difference of mean
Two random samples gave the following results.

Sample   Size   Sample mean   Sum of squares of deviations from the mean
1        10     15            90
2        12     14            108

Test whether the samples come from the same normal population.
Solution: $H_0$: The two samples have been drawn from the same normal
population,

i.e. $H_0: \mu_1 = \mu_2$ and $\sigma_1^2 = \sigma_2^2$

Equality of means will be tested by the t-test and equality of variances
will be tested by the F-test. Since the t-test assumes equality of variances,
we shall apply the F-test first.

(i) $H_0: \sigma_1^2 = \sigma_2^2$
Given $n_1 = 10$, $n_2 = 12$, $\bar{x}_1 = 15$, $\bar{x}_2 = 14$,
$\sum (x_1 - \bar{x}_1)^2 = 90$, $\sum (x_2 - \bar{x}_2)^2 = 108$

$S_1^2 = \frac{1}{n_1 - 1} \sum (x_1 - \bar{x}_1)^2 = \frac{90}{9} = 10$, $S_2^2 = \frac{1}{n_2 - 1} \sum (x_2 - \bar{x}_2)^2 = \frac{108}{11} = 9.82$

Then $F = \frac{S_1^2}{S_2^2} = 1.018$. Calculated F = 1.018, $F_{0.05}(9, 11) = 2.90$
Since calculated F < tabulated F, we accept the null hypothesis $H_0$, i.e. $\sigma_1^2 = \sigma_2^2$.

Since $\sigma_1^2 = \sigma_2^2$, we can now apply the t-test, which is used to test equality of means.

(ii) $H_0: \mu_1 = \mu_2$
$s^2 = \frac{1}{n_1 + n_2 - 2} \left[ \sum (x_1 - \bar{x}_1)^2 + \sum (x_2 - \bar{x}_2)^2 \right] = \frac{1}{20}(90 + 108) = 9.9$

The test statistic
$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} = \frac{15 - 14}{\sqrt{9.9 \left( \frac{1}{10} + \frac{1}{12} \right)}} = 0.742$

$\nu = n_1 + n_2 - 2 = 20$

Since the calculated value of t = 0.742 < the tabulated value of t = 2.086,
we accept the null hypothesis $H_0: \mu_1 = \mu_2$.

Combining (i) and (ii), we conclude that the samples have come from
the same normal population.

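The two-stage procedure (F-test for variances, then pooled t-test for means) needs only the summary figures given in the problem. A minimal sketch, with illustrative variable names:

```python
import math

n1, mean1, ss1 = 10, 15.0, 90.0    # size, mean, sum of squared deviations
n2, mean2, ss2 = 12, 14.0, 108.0

# Stage (i): F = S1^2 / S2^2 with (n1 - 1, n2 - 1) degrees of freedom
F = (ss1 / (n1 - 1)) / (ss2 / (n2 - 1))

# Stage (ii): pooled variance and t with n1 + n2 - 2 degrees of freedom
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
print(round(F, 3), round(pooled_var, 1), round(t, 3))
```

Both statistics fall below the tabulated values 2.90 and 2.086 quoted in the solution.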
.

Solution: Given n = 25, $\bar{x} = 6.9$, $\mu = 7.0$, $\sigma^2 = 0.01$

Null Hypothesis: $H_0: \mu = 7$
(The sample mean $\bar{x}$ does not differ significantly from the population
mean $\mu$.)
Alternative hypothesis: $H_1: \mu \neq 7$
The test statistic
$t = \frac{\bar{x} - \mu}{s / \sqrt{n - 1}} = \frac{6.9 - 7}{0.1 / \sqrt{24}} = -4.90$

Calculated value of |t| = 4.90 and table value = 1.645

Since the calculated value of |t| > the tabulated value of t, we reject
the null hypothesis $H_0$,
i.e. the population mean $\mu = 7$ is unacceptable.

Problems on chi-square goodness of fit
Fit a Poisson distribution to the following data and test the goodness of fit.

x                0     1    2    3   4   5   6
Frequency f(x)   275   72   30   7   5   2   1
Solution:
Mean of the given distribution: $\bar{x} = \frac{\sum_i f_i x_i}{\sum_i f_i} = \frac{189}{392} = 0.482$

To fit a Poisson distribution to the given data, we take the parameter (i.e. the mean)
of the Poisson distribution equal to the mean of the given distribution,

i.e. $\lambda = \bar{x} = 0.482$

The Poisson distribution is given by

$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, \ldots$

and the theoretical frequencies are obtained by

$f(x) = \left( \sum_i f_i \right) \frac{e^{-\lambda} \lambda^x}{x!} = \frac{392 \, e^{-0.482} (0.482)^x}{x!}$

$f(0) = \frac{392 \, e^{-0.482} (0.482)^0}{0!} = 242.1$

$f(1) = \frac{392 \, e^{-0.482} (0.482)^1}{1!} = 116.69$

$f(2) = \frac{392 \, e^{-0.482} (0.482)^2}{2!} = 28.12$

$f(3) = \frac{392 \, e^{-0.482} (0.482)^3}{3!} = 4.518$

$f(4) = \frac{392 \, e^{-0.482} (0.482)^4}{4!} = 0.544$

$f(5) = \frac{392 \, e^{-0.482} (0.482)^5}{5!} = 0.052$

$f(6) = \frac{392 \, e^{-0.482} (0.482)^6}{6!} = 0.004$

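The theoretical frequencies can be generated in one pass from the total frequency 392 and the fitted mean 0.482. A minimal sketch using only the standard library; the function name `poisson_expected` is hypothetical.

```python
import math

N, lam = 392, 0.482   # total frequency and fitted Poisson mean


def poisson_expected(x):
    """Expected frequency N * exp(-lam) * lam**x / x! for count x."""
    return N * math.exp(-lam) * lam ** x / math.factorial(x)


expected = [poisson_expected(x) for x in range(7)]
for x, e in enumerate(expected):
    print(x, round(e, 3))
```

The expected frequencies for x = 4, 5 and 6 are all below 1, so each contributes heavily to the chi-square sum whenever even a few observations fall in those classes.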
The theoretical Poisson frequencies are:

x                    0       1        2       3       4       5       6
Expected frequency   242.1   116.69   28.12   4.518   0.544   0.052   0.004

Null hypothesis $H_0$: The Poisson distribution fits the data well

Alternative hypothesis
$H_1$: The Poisson distribution does not fit the data well

Calculated $\chi^2 = \sum \frac{(O - E)^2}{E} = 347.714$

Here calculated $\chi^2$ > tabulated $\chi^2$,
so $H_0$ is rejected and $H_1$ is accepted
at the 5% LOS, and we conclude
that the Poisson distribution is not a good fit to the data.

The following data are collected on two characters:

              Smokers   Non-smokers
Literates     83        57
Illiterates   45        68

Based on this, can you say that there is no relationship between
smoking and literacy?
Solution:
Null hypothesis $H_0$: Literacy and smoking habit are independent

Alternative hypothesis $H_1$: Literacy and smoking habit are not independent

$\alpha = 0.05$, d.f. = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1

Expected frequencies

$E(a) = \frac{128 \times 140}{253} = 70.83$, $E(c) = \frac{128 \times 113}{253} = 57.17$,
$E(b) = \frac{125 \times 140}{253} = 69.17$, $E(d) = \frac{125 \times 113}{253} = 55.83$

O_i     E_i     (O_i - E_i)^2 / E_i
83      70.83   2.091048
57      69.17   2.14123
45      57.17   2.590675
68      55.83   2.652855
Total           9.475808

Calculated $\chi^2 = 9.475808$
Table value of $\chi^2$ (1 d.f.) at 5% = 3.84

Here calculated $\chi^2$ > tabulated $\chi^2$, so $H_0$ is rejected.

There is some relationship between literacy and smoking habit.

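The expected-frequency method used above works for any table of counts. A minimal sketch, with a hypothetical helper `chi_square_independence` applied to the literacy and smoking data:

```python
def chi_square_independence(table):
    """Chi-square statistic and d.f. for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count = row total * column total / grand total
            expected = row_totals[i] * col_totals[j] / grand
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


chi2, dof = chi_square_independence([[83, 57], [45, 68]])
print(round(chi2, 3), dof)
```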
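$2 \times 2$ contingency table

a       b       a + b
c       d       c + d
a + c   b + d   N = a + b + c + d

$\chi^2 = \frac{(ad - bc)^2 (a + b + c + d)}{(a + b)(c + d)(a + c)(b + d)}$

Find the value of $\chi^2$ for the following $2 \times 2$ contingency table:

6   2
3   5

Here N = a + b + c + d = 16 and ad - bc = 30 - 6 = 24, so the formula above gives

$\chi^2 = \frac{(24)^2 (16)}{(8)(8)(9)(7)} = \frac{9216}{4032} = 2.2857$

With Yates' continuity correction, which is commonly applied when the cell frequencies are small,

$\chi^2 = \frac{N \left( |ad - bc| - \frac{N}{2} \right)^2}{(a + b)(c + d)(a + c)(b + d)} = \frac{16 (24 - 8)^2}{4032} = 1.0159$
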
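Both forms of the $2 \times 2$ shortcut can be checked numerically. The quoted value 1.0159 corresponds to the formula with Yates' continuity correction; without the correction the shortcut gives about 2.2857. The function name `chi_square_2x2` and its `yates` flag are hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Shortcut chi-square for a 2 x 2 table, optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink |ad - bc| by N/2 (not below zero)
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


plain = chi_square_2x2(6, 2, 3, 5)
corrected = chi_square_2x2(6, 2, 3, 5, yates=True)
print(round(plain, 4), round(corrected, 4))
```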
Independence of attributes
Find if there is any association between extravagance in fathers and
extravagance in sons from the following data. Determine the coefficient of
association also.

                    Extravagant fathers   Miserly fathers
Extravagant sons    327                   741
Miserly sons        545                   234

Solution: The parameter of interest is $\chi^2$

Null hypothesis
$H_0$: the association between extravagance in sons and in fathers is not significant

Alternative hypothesis $H_1$: the association is significant

$\chi^2 = \frac{N (ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$

$\chi^2 = \frac{[(327)(234) - (545)(741)]^2 (327 + 545 + 741 + 234)}{(872)(975)(1068)(779)}$

$\chi^2 = 279.77 > 3.841$

Since the calculated value exceeds the table value, $H_0$ is rejected at the 5% level of significance,
i.e. there is a significant association between extravagance in fathers and in sons.

The following data give the number of aircraft accidents that
occurred during the various days of the week. Find whether the accidents
are uniformly distributed over the week.

Days               SUN   MON   TUE   WED   THU   FRI   SAT
No. of accidents   14    16    8     12    11    9     14

Solution:
Null hypothesis $H_0$: The accidents are uniformly distributed over the
week
Alternative hypothesis $H_1$: The accidents are not uniformly distributed
over the week

Level of significance $\alpha = 5\%$, degrees of freedom = n - 1 = 7 - 1 = 6

Tabulated $\chi^2 = 12.59$
Total number of accidents = 84

Under the null hypothesis, $E_i = \frac{84}{7} = 12$, i = 1, 2, ..., 7

Day   O    E    |O - E|   (O - E)^2 / E
Sun   14   12   2         0.3333
Mon   16   12   4         1.3333
Tue   8    12   4         1.3333
Wed   12   12   0         0
Thu   11   12   1         0.0833
Fri   9    12   3         0.75
Sat   14   12   2         0.3333

$\chi^2 = \sum \frac{(O - E)^2}{E} = 4.167$

Here calculated $\chi^2 = 4.167$ < tabulated $\chi^2$,
so $H_0$ is accepted.

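The goodness-of-fit computation for a uniform distribution takes only a few lines. This sketch takes Wednesday's count as 12, the value used in the solution table and consistent with the stated total of 84; the variable names are illustrative.

```python
observed = {"Sun": 14, "Mon": 16, "Tue": 8, "Wed": 12,
            "Thu": 11, "Fri": 9, "Sat": 14}

total = sum(observed.values())
expected = total / len(observed)   # equal expected count for every day
chi2 = sum((o - expected) ** 2 / expected for o in observed.values())
dof = len(observed) - 1
print(total, expected, round(chi2, 3), dof)
```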
An automobile company gives you the following information about
age groups and liking for a particular model of car which it plans to
introduce. On the basis of this data, can it be concluded that the model
appeal is independent of the age group?

Persons who        Below 20   20-39   40-59   60 and above   Total
Liked the car      140        80      40      20             280
Disliked the car   60         50      30      80             220
Total              200        130     70      100            500

Solution: The parameter of interest is $\chi^2$

Null hypothesis $H_0$: There is no significant difference, i.e. the model appeal is independent of the age group

Alternative hypothesis $H_1$: There is a difference

Level of significance $\alpha = 5\%$, degrees of freedom = (r - 1)(s - 1) = (4 - 1)(2 - 1) = 3
Tabulated $\chi^2 = 7.815$

The test statistic

Expected frequency = (corresponding row total × column total) / grand total

Expected frequency for 140 = (200)(280)/500 = 112
Expected frequency for 80 = (130)(280)/500 = 72.8
Expected frequency for 40 = (70)(280)/500 = 39.2
Expected frequency for 20 = (100)(280)/500 = 56
Expected frequency for 60 = (200)(220)/500 = 88
Expected frequency for 50 = (130)(220)/500 = 57.2
Expected frequency for 30 = (70)(220)/500 = 30.8
Expected frequency for 80 = (100)(220)/500 = 44

O     E      O - E   (O - E)^2 / E
140   112    28      7.000
80    72.8   7.2     0.712
40    39.2   0.8     0.016
20    56     -36     23.143
60    88     -28     8.909
50    57.2   -7.2    0.906
30    30.8   -0.8    0.021
80    44     36      29.455

$\chi^2 = \sum \frac{(O - E)^2}{E} = 70.16$

Here calculated $\chi^2$ > tabulated $\chi^2$.

Therefore $H_0$ is rejected.

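A sketch of the full computation for the car data, assuming the four age groups implied by the (4 - 1)(2 - 1) degrees of freedom, so that the row totals are 280 and 220 and the grand total is 500. The variable names are illustrative.

```python
liked = [140, 80, 40, 20]      # Below 20, 20-39, 40-59, 60 and above
disliked = [60, 50, 30, 80]
table = [liked, disliked]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand = sum(row_totals)

# Expected count = row total * column total / grand total
expected = [[r * c / grand for c in col_totals] for r in row_totals]
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(table, expected)
           for o, e in zip(obs_row, exp_row))
dof = (len(table) - 1) * (len(liked) - 1)
print(row_totals, col_totals, round(chi2, 2), dof)
```

Under these assumptions the statistic is about 70.16 on 3 degrees of freedom, far above the tabulated 7.815.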
