Sampling QB
Sampling
Definitions
Population: The group of individuals under study is called the population.
Sample: A finite subset of statistical individuals in a population is called a
sample.
Sample size: The number of individuals in a sample is called the Sample size.
Parameters and Statistics: The statistical constants of the population are
referred to as parameters and the statistical constants of the sample are
referred to as statistics.
Standard Error: The standard deviation of the sampling distribution of a
statistic is known as its standard error and is denoted by S.E.
Test of Significance: It enables us to decide, on the basis of the sample
results, whether the deviation between the observed sample statistic and the
hypothetical parameter value is significant, or whether the deviation between
two sample statistics is significant.
Null Hypothesis: A definite statement about the population parameter, usually
a hypothesis of no difference. It is denoted by H0.
Alternative Hypothesis: Any hypothesis which is complementary to the null
hypothesis is called an alternative hypothesis and is denoted by H1.
Errors in Sampling:
Two types of error occur in practice when we decide to accept or reject a lot
after examining a sample from it: Type I and Type II errors.
Type I error: Rejection of H0 when it is true.
Type II error: Acceptance of H0 when it is false.
Critical region: A region corresponding to a statistic t in the sample space S
which leads to the rejection of H0 is called the critical region or rejection
region. Those regions which lead to the acceptance of H0 are called the
acceptance region.
$t = (\bar{x} - \mu) / (s/\sqrt{n-1})$
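1. Write down the 95% confidence limits for the population mean in a small
sample test.
Let $\bar{x}$ be the sample mean, n the sample size and s the sample S.D.
Then the limits are $\bar{x} \pm t_{0.05}\,(s/\sqrt{n-1})$, where $t_{0.05}$ is
the tabulated 5% value of t for n - 1 degrees of freedom.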
2. Define Student's t test for the difference of means of two samples.
To test the significance of the difference between two means $\bar{x}_1$ and
$\bar{x}_2$ of samples of sizes n1 and n2, use the statistic
$t = (\bar{x}_1 - \bar{x}_2) / (s\sqrt{1/n_1 + 1/n_2})$
where $s^2 = (n_1 s_1^2 + n_2 s_2^2) / (n_1 + n_2 - 2)$,
s1 and s2 being the sample standard deviations and the degrees of freedom being
n1 + n2 - 2.
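As a rough illustration of the pooled t statistic defined above, the following Python sketch evaluates it for the Sample A and Sample B values that appear later in this question bank. The function name `pooled_t` is hypothetical, and the code assumes that $n_i s_i^2$ is the sum of squared deviations of each sample about its own mean.

```python
import math

def pooled_t(sample1, sample2):
    """Two-sample t statistic with a pooled variance estimate (illustrative)."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # n_i * s_i^2 equals the sum of squared deviations about each sample mean
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    s = math.sqrt((ss1 + ss2) / (n1 + n2 - 2))
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2  # statistic and degrees of freedom

sample_a = [24, 27, 26, 21, 25]
sample_b = [27, 30, 28, 31, 22, 36]
t_value, dof = pooled_t(sample_a, sample_b)
print(round(t_value, 3), dof)
```

The absolute value of the result would then be compared with the tabulated t value for the returned degrees of freedom.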
3. State the applications of the t distribution.
When the size of the sample is less than 30, the t test is used for (a) a
single mean and (b) the difference of two means.
$t = (\bar{x} - \mu) / (s/\sqrt{n-1})$
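7. How is the number of degrees of freedom of the chi-square distribution fixed
for testing the goodness of fit of a Poisson distribution for the given data?
Degrees of freedom = n - 1, where n is the number of observed class
frequencies. If the Poisson mean is estimated from the data, one more degree of
freedom is lost, giving n - 2.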
8. Write short notes on critical value.
The critical or rejection region is the region which corresponds to a
predetermined level of significance $\alpha$. Whenever the sample statistic
falls in the critical region we reject the null hypothesis, as it is then
considered to be probably false. The value that separates the rejection region
from the acceptance region is called the critical value.
9. Define and explain level of significance.
The probability that a random value of the statistic t belongs to the critical
region is known as the level of significance. In other words, the level of
significance is the size of the Type I error. The levels of significance
usually employed in testing of hypotheses are 5% and 1%.
10. Outline the assumptions made when the t test is applied for difference of
means.
(i) The degrees of freedom are n1 + n2 - 2.
(ii) The two population variances are believed to be equal.
(iii) $s = \sqrt{(n_1 s_1^2 + n_2 s_2^2) / (n_1 + n_2 - 2)}$ is used as the
estimate of the common standard deviation.
Problems :
1. A machinist is making engine parts with axle diameters of 0.700 inch.
A random sample of 10 parts shows a mean diameter of 0.742 inch with
an S.D. of 0.040 inch. Compute the statistic you would use to test
whether the work is meeting the specification.
Solution: Calculated t value = 3.15 and Tabulated value = 2.26 (at 5% level of
significance with 9 degrees of freedom).
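The calculated value in the machinist problem can be checked with a few lines of Python. This is only a sketch: the helper name `one_sample_t` is hypothetical, and it uses the small-sample form $t = (\bar{x} - \mu)/(s/\sqrt{n-1})$ from the definitions in this question bank.

```python
import math

def one_sample_t(sample_mean, mu, s, n):
    """Small-sample t statistic for a single mean (illustrative helper)."""
    return (sample_mean - mu) / (s / math.sqrt(n - 1))

# Machinist problem: specification 0.700, sample mean 0.742, S.D. 0.040, n = 10
t_value = one_sample_t(0.742, 0.700, 0.040, 10)
print(round(t_value, 2))  # about 3.15, with n - 1 = 9 degrees of freedom
```

Since 3.15 exceeds the tabulated value 2.26, the difference from the specification is significant at the 5% level.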
= 1.708( at 5%
1.
n1 = 8
s1 = 36 hrs
3.
iet 25 32 30 34 24 14 32 24 30 31 35 25
iet 44 34 22 10 47 31 40 32 35 18 21 35 29 22
4. The
Sample A: 24 27 26 21 25
Sample B: 27 30 28 31 22 36
To test whether two samples have come from the same population, or whether
there is any significant difference between two estimates of population
variance, we use the F test.
$F = S_1^2 / S_2^2$, where $S_1^2 = \sum (x - \bar{x})^2 / (n_1 - 1)$ and
$S_2^2 = \sum (y - \bar{y})^2 / (n_2 - 1)$; n1 is the first sample size and n2
is the second sample size.
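A minimal Python sketch of the F ratio follows, using the Sample A and Sample B values listed in this question bank. The function names are hypothetical, and placing the larger variance estimate in the numerator is an assumed convention that the text here does not state explicitly.

```python
def variance_estimate(sample):
    """Estimate of population variance: sum of squared deviations / (n - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    return sum((x - mean) ** 2 for x in sample) / (n - 1)

def f_ratio(sample1, sample2):
    """F statistic with the larger variance estimate in the numerator (assumed)."""
    v1 = variance_estimate(sample1)
    v2 = variance_estimate(sample2)
    if v1 >= v2:
        return v1 / v2, (len(sample1) - 1, len(sample2) - 1)
    return v2 / v1, (len(sample2) - 1, len(sample1) - 1)

sample_a = [24, 27, 26, 21, 25]
sample_b = [27, 30, 28, 31, 22, 36]
f_value, dof = f_ratio(sample_a, sample_b)
print(round(f_value, 4), dof)
```

The returned pair of degrees of freedom is ordered as (numerator, denominator), which is the order needed to look up the tabulated F value.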
Problems
= 1.057,Tabulated Value
Sample   Size   Sample mean   Sum of squares of deviations from the mean
1        10     15            90
2        12     14            108
16 26 27 23 22
Method II: 27 33 42 35 32 34 38
= 4.95 ( at 5% level
Sample A: 24 27 26 21 25
Sample B: 27 30 28 31 22 36
5. Pumpkins were grown under two experimental conditions. Two random
30 32 33 33 29 34
Horse B: 30 30 24 27 29 --- 29
$\chi^2 = \sum (O - E)^2 / E$
where O is the observed frequency and E is the expected frequency.
EXPECTED FREQUENCIES
E(a) = (a+c)(a+b)/N    E(b) = (b+d)(a+b)/N    a+b
E(c) = (a+c)(c+d)/N    E(d) = (b+d)(c+d)/N    c+d
a+c                    b+d                    N (total frequency)
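The expected-frequency rule for a 2 x 2 table (row total times column total, divided by N) can be sketched in Python as below. The cell counts a = 10, b = 20, c = 30, d = 40 are hypothetical values chosen only for illustration, and the function name is likewise an assumption.

```python
def expected_2x2(a, b, c, d):
    """Expected frequencies for a 2 x 2 table: (row total)(column total) / N."""
    n = a + b + c + d
    e_a = (a + c) * (a + b) / n
    e_b = (b + d) * (a + b) / n
    e_c = (a + c) * (c + d) / n
    e_d = (b + d) * (c + d) / n
    return e_a, e_b, e_c, e_d

# Hypothetical observed counts, not taken from any problem in this question bank
observed = (10, 20, 30, 40)
expected = expected_2x2(*observed)
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(expected, round(chi_square, 4))
```

The expected frequencies always add up to N, which is a quick check on the arithmetic.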
PROBLEMS
1. The number of accidents per week are as follows:
12, 8, 20, 2, 14, 10, 15, 6, 9, 4. Are these frequencies in agreement with the
belief that the accident conditions were the same during this 10 week
period? (Calculated value CV = 26.6, tabulated value TV = 16.9 at 9 df;
H0 rejected)
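The calculated value for the accidents problem can be reproduced with a short Python sketch. It assumes, as the null hypothesis implies, that the expected frequency is the same for every week (total accidents divided by the number of weeks); the function name is hypothetical.

```python
def chi_square_equal_expected(observed):
    """Chi-square goodness of fit when every class has the same expected frequency."""
    expected = sum(observed) / len(observed)
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    return chi2, len(observed) - 1  # statistic and degrees of freedom

accidents = [12, 8, 20, 2, 14, 10, 15, 6, 9, 4]
chi2, dof = chi_square_equal_expected(accidents)
print(round(chi2, 1), dof)
```

Here the expected frequency is 100 / 10 = 10 per week, and the statistic comes to about 26.6 with 9 degrees of freedom, in line with the value quoted in the problem.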
2.
3.
Digits: 0,1,2,3,4,5,6,7,8,9
Frequency : 1026,1107,997,966,1075,933,1197,972,964,853
(CV= 58.5442,TV=16.919 for 9 df at 5%)
4. Theory predicts that the proportion of beans in the four groups A, B, C, D
should be 9 : 3 : 3 : 1. In an experiment with 1600 beans the numbers in the
four groups were 882, 313, 287, 118. Does the experimental result support the
theory? (CV = 4.7267, TV = 7.815 for 3 df at 5%; accepted)
5. A die is thrown 264 times with the following results. Show that the die is
biased. (CV = 17.6362, TV = 11.07 for 5 df at 5%; rejected)
No. appeared on the die:   1    2    3    4    5    6
Frequency:                 40   32   28   58   54   52
Digit:       0    1    2    3    4    5    6    7    8    9
Frequency:   18   19   23   21   16   25   22   20   21   15
Use the $\chi^2$ test to assess the correctness of the hypothesis that the
digits were distributed in equal numbers in the tables from which these were
chosen.
Solution: Calculated $\chi^2$ value = 4.3, Tabulated value = 16.919 (at 5%
level of significance with 9 degrees of freedom). Calculated value < Tabulated
value, accept H0 (null hypothesis).
Tabulated value, reject H0 (null hypothesis)
8. Given the following contingency table for hair colour and eye colour, find
the value of chi-square. Is there any good association between the two?
                  HAIR COLOUR
EYE COLOUR   FAIR   BROWN   BLACK   TOTAL
BLUE         15     5       20      40
GREY         20     10      20      50
BROWN        25     15      20      60
TOTAL        60     30      60      150
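One way to evaluate chi-square for the hair colour and eye colour table is sketched below in Python. The Blue/Brown cell is taken as 5, the value implied by the row total 40 and the column total 30; the function name is hypothetical, and the degrees of freedom use the usual (rows - 1)(columns - 1) rule.

```python
def chi_square_contingency(table):
    """Chi-square statistic for an r x c table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency = (row total)(column total) / N
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof

# Rows: blue, grey, brown eyes; columns: fair, brown, black hair
hair_eye = [
    [15, 5, 20],
    [20, 10, 20],
    [25, 15, 20],
]
chi2, dof = chi_square_contingency(hair_eye)
print(round(chi2, 4), dof)
```

The statistic comes to roughly 3.65 with 4 degrees of freedom, which would be compared with the tabulated chi-square value at the chosen level of significance.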
LARGE SAMPLES
TEST OF SIGNIFICANCE OF LARGE SAMPLES
If the size of the sample n > 30, the sample is called a large sample.
1. Test of significance for single proportion
Let p be the sample proportion and P be the population proportion. We use
the statistic $Z = (p - P) / \sqrt{PQ/n}$, where $Q = 1 - P$.
Limits for the population proportion P are given by $p \pm 3\sqrt{pq/n}$,
where $q = 1 - p$.
2. Test of significance for difference of proportions
Let n1 and n2 be the two sample sizes and p1 and p2 the sample proportions.
$Z = (p_1 - p_2) / \sqrt{pq\,(1/n_1 + 1/n_2)}$
where $p = (n_1 p_1 + n_2 p_2) / (n_1 + n_2)$ and $q = 1 - p$.
The values $\bar{x} \pm 1.96\,(\sigma/\sqrt{n})$ are called the 95% confidence
limits for the mean of the population corresponding to the given sample.
The values $\bar{x} \pm 2.58\,(\sigma/\sqrt{n})$ are called the 99% confidence
limits for the mean of the population corresponding to the given sample.
4. Test of significance for difference of means
$Z = (\bar{x}_1 - \bar{x}_2) / \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
1. Write down the test statistic for single mean for large samples.
$Z = (\bar{X} - \mu) / (\sigma/\sqrt{n})$, where $\bar{X}$ is the sample mean,
$\mu$ is the population mean, $\sigma$ is the population S.D. and n is the
sample size.
2. The mean score of a random sample of 60 students is 145 with an S.D. of 40.
Find the 95% confidence limits for the population mean.
Solution: The limits are $\bar{X} \pm 1.96\,(\sigma/\sqrt{n})$.
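To carry the confidence-limit formula through to numbers for this problem, a small Python sketch is given below. The function name is hypothetical, and the sample S.D. of 40 is used in place of the population S.D., which is the usual large-sample assumption.

```python
import math

def confidence_limits_mean(sample_mean, sd, n, z=1.96):
    """Large-sample confidence limits for the mean: mean +/- z * sd / sqrt(n)."""
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Problem data: n = 60, mean = 145, S.D. = 40, z = 1.96 for 95% limits
lower, upper = confidence_limits_mean(145, 40, 60)
print(round(lower, 2), round(upper, 2))
```

This gives limits of roughly 134.88 and 155.12. Passing `z=2.58` would give the 99% limits instead.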
6. A random sample of 500 apples was taken from a large consignment and
65 were found to be bad. Find the limits for the percentage of bad apples in
the consignment.
Solution: (0.175, 0.085). Hence the percentage of bad apples in the
consignment lies between 8.5% and 17.5%.
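The limits quoted for the apples problem follow from $p \pm 3\sqrt{pq/n}$ with p = 65/500. The Python sketch below reproduces them; the function name is hypothetical and the multiplier 3 is the one used for proportion limits in this question bank.

```python
import math

def proportion_limits(successes, n, multiplier=3):
    """Limits for a population proportion: p +/- multiplier * sqrt(p * q / n)."""
    p = successes / n
    q = 1 - p
    margin = multiplier * math.sqrt(p * q / n)
    return p - margin, p + margin

# Apples problem: 65 bad apples in a sample of 500
lower, upper = proportion_limits(65, 500)
print(round(lower, 3), round(upper, 3))
```

The result is approximately 0.085 and 0.175, that is, between 8.5% and 17.5% bad apples.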
Difference of Proportions:
1.
of the proposal at 5%
( CV=1.28,CV1.96, accept)
2. Before an increase in excise duty on tea, 800 persons out of a sample of
1000 persons were found to be tea drinkers. After an increase in duty
800 people were tea drinkers in the sample of 1200 people. Using
standard error of proportions state whether there is a significant
decrease in the consumption of tea after the increase in the excise duty.
Solution: Calculated Z value = 6.972, Tabulated value at 5% (one tail) =
1.645. Calculated value > Tabulated value, reject H0 (null hypothesis).
3. In two large populations there are 30% and 25% respectively of fair-haired
people. Is this difference likely to be hidden in samples of 1200 and 900
respectively from the two populations?
Solution: Calculated Z value = 2.55, Tabulated value at 5% = 1.96.
Calculated value > Tabulated value, reject H0 (null hypothesis).
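The Z value in the fair-haired problem can be checked with the sketch below. Because the two population proportions are given, the code assumes the unpooled standard error $\sqrt{P_1Q_1/n_1 + P_2Q_2/n_2}$ rather than the pooled form. The function name is hypothetical.

```python
import math

def z_known_proportions(p1, p2, n1, n2):
    """Z for a difference of proportions when both population proportions are known."""
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / standard_error

# Fair-haired problem: P1 = 0.30, P2 = 0.25, samples of 1200 and 900
z_value = z_known_proportions(0.30, 0.25, 1200, 900)
print(round(z_value, 2))
```

The value is about 2.55, which exceeds 1.96, so the difference is unlikely to be hidden in samples of these sizes.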
                16-20   21-25   26-30   31-35   36-40
No of persons   12      22      20      30      16