Sta 121 Slides
Olalekan Obisesan, Ph.D & Oladapo Oladoja, MSc STATISTICAL INFERENCE (STA 121)
COURSE OUTLINE
MODULE 1: INTRODUCTION TO STATISTICAL INFERENCE
Meaning of Statistics
Statistics can simply be defined as the "science of data".
It is the science of collecting, organizing and interpreting numerical
facts, which we call data.
Branches of Statistics
Descriptive Statistics
Collection, summarization and presentation of numerical information in the form of
reports, charts and diagrams.
Statistical Method
A device for classifying data and making clear the relationships between the variables under
consideration, using statistical tools and formulae.
Inferential Statistics
Makes use of information from a sample to draw conclusions (inferences) about
the population from which the sample was taken.
Population
Population
Group of individuals or items to whom the conclusions of a study or experiment
apply.
Finite Population
A population is said to be finite if it consists of a finite or fixed number of elements
(items, objects, and measurements or observations).
Infinite Population
A population is said to be infinite if there is (at least hypothetically) no limit to the
number of elements it can contain. For example, the set of all possible rolls of a pair of dice is
an infinite population, since there is no limit to the number of times the dice can be rolled.
Sampling
Sample
A representative part of a population which can be random or purposive.
Advantages of sampling
Low cost of sampling.
Less time consuming in sampling.
Scope of sampling is high.
Accuracy of data is high.
Organization of convenience.
Intensive and exhaustive data.
Suitable in limited resources.
Better rapport.
Disadvantages of sampling
Chances of bias
Difficulties in selecting a truly representative sample
Inadequate knowledge in the subject.
Changeability of units.
Impossibility of sampling.
Practice Question
1 What is the meaning of Statistics?
2 Discuss the types of Statistics.
3 Explain population.
MODULE 2: SAMPLING THEORY
Sampling Techniques
Sampling is concerned with the selection of a subset of individuals from within a
statistical population to estimate characteristics of the whole population.
Probability and Non Probability Sampling
Probability/Random Sampling
A sampling technique in which samples from a larger population are chosen using a
method based on the theory of probability. Examples are Simple Random
Sampling, Systematic Sampling, Cluster Sampling, Stratified Sampling and
Multistage Sampling.
Sampling Distribution
This is the distribution of a statistic, which varies from sample to sample, obtained by
considering all the possible samples of size n that can be drawn from a given
population (either with or without replacement).
Sampling Distribution of Means
Suppose that all possible samples of size n are drawn without replacement from a
finite population of size N. If we denote the mean and standard deviation
of the sampling distribution of means by $\mu_{\bar{x}}$ and $\sigma_{\bar{x}}$ and the population mean and
standard deviation by $\mu$ and $\sigma$, respectively, then
$$\mu_{\bar{x}} = \mu \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$
Sampling Distribution of Proportions
Sampling Distribution of Differences and Sums
Given two populations, for samples of sizes $n_1$ and $n_2$ drawn from the two
populations, we compute statistics $S_1$ and $S_2$ respectively. Each statistic has a sampling
distribution, whose means are denoted by $\mu_{S_1}$, $\mu_{S_2}$ and whose standard deviations
are denoted by $\sigma_{S_1}$, $\sigma_{S_2}$. From all possible combinations of these
samples from the two populations, we obtain a distribution of the differences
$S_1 - S_2$, which is called the sampling distribution of the difference of the statistics. The
mean and standard deviation of this sampling distribution, denoted by $\mu_{S_1-S_2}$ and
$\sigma_{S_1-S_2}$, are given (provided the samples are independent) by
$$\mu_{S_1-S_2} = \mu_{S_1} - \mu_{S_2} \quad\text{and}\quad \sigma_{S_1-S_2} = \sqrt{\sigma_{S_1}^2 + \sigma_{S_2}^2}$$
Solved Problems
Illustration 1
A population consists of the five numbers 2, 3, 6, 8 and 11. Consider all possible
samples of size 2 that can be drawn with replacement from this population. Find
(a) the mean of the population, (b) the standard deviation of the population, (c) the
mean of the sampling distribution of means and (d) the standard deviation of the
sampling distribution of means (i.e., the standard error of means).
Solution
a) The Population Mean
$$\mu = \frac{\sum X_i}{N} = \frac{2 + 3 + 6 + 8 + 11}{5} = \frac{30}{5} = 6.0$$
b) The Population Standard Deviation
$$\sigma^2 = \frac{\sum (X_i - \mu)^2}{N} = \frac{(2-6)^2 + (3-6)^2 + (6-6)^2 + (8-6)^2 + (11-6)^2}{5} = \frac{54}{5} = 10.8$$
so that $\sigma = \sqrt{10.8} = 3.29$.
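c) There are 5(5) = 25 samples of size 2 that can be drawn with replacement.
The corresponding sample means are
2.0, 2.5, 4.0, 5.0, 6.5, 2.5, 3.0, 4.5, 5.5, 7.0, 4.0, 4.5,
6.0, 7.0, 8.5, 5.0, 5.5, 7.0, 8.0, 9.5, 6.5, 7.0, 8.5, 9.5,
11.0.
These 25 means add up to 150, so the mean of the sampling distribution of means is
$\mu_{\bar{x}} = 150/25 = 6.0$, which equals the population mean $\mu$.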
d) The variance $\sigma_{\bar{x}}^2$ of the sampling distribution of means is obtained by subtracting the mean
6 from each of the 25 sample means, squaring the results, adding them together and
dividing the total by 25:
$$\sigma_{\bar{x}}^2 = \frac{135}{25} = 5.40$$
Thus $\sigma_{\bar{x}} = \sqrt{5.40} = 2.32$.
This illustrates the fact that for finite populations involving sampling with
replacement (or for infinite populations), $\sigma_{\bar{x}}^2 = \sigma^2 / n$; here $10.8 / 2 = 5.40$.
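The enumeration in Illustration 1 can be checked by brute force. The sketch below is only an illustration (the variable names are ours, not part of the course material): it lists every ordered sample of size 2 drawn with replacement and compares the variance of the sample means with $\sigma^2/n$.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 3, 6, 8, 11]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement: 5 * 5 = 25 samples.
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu_xbar = mean(sample_means)        # mean of the sampling distribution of means
var_xbar = pvariance(sample_means)  # variance of the sampling distribution of means
sigma2 = pvariance(population)      # population variance

print(len(sample_means), mu_xbar, var_xbar, sigma2 / n)
```

The last two printed values should agree, which is the relation $\sigma_{\bar{x}}^2 = \sigma^2/n$ stated above.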
Illustration 2
Solve Illustration 1 for the case that the sampling is without replacement.
Solution
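(a), (b) As in parts (a) and (b) of Illustration 1, $\mu = 6$ and $\sigma = 3.29$.
(c) There are $\binom{5}{2} = 10$ samples of size 2 that can be drawn without replacement. Their means are
2.5, 4.0, 5.0, 6.5, 4.5, 5.5, 7.0, 7.0, 8.5, 9.5, so the mean of the sampling distribution of means is
$$\mu_{\bar{X}} = \frac{2.5 + 4.0 + 5.0 + 6.5 + 4.5 + 5.5 + 7.0 + 7.0 + 8.5 + 9.5}{10} = 6.0$$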
Solved Problems
Illustration 3
Assume that the heights of 3000 soccer players in a tournament are normally
distributed with mean 68.0 inches (in) and standard deviation 3.0in. If 80 samples
consisting of 25 players each are obtained, what would be the expected mean and
standard deviation of the resulting sampling distribution of means if the sampling
were done (a) with replacement and (b) without replacement?
Solution
(a)
$$\mu_{\bar{x}} = \mu = 68.0\ \text{in} \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3}{\sqrt{25}} = 0.6\ \text{in}$$
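(b)
$$\mu_{\bar{x}} = 68.0\ \text{in} \quad\text{and}\quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{3}{\sqrt{25}}\sqrt{\frac{3000-25}{3000-1}} = 0.598\ \text{in}$$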
Illustration 4
In how many samples of Illustration 3 would you expect to find the mean
(a) between 66.8 and 68.3 in and (b) less than 66.4 in?
Solution
The mean $\bar{x}$ of a sample in standard units is here given by
$$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - 68.0}{0.6}$$
(a)
$$P(66.8 \le \bar{X} \le 68.3) = P\left(\frac{66.8 - 68.0}{0.6} \le Z \le \frac{68.3 - 68.0}{0.6}\right)$$
$$= \phi\left(\frac{68.3 - 68.0}{0.6}\right) - \phi\left(\frac{66.8 - 68.0}{0.6}\right)$$
$$= \phi(0.5) - \phi(-2.0)$$
$$= \phi(0.5) - 1 + \phi(2.0)$$
$$= 0.6915 + 0.9772 - 1 = 0.6687$$
Therefore, the expected number of samples is (80)(0.6687) = 53.5, or about 53.
(b)
$$P(\bar{x} \le 66.4) = P\left(Z \le \frac{66.4 - 68.0}{0.6}\right) = \phi\left(\frac{66.4 - 68.0}{0.6}\right)$$
$$= \phi(-2.67) = 1 - \phi(2.67)$$
$$= 1 - 0.9962 = 0.0038$$
Thus, the expected number of samples is (80)(0.0038) = 0.3, which is effectively zero.
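As a cross-check of Illustration 4, the two probabilities can be evaluated without tables. This is a minimal sketch that assumes Python's standard `statistics.NormalDist` and uses the with-replacement standard error 0.6 in, as the solution does; the variable names are illustrative only.

```python
from statistics import NormalDist

mu, sigma, n, num_samples = 68.0, 3.0, 25, 80

# Sampling distribution of the mean: normal with mean mu and standard error sigma / sqrt(n).
xbar_dist = NormalDist(mu, sigma / n ** 0.5)

p_a = xbar_dist.cdf(68.3) - xbar_dist.cdf(66.8)  # P(66.8 <= mean <= 68.3)
p_b = xbar_dist.cdf(66.4)                        # P(mean <= 66.4)

print(round(p_a, 4), round(num_samples * p_a, 1))
print(round(p_b, 4), round(num_samples * p_b, 1))
```

Small differences from the table-based figures are expected, because the tables round z to two decimal places.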
Illustration 5
Find the probability that in 120 tosses of a fair coin (a) less than 40% or more than
60% will be heads and (b) 58 or more will be heads
Solution
We consider the 120 tosses of the coin to be a sample from the infinite population
of all possible tosses of the coin. In this population the probability of heads is
$p = \frac{1}{2}$ and the probability of tails is $q = 1 - p = \frac{1}{2}$.
(a) Using the normal approximation to the binomial, we require the probability that the number of heads in
120 tosses will be less than 48 or more than 72. Since the number of heads is a
discrete variable, we ask for the probability that the number of heads is less than
47.5 or greater than 72.5.
The number of heads has mean $\mu = np = 120\left(\frac{1}{2}\right) = 60$ and standard deviation
$\sigma = \sqrt{npq} = \sqrt{30} = 5.48$, so 47.5 and 72.5 correspond to $z = -2.28$ and $z = 2.28$
in standard units. The required probability is
$$\phi(-2.28) + 1 - \phi(2.28)$$
$$= 1 - \phi(2.28) + 1 - \phi(2.28)$$
$$= 2 - 2\phi(2.28)$$
$$= 2 - 2(0.9887) = 2 - 1.9774 = 0.0226$$
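Part (a) of Illustration 5 can also be compared with the exact binomial probability. The sketch below is an illustration under the same assumptions as the solution (fair coin, continuity correction at 47.5 and 72.5); the names `approx` and `exact` are ours.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 120, 0.5
mu = n * p                     # mean number of heads
sigma = sqrt(n * p * (1 - p))  # standard deviation of the number of heads

std_normal = NormalDist()
# Normal approximation with continuity correction.
approx = std_normal.cdf((47.5 - mu) / sigma) + 1 - std_normal.cdf((72.5 - mu) / sigma)

# Exact binomial probability of fewer than 48 or more than 72 heads.
exact = sum(comb(n, k) for k in range(n + 1) if k < 48 or k > 72) / 2 ** n

print(round(approx, 4), round(exact, 4))
```

The two values should be close, which is what justifies using the normal approximation for a sample of this size.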
Solved Problems
Illustration 6
The solar light bulbs of company K have a mean life time of 1400 hours (h) with a
standard deviation of 200 h, while those of company L have a mean lifetime of
1200 h with a standard deviation of 100 h. If random samples of 125 bulbs of each
brand are tested, what is the probability that the brand K bulbs will have a mean
life time that is at least (a) 160 h and (b) 250 h more than the brand L bulbs?
Solution
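Let $\bar{x}_K$ and $\bar{x}_L$ denote the mean lifetimes of samples K and L, respectively. Then
$$\mu_{\bar{x}_K - \bar{x}_L} = \mu_{\bar{x}_K} - \mu_{\bar{x}_L} = 1400 - 1200 = 200\ \text{h}$$
and
$$\sigma_{\bar{x}_K - \bar{x}_L} = \sqrt{\frac{\sigma_K^2}{n_K} + \frac{\sigma_L^2}{n_L}} = \sqrt{\frac{(200)^2}{125} + \frac{(100)^2}{125}} = 20\ \text{h}$$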
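The two probabilities asked for in Illustration 6 follow from the standard error above. The sketch below assumes that the difference of the sample means is approximately normal (reasonable for samples of 125 bulbs); it is a worked check with illustrative variable names, not a reproduction of the slide solution.

```python
from math import sqrt
from statistics import NormalDist

mu_k, sigma_k, n_k = 1400, 200, 125  # brand K
mu_l, sigma_l, n_l = 1200, 100, 125  # brand L

mu_diff = mu_k - mu_l
sigma_diff = sqrt(sigma_k ** 2 / n_k + sigma_l ** 2 / n_l)

# Assumed normal sampling distribution of the difference of the two sample means.
diff_dist = NormalDist(mu_diff, sigma_diff)

p_a = 1 - diff_dist.cdf(160)  # K outlasts L by at least 160 h on average
p_b = 1 - diff_dist.cdf(250)  # K outlasts L by at least 250 h on average

print(sigma_diff, round(p_a, 4), round(p_b, 4))
```

In standard units the two thresholds are $z = (160 - 200)/20 = -2$ and $z = (250 - 200)/20 = 2.5$.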
MODULE 3: STATISTICAL ESTIMATION THEORY
Estimation
Estimation is the process by which a statistic obtained from a sample is used to
estimate a parameter of the population from which the sample was drawn.
Statistical Estimation Theory
Unbiased Estimates
If the mean of the sampling distribution of a statistic equals the corresponding
population parameter, the statistic is called an unbiased estimator of the
parameter; otherwise, it is called a biased estimator.
Efficient Estimates
If the sampling distributions of two statistics have the same mean (or expectation),
then the statistic with the smaller variance is called an efficient estimator of the
mean, while the other statistic is called an inefficient estimator.
Confidence Intervals for Proportions
Confidence Intervals for Differences and Sums
Solved Problems
Illustration 1
In a sample of five measurements, the diameter of a sphere was recorded by a
student in a laboratory as 6.33, 6.37, 6.36, 6.32, and 6.37 centimeters (cm).
Determine unbiased and efficient estimates of (a) the true mean and (b) the true
variance.
Solution
(a) The unbiased and efficient estimate of the true mean (i.e., the population
mean) is
$$\bar{X} = \frac{\sum X}{N} = \frac{6.33 + 6.37 + 6.36 + 6.32 + 6.37}{5} = 6.35\ \text{cm}$$
(b) The unbiased and efficient estimate of the true variance (i.e., the population
variance) is
$$\hat{s}^2 = \frac{N}{N-1}\,s^2 = \frac{\sum (X - \bar{X})^2}{N-1}$$
$$= \frac{(6.33 - 6.35)^2 + (6.37 - 6.35)^2 + \cdots + (6.37 - 6.35)^2}{5 - 1}$$
$$= 0.00055\ \text{cm}^2$$
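The two estimates in Illustration 1 map directly onto Python's standard library, as the sketch below shows. Here `statistics.variance` divides by N - 1 (the unbiased estimate), while `statistics.pvariance` divides by N; the variable names are illustrative.

```python
from statistics import mean, pvariance, variance

diameters = [6.33, 6.37, 6.36, 6.32, 6.37]  # measurements in cm
n_obs = len(diameters)

mean_hat = mean(diameters)    # unbiased estimate of the true mean
var_hat = variance(diameters) # unbiased estimate of the true variance (divides by N - 1)
s2 = pvariance(diameters)     # sample variance s^2 (divides by N)

print(round(mean_hat, 2), round(var_hat, 5), round(n_obs / (n_obs - 1) * s2, 5))
```

The last two printed values coincide, which is the identity $\hat{s}^2 = \frac{N}{N-1}s^2$ used above.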
Solved Problems
Illustration 2
The standard deviation of the life span of bulbs manufactured by AYZ Solar Company is 5.6 days. The
mean life span of 64 bulbs randomly selected from the lot is 60 days.
(i) Construct the 95% confidence limits for the mean life span of the bulbs.
(ii) What is the minimum sample size required so that the error does
not exceed 0.5?
Solution
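(i) $\sigma = 5.6$, $\bar{x} = 60$, $n = 64$. The confidence level is 95% = 0.95, so $\alpha = 1 - 0.95 = 0.05$ and $\alpha/2 = 0.025$.
From the normal distribution table, $z_{\alpha/2} = 1.96$.
The confidence interval is
$$\text{C.I.} = \bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
$$= 60 \pm 1.96 \times \frac{5.6}{\sqrt{64}}$$
$$= 60 \pm 1.372$$
$$= [60 - 1.372,\ 60 + 1.372]$$
$$= [58.628,\ 61.372]$$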
(ii) The error is $E = z_{\alpha/2}\,\sigma/\sqrt{n}$. Requiring $E \le 0.5$ gives
$$n \ge \left(\frac{z_{\alpha/2}\,\sigma}{E}\right)^2 = \left(\frac{1.96 \times 5.6}{0.5}\right)^2 = 481.89$$
Therefore, for the error not to exceed 0.5, the minimum sample size to be
selected is 482.
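Both parts of Illustration 2 can be reproduced in a few lines. The sketch below assumes the standard library's `NormalDist.inv_cdf` for the critical value instead of a printed table; the names `margin` and `n_min` are ours.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma, xbar, n, confidence = 5.6, 60.0, 64, 0.95
max_error = 0.5

# Two-sided critical value z_{alpha/2}; approximately 1.96 for 95% confidence.
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

margin = z_crit * sigma / sqrt(n)
lower, upper = xbar - margin, xbar + margin

# Smallest sample size whose margin of error does not exceed max_error.
n_min = ceil((z_crit * sigma / max_error) ** 2)

print(round(lower, 3), round(upper, 3), n_min)
```

Rounding the sample size up, rather than to the nearest integer, guarantees that the error bound is actually met.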
Illustration 3
An animal scientist is studying the effect of a new substance added to the diet of
chinchilla rabbits on their weights over a one-month period. The results for a sample of 7
rabbits are shown in the table below.
Original Weights (Wo)   56.1   22.2   50.1   39.5   30.3   40.2    7.4
New Weights (Wn)        99.5   52.3   87.4   78.2   68.2   86.9   29.5
Find a 95% symmetric confidence interval for the mean weight gained, assuming that the
weight gains are normally distributed.
Solution
Wo       Wn       x = Wn − Wo    x²
56.1     99.5     43.4           1883.56
22.2     52.3     30.1            906.01
50.1     87.4     37.3           1391.29
39.5     78.2     38.7           1497.69
30.3     68.2     37.9           1436.41
40.2     86.9     46.7           2180.89
 7.4     29.5     22.1            488.41
Total             256.2          9784.26
where, x is the gain in weight over a period of 1 month. The distribution is normal,
the variance is unknown and n = 7(< 30), then we use t value instead of Z value.
Sample mean
$$\bar{x} = \frac{\sum x}{n} = \frac{256.2}{7} = 36.6$$
Sample variance
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{9784.26 - \frac{(256.2)^2}{7}}{7-1} = 67.89$$
$$s = 8.24$$
Confidence Interval
$$= \bar{x} \pm t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$$
$$= 36.6 \pm 2.447 \times \frac{8.24}{\sqrt{7}}$$
$$= 36.6 \pm 7.62$$
$$= [36.6 - 7.62,\ 36.6 + 7.62]$$
$$= [28.98,\ 44.22]$$
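The interval in Illustration 3 can be recomputed from the raw weights. The sketch below uses the weights from the solution table and takes the critical value 2.447 (t with 6 degrees of freedom, 2.5% in each tail) from the slides, because the Python standard library has no t-distribution quantile function; all variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

original = [56.1, 22.2, 50.1, 39.5, 30.3, 40.2, 7.4]
new = [99.5, 52.3, 87.4, 78.2, 68.2, 86.9, 29.5]

gains = [w_n - w_o for w_o, w_n in zip(original, new)]  # x = Wn - Wo

t_crit = 2.447  # table value t_{0.025, 6}, taken from the slides
gain_mean = mean(gains)
gain_sd = stdev(gains)  # sample standard deviation (divides by n - 1)
margin = t_crit * gain_sd / sqrt(len(gains))

print(round(gain_mean, 1), round(gain_sd, 2), round(margin, 2))
print(round(gain_mean - margin, 2), round(gain_mean + margin, 2))
```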
Solved Problems
Illustration 4
A random sample of 50 Statistics grades out of a total of 200 showed a mean of
75 and a standard deviation of 10.
(i) What are the 95% confidence limits for estimates of the mean of the 200
grades?
(ii) With what degree of confidence could we say that the mean of all 200 grades
is 75 ± 1?
Solution
(i) Since the population size is not very large compared with the sample size, we
must adjust for it. Using the sample standard deviation as an estimate of $\sigma$, the 95% confidence limits are
$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm 1.96\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$
$$= 75 \pm 1.96\,\frac{10}{\sqrt{50}}\sqrt{\frac{200-50}{200-1}}$$
$$= 75 \pm 2.4$$
$$= [75 - 2.4,\ 75 + 2.4]$$
$$= [72.6,\ 77.4]$$
(ii) The confidence limits can be represented by
$$\bar{x} \pm z_c\,\sigma_{\bar{x}} = \bar{x} \pm z_c\,\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$$
$$= 75 \pm z_c\,\frac{10}{\sqrt{50}}\sqrt{\frac{200-50}{200-1}}$$
$$= 75 \pm 1.23\,z_c$$
Since this must equal $75 \pm 1$, we have $1.23\,z_c = 1$, or $z_c = 0.81$. The area under
the normal curve from $z = 0$ to $z = 0.81$ is $0.7910 - 0.50 = 0.2910$, hence the
required degree of confidence is $2(0.2910) = 0.582$, or 58.2%.
MODULE 4: STATISTICAL HYPOTHESIS TESTING
Hypothesis
A hypothesis is an idea that is based on known facts and is used for further
reasoning or investigation.
Statistical Hypothesis
A statistical hypothesis is an assertion or conjecture concerning one or more
populations which may be true or false.
Types of Hypothesis
Null Hypothesis (H0 ): a statistical hypothesis that states no difference.
Alternative Hypothesis (H1 ): a statistical hypothesis that states the existence of a
difference.
Statistical Hypothesis Testing
Full Specification of Hypothesis Test
H0 : µ = k ; H1 : µ < k (one sided) ⇒ lower tail or left tailed test
H0 : µ = k ; H1 : µ > k (one sided) ⇒ upper tail or right tailed test
H0 : µ = k ; H1 : µ ≠ k (two sided) ⇒ two tail test
Very Important
Note that failure to reject H0 does not mean the null hypothesis is true. There is no
formal outcome that says "accept H0 ." It only means that we do not have sufficient
evidence to support H1 .
Statistical Test
This uses the data obtained from a sample to make a decision about whether the
null hypothesis should be rejected.
Test Statistic
The numerical value obtained from a statistical test. Below are the distributions
and their respective conditions
Case 1
Test for mean, known variance, normal distribution or large sample i.e. n > 30
$$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
Case 2
Test for mean, large sample, unknown variance.
$$Z = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
Case 3
Test for mean, unknown variance, small sample i.e. n < 30
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
Case 4
Test for proportion, large sample
$$Z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}}$$
Critical Region
This is the set of values of the test statistic (a region of the real number line) that leads to the rejection of H0 in favour of H1.
Type I Error
This is the error of rejecting the null hypothesis when it is true.
Type II Error
This is the error of not rejecting the null hypothesis when it is false.
Level of Significance
This is the maximum probability of committing a Type I error. It is usually denoted by
α. The probability of committing a Type II error is denoted by β.
Power of a test
This is the probability of rejecting H0 given that the specific alternative hypothesis
is true. That is Power = 1 − β.
Illustration 1
A random sample of size 35 selected from a population whose distribution is
normal with mean µ and variance 36 gives a sample mean of 48. Test the hypothesis
H0 : µ = 50 against H1 : µ < 50 at the 5% level of significance.
Solution
H0 : µ = 50
H1 : µ < 50
Here, n > 30 (n = 35) and the variance is known: $\sigma^2 = 36 \Rightarrow \sigma = 6$.
$$Z_{calc} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{48 - 50}{6/\sqrt{35}} = -1.972$$
α = 5% = 0.05. Since it is a lower tail test, $Z_{tab} = -1.645$.
Conclusion: Since the test statistic $Z_{calc} = -1.972$ is less than $-1.645$, it falls within the rejection region. The null
hypothesis is rejected and we conclude that µ < 50.
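The lower tail Z test of Illustration 1 can be written as a small helper. The sketch below is illustrative only: the function name `z_test_lower` is ours, and the critical value comes from `NormalDist.inv_cdf` rather than a printed table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_lower(xbar, mu0, sigma, n, alpha=0.05):
    """Lower tail Z test of H0: mu = mu0 against H1: mu < mu0 with known sigma."""
    z_calc = (xbar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(alpha)  # about -1.645 when alpha = 0.05
    return z_calc, z_crit, z_calc < z_crit


z_calc, z_crit, reject = z_test_lower(xbar=48, mu0=50, sigma=6, n=35)
print(round(z_calc, 3), round(z_crit, 3), reject)
```

The same helper applies to any lower tail test with known variance, for example `z_test_lower(470, 480, 25, 100)`.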
Illustration 2
A quality control engineer finds that a sample of 100 light bulbs had an average
life-time of 470 hours. Assuming a population standard deviation of σ = 25hours,
test whether the population mean is 480 hours against the alternative hypothesis
µ < 480 at a significance level of α = 0.05
Solution
H0 : µ = 480
H1 : µ < 480
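n = 100 and the population standard deviation is known: σ = 25.
$$Z_{calc} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{470 - 480}{25/\sqrt{100}} = -4.0$$
α = 0.05. Since it is a lower tail test, $Z_{tab} = -1.645$.
Conclusion: Since $Z_{calc} = -4.0$ is less than $-1.645$, it falls within the rejection region. The null
hypothesis is rejected in favour of µ < 480.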
Illustration 3
The times taken to shave the hair from people's heads were recorded by a hair
stylist. The mean time is µ minutes and the standard deviation is
σ. For five individuals, the times taken for shaving were 3.52, 5.40, 4.33,
3.20 and 2.50 minutes. Test at the 5% significance level whether the mean time is equal to
3.45 minutes or not.
Solution
H0 : µ = 3.45
H1 : µ ≠ 3.45
n = 5 (small sample), variance unknown.
$$\sum X = 3.52 + 5.40 + 4.33 + 3.20 + 2.50 = 18.95$$
$$\sum X^2 = 3.52^2 + 5.40^2 + 4.33^2 + 3.20^2 + 2.50^2 = 76.7893$$
$$\bar{X} = \frac{\sum X}{n} = \frac{18.95}{5} = 3.79$$
$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n-1} = \frac{76.7893 - \frac{(18.95)^2}{5}}{5-1} = 1.2422$$
$$s = 1.1145$$
$$t_{calc} = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{3.79 - 3.45}{1.1145/\sqrt{5}} = 0.6822$$
α = 5% = 0.05. Since it is a two tail test, $t_{tab} = t_{\alpha/2,\,n-1} = t_{0.025,\,4} = 2.776$.
Conclusion: Since the test statistic $t_{calc} = 0.6822$ lies between $-2.776$ and $2.776$, it does not fall within the rejection region.
We do not reject the null hypothesis: there is not sufficient evidence that the mean time differs from 3.45 minutes.
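The t statistic of Illustration 3 can be recomputed from the five recorded times. In the sketch below the critical value 2.776 is copied from the slides (t table, 4 degrees of freedom), since the standard library does not provide t quantiles; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

times = [3.52, 5.40, 4.33, 3.20, 2.50]  # shaving times in minutes
mu0 = 3.45                              # hypothesised mean
t_crit = 2.776                          # table value for a two tail test at 5%, 4 d.f.

t_calc = (mean(times) - mu0) / (stdev(times) / sqrt(len(times)))
reject_h0 = abs(t_calc) > t_crit        # two tail rejection rule

print(round(t_calc, 4), reject_h0)
```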
Illustration 4
A batch of 100 resistors has an average resistance of 101.5 Ω. Assuming a population
standard deviation of 5 Ω:
(a) Test whether the population mean is 100 Ω at 0.05 level of significance.
(b) Compute the p-value.
Solution
(a)
H0 : µ = 100
H1 : µ ≠ 100
n = 100 and the population standard deviation is known: σ = 5.
$$Z_{calc} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{101.5 - 100}{5/\sqrt{100}} = 3.0$$
α = 5% = 0.05. Since it is a two tail test, $Z_{tab} = \pm 1.96$.
Conclusion: Since the test statistic $Z_{calc} = 3.0$ exceeds 1.96, it falls within the rejection region. The null
hypothesis is rejected and we conclude that µ ≠ 100.
(b) Since the observed Z value is 3 and the test is two tailed, the p-value is
$$p = 2P(Z > 3) = 2(1 - \phi(3)) = 2(1 - 0.99865) = 0.0027$$
This means that H0 could have been rejected at significance level α = 0.0027,
which is much stronger than rejecting it at 0.05.
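The p-value in part (b) can be obtained directly from the standard normal distribution. This is a minimal sketch using `statistics.NormalDist`; the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu0, sigma, n = 101.5, 100.0, 5.0, 100

z_obs = (xbar - mu0) / (sigma / sqrt(n))

# Two tail p-value: probability of a Z at least as extreme as the one observed.
p_value = 2 * (1 - NormalDist().cdf(abs(z_obs)))

print(z_obs, round(p_value, 4))
```

H0 is rejected at any significance level above the p-value, which is why a small p-value is strong evidence against H0.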
Illustration 5
An educator estimates that the dropout rate for seniors at high schools in Benin
City is 12%. Last year in a random sample of 300 Benin City seniors, 27 withdrew
from school. At α = 0.05, is there enough evidence to reject the educator's claim?
Solution
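H0 : p = 0.12
H1 : p ≠ 0.12
$n = 300$, $\hat{p} = \frac{27}{300} = 0.09$, $q = 1 - p = 0.88$
$$Z_{calc} = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{0.09 - 0.12}{\sqrt{\dfrac{(0.12)(0.88)}{300}}} = -1.60$$
α = 0.05. Since it is a two tail test, $Z_{tab} = \pm 1.96$.
Conclusion: Since the test statistic $Z_{calc} = -1.60$ lies between $-1.96$ and $1.96$, it falls within the non-critical region. We do
not reject the null hypothesis: there is not sufficient evidence to reject
the claim that the dropout rate for seniors at high schools in Benin City is 12%.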
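The proportion test of Illustration 5 follows Case 4 above. The sketch below computes the statistic from the raw counts and compares it with the two tail critical value; the variable names are illustrative and the critical value is taken from `NormalDist.inv_cdf` instead of a table.

```python
from math import sqrt
from statistics import NormalDist

p0 = 0.12           # claimed dropout rate under H0
n, withdrew = 300, 27
alpha = 0.05

p_hat = withdrew / n
z_calc = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

reject_h0 = abs(z_calc) > z_crit
print(round(p_hat, 2), round(z_calc, 2), reject_h0)
```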
MODULE 5: REGRESSION ANALYSIS
Definition
Regression analysis is defined as the analysis of relationships among variables. It
is a statistical tool for the investigation of relationships between variables.
$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_k X_k$$
where
Y = dependent or response variable
X1 , X2 , ..., Xk are called the explanatory or independent variables
β0 , β1 , ..., βk are called the regression coefficients.
$$Y = \beta_0 + \beta_1 X$$
where
$$\beta_1 = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2}$$
and
$$\beta_0 = \bar{Y} - \beta_1 \bar{X}$$
Illustration 1
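The table below shows the heights to the nearest inch (in) and the weights to the
nearest pound (lb) of a sample of planks in a workshop. Find the least-squares regression
line of Y on X, and use it to estimate (i) the weight of a plank whose height is 5 in and
(ii) the height of a plank whose weight is 6 lb.
X (in)   1   3   6   8   9   11   14
Y (lb)   1   2   4   5   7    8    9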
Solution
X      Y      X²     XY
1      1      1      1
3      2      9      6
6      4      36     24
8      5      64     40
9      7      81     63
11     8      121    88
14     9      196    126
ΣX = 52   ΣY = 36   ΣX² = 508   ΣXY = 348
$$\bar{X} = \frac{\sum X}{n} = \frac{52}{7} = 7.43$$
$$\bar{Y} = \frac{\sum Y}{n} = \frac{36}{7} = 5.14$$
$$\beta_1 = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - \left(\sum X\right)^2} = \frac{7(348) - 52(36)}{7(508) - 52^2} = \frac{564}{852} = 0.662$$
$$\beta_0 = \bar{Y} - \beta_1 \bar{X} = \frac{36}{7} - 0.662\left(\frac{52}{7}\right) = 0.225$$
Then, the regression line is
$$Y = 0.225 + 0.662X$$
(i) When X = 5:
$$Y = 0.225 + 0.662(5) = 0.225 + 3.310 = 3.535\ \text{lb}$$
When the height of the plank is 5 in, the estimated weight of the plank is 3.535 lb.
(ii) When Y = 6:
$$6 = 0.225 + 0.662X \quad\Rightarrow\quad X = \frac{5.775}{0.662} = 8.724\ \text{in}$$
When the weight of the plank is 6 lb, the estimated height of the plank is 8.724 in.
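The least-squares formulas for $\beta_1$ and $\beta_0$ translate directly into code. The sketch below applies them to the seven (X, Y) pairs of the solution table; the helper name `fit_line` is ours and the code is an illustration rather than part of the course material.

```python
def fit_line(xs, ys):
    """Return (beta0, beta1) of the least-squares line Y = beta0 + beta1 * X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    beta1 = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    beta0 = sum_y / n - beta1 * sum_x / n  # beta0 = Ybar - beta1 * Xbar
    return beta0, beta1


heights = [1, 3, 6, 8, 9, 11, 14]  # X (in)
weights = [1, 2, 4, 5, 7, 8, 9]    # Y (lb)

beta0, beta1 = fit_line(heights, weights)
weight_at_5 = beta0 + beta1 * 5      # estimated weight at X = 5 in
height_at_6 = (6 - beta0) / beta1    # X at which the fitted line gives Y = 6 lb

print(round(beta0, 3), round(beta1, 3), round(weight_at_5, 3), round(height_at_6, 3))
```

Working from unrounded intermediate values avoids the small discrepancies that appear when the rounded means are substituted by hand.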
Illustration 2
(a) Show that the equation of a straight line that passes through the points (X1 , Y1 )
and (X2 , Y2 ) is given by
$$Y - Y_1 = \frac{Y_2 - Y_1}{X_2 - X_1}\,(X - X_1)$$
(b) Find the equation of a straight line that passes through the points (2, -3) and
(4, 5).
Solution
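(a) The equation of a straight line is
$$Y = \beta_0 + \beta_1 X \qquad (1)$$
Since $(X_1, Y_1)$ and $(X_2, Y_2)$ lie on the line,
$$Y_1 = \beta_0 + \beta_1 X_1 \qquad (2)$$
$$Y_2 = \beta_0 + \beta_1 X_2 \qquad (3)$$
Subtracting (2) from (1) gives
$$Y - Y_1 = \beta_1 (X - X_1) \qquad (4)$$
Subtracting (2) from (3) gives
$$Y_2 - Y_1 = \beta_1 (X_2 - X_1) \quad\text{or}\quad \beta_1 = \frac{Y_2 - Y_1}{X_2 - X_1}$$
Substituting this value of $\beta_1$ into (4) gives
$$Y - Y_1 = \frac{Y_2 - Y_1}{X_2 - X_1}\,(X - X_1)$$
as required.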
(b) Corresponding to the first point (2, -3), we have $X_1 = 2$ and $Y_1 = -3$;
corresponding to the second point (4, 5), we have $X_2 = 4$ and $Y_2 = 5$. Thus the
slope is
$$\beta_1 = \frac{Y_2 - Y_1}{X_2 - X_1} = \frac{5 - (-3)}{4 - 2} = \frac{8}{2} = 4$$
and the required equation is
$$Y - Y_1 = \beta_1 (X - X_1)$$
$$Y - (-3) = 4(X - 2)$$
which can be written as
$$Y = 4X - 11$$
Practice Question
The table below gives experimental values of the pressure P of a given mass of
gas corresponding to various values of the volume V. According to thermodynamic
principles, a relationship of the form $PV^{\gamma} = C$, where $\gamma$ and C are constants,
should exist between the variables.
MODULE 6: CORRELATION THEORY
Definition
Correlation refers to the degree of mutual relationship between two or more
variables.
Correlation can be
Perfect (Negative or Positive)
Partial
Zero (i.e. no correlation)
Measurement of the Degree of Relationship
Product Moment Correlation Coefficient.
Spearman Rank Correlation Coefficient.
Illustration 1
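A study recorded the starting salary (in thousands), Y, and years of education, X,
for 10 workers. The data are shown in the table below. Compute (i) the product moment
correlation coefficient and (ii) the Spearman rank correlation coefficient.
Starting Salary (Y)      35  46  48  50  40  65  28  37  49  55
Years of Education (X)   12  16  16  15  13  19  10  12  17  14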
Solution (i)
Y      X      X²     Y²      XY
35     12     144    1225    420
46     16     256    2116    736
48     16     256    2304    768
50     15     225    2500    750
40     13     169    1600    520
65     19     361    4225    1235
28     10     100    784     280
37     12     144    1369    444
49     17     289    2401    833
55     14     196    3025    770
ΣY = 453   ΣX = 144   ΣX² = 2140   ΣY² = 21549   ΣXY = 6756
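$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$
$$r = \frac{10(6756) - 144(453)}{\sqrt{\left[10(2140) - 144^2\right]\left[10(21549) - 453^2\right]}}$$
$$= \frac{2328}{2612.773} = 0.89$$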
Conclusion: It shows there is a strong positive correlation between the starting
salary and years of education of the workers.
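The product moment coefficient in Solution (i) can be verified from the raw data with the same sums. The sketch below is illustrative; the function name `pearson_r` is ours.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product moment correlation coefficient computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


salary = [35, 46, 48, 50, 40, 65, 28, 37, 49, 55]     # Y, in thousands
education = [12, 16, 16, 15, 13, 19, 10, 12, 17, 14]  # X, in years

r = pearson_r(education, salary)
print(round(r, 2))
```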
Solution (ii)
Y      X      Ry     Rx     d = Ry − Rx    d²
35     12     9      8.5     0.5           0.25
46     16     6      3.5     2.5           6.25
48     16     5      3.5     1.5           2.25
50     15     3      5      -2             4
40     13     7      7       0             0
65     19     1      1       0             0
28     10     10     10      0             0
37     12     8      8.5    -0.5           0.25
49     17     4      2       2             4
55     14     2      6      -4             16
Σd² = 33
$$r = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$
$$r = 1 - \frac{6(33)}{10(10^2 - 1)}$$
$$= 1 - \frac{198}{10(99)}$$
$$= 1 - 0.20 = 0.80$$
Conclusion: It shows there is a strong positive correlation between the starting
salary and years of education of the workers.
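The ranking step in Solution (ii), including the shared ranks 3.5 and 8.5 for tied values, can be sketched as follows. The helper `average_ranks` is ours; it ranks from largest (rank 1) to smallest, as the table does, and gives tied values the mean of the ranks they occupy.

```python
def average_ranks(values):
    """Rank from largest (1) to smallest; tied values share their mean rank."""
    ordered = sorted(values, reverse=True)
    # index(v) is the first position of v; ties occupy count(v) consecutive positions.
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


salary = [35, 46, 48, 50, 40, 65, 28, 37, 49, 55]     # Y
education = [12, 16, 16, 15, 13, 19, 10, 12, 17, 14]  # X

rank_y = average_ranks(salary)
rank_x = average_ranks(education)

sum_d2 = sum((ry - rx) ** 2 for ry, rx in zip(rank_y, rank_x))
n = len(salary)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(rank_x, sum_d2, round(r_s, 2))
```

Note that the formula $1 - 6\sum d^2 / [n(n^2 - 1)]$ is exact only when there are no ties; with a few ties, as here, it is commonly used as an approximation.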
MODULE 7: ELEMENTARY TIME SERIES ANALYSIS
Definitions
A time series is an ordered sequence of values of a variable at equally spaced time intervals.
Applications
Economic Forecasting
Sales Forecasting
Budgetary Analysis
Stock Market Analysis
Yield Projections
Process and Quality Control
Inventory Studies
Utility Studies
Census Analysis etc
Components of Time Series
Trend (T)
Cyclical Variation (C)
Seasonal Variation (S)
Irregular Variation (I)
Trend
It refers to the stationary, upward or downward movement that characterises a time
series over a long period of time.
Examples of Trend
Population Changes
Technology Changes
Inflation or Deflation (Price Changes), etc.
Cyclical Variation
Observable up and down fluctuations over an extended period of time. They could be
the result of a boom or a bust in business activity.
Seasonal Variation
This is a variation that happens at a particular period of the year as a result of a
particular event. It is caused by such factors as weather, customs, etc.
Example of a Seasonal Variation
Irregular Variation
These are variations which are erratic in movement over time. They are either
unpredictable or caused by isolated events such as floods, earthquakes,
government policy, etc.
THANK YOU