Chapter 2 Sampling and Sampling Distribution

This document discusses sampling theory and methods. It defines key terms like population, sample, statistic, and parameter. There are two main methods of sampling - probability sampling and non-probability sampling. Probability sampling methods include simple random sampling, stratified sampling, systematic sampling, and cluster sampling. Simple random sampling gives each unit an equal chance of selection, while stratified sampling divides the population into homogeneous subgroups before sampling. Systematic sampling selects units at regular intervals after randomly selecting a starting point. Probability sampling methods aim to select representative samples and allow statistical inference about the population.

Uploaded by

Manchilot Tilahun


CHAPTER TWO

SAMPLING AND SAMPLING DISTRIBUTION

SAMPLING THEORY
Sampling is the process of learning about a population on the basis of a sample
drawn from it. Instead of studying every unit of the population, only part of the
population is studied, and conclusions about the entire population are drawn on
that basis. The process of sampling involves three elements: selecting the sample,
collecting the information, and making an inference about the population.

BASIC CONCEPTS OF SAMPLING THEORY

Population: In Statistics the term population is used to mean the totality of cases (items)
under consideration in a given investigation or research. In other words, the largest
collection of observations on a variable constitutes the population.

Census: The process of gathering data from every element in the population.

Sample: A part of the population of interest. Any non-empty subset of a population is
called a sample. Many different samples can be selected from a single
population; the one that best reflects or represents the behavior of the
population is considered the most appropriate.

Sampling: The method of selecting a sample from a population.

Statistic: A measurable characteristic of the sample. In short, it is a sample result.

Parameter: A measurable characteristic of the population, that is, a numerical result
obtained by measuring the entire population.

Sampling Frame: - The list of all possible units in the reference population.

Sample Size: - The number of elements/observations in a specific sample.

Sampling Error: - The difference between sample statistic and population parameter.

Sampling Unit: - Elements of the population to be sampled or the unit of selection in the
sampling process.

Sample design: Is the set of procedures for selecting the sample elements from the
population.

REASONS FOR SAMPLING

The following are the major reasons for sampling technique:

Cost/Economy
The unit cost of collecting data in a census is significantly lower than in a sample
survey. However, because a census covers far more items, its total cost is
significantly higher than that of a sample. Suppose it costs Birr 200 per unit to
take a census of 10,000,000 individuals, whereas the unit cost of sampling 5,000
individuals is Birr 1,000. The total cost of the census is 10,000,000 x 200 =
Birr 2,000,000,000, while that of the sample is 5,000 x 1,000 = Birr 5,000,000.
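The arithmetic of the cost comparison can be checked with a few lines of Python. This is only a sketch using the figures quoted in the example; the variable names are illustrative.

```python
# Figures from the cost example above (all costs in Birr).
census_units, census_unit_cost = 10_000_000, 200
sample_units, sample_unit_cost = 5_000, 1_000

census_total = census_units * census_unit_cost
sample_total = sample_units * sample_unit_cost

print(f"Census total cost: Birr {census_total:,}")
print(f"Sample total cost: Birr {sample_total:,}")
print(f"Census / sample cost ratio: {census_total // sample_total}")
```

Even though the sample is five times more expensive per unit, its total cost is a small fraction of the census cost.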

Timeliness
Because the population is much larger than a sample, the total time required for a
census is significantly greater than that for sampling (i.e., a sample can provide
the necessary information quickly).

Large Population Size

Many populations about which inferences must be made are so large that it is
practically impossible to cover every item in the population. The solution is
to take a sample from such a population.

Inaccessibility of the Entire Population

In some cases the entire population may not be accessible, for example because
some members are affected by disease, death, conflict, mental illness, or
imprisonment. In that case sampling is necessary.

Destructive Nature of Many Tests
Many tests destroy the item being tested, so information can be collected
only from part of the population. Examples: a blood test for a patient, the life
hours of a tube light, the strength of wires, etc.

Accuracy
Non-sampling error is often higher in a census than in a sample survey, because
less qualified investigators may be involved in a census and its supervision,
monitoring, and quality control mechanisms may be poor. The higher the degree of
non-sampling error, the less reliable the result.

SAMPLING METHODS
There are two principal methods of drawing a sample from a population: Probability
sampling and Non-probability sampling.

1) Probability Sampling

In probability sampling, every unit in the population has a known, nonzero
chance of being selected into the sample (an equal chance in the case of simple
random sampling). Selection is made by a random mechanism, not by human judgment.

There are four basic types of probability sampling techniques.

i. Simple Random Sampling

ii. Stratified Sampling
iii. Systematic Sampling
iv. Cluster Sampling
i. Simple Random Sample
Simple Random Sampling is a method of probability sampling in which every unit in the
population has an equal nonzero chance of being selected (or part of the sample). In other
words, each element of the population has an equal and independent chance of being
included into the sample. The probability is given by n/N.

There are two methods to select a simple random sample:
Lottery method: Each population item is numbered from 1 to N on identical slips or
cards (same size, shape, and color). The numbered cards are placed in a bowl and
mixed thoroughly, and as many cards as needed are drawn blindfolded. The subjects
whose numbers are drawn constitute the sample. Since it is difficult to mix the cards
thoroughly, there is a chance of obtaining a biased sample, so another method of
selecting sample elements is needed.

Random number method: Because of the problems with the lottery method, statisticians
use the random number method, in which the numbers are taken from a table of random
numbers or generated using a computer.

How to use random number table method

a) Assign a unique number to each population element in the sampling frame. Start
with serial number 1, or 01, or 001, etc. depending on the number of digits
required.
b) Choose a random starting position by closing your eyes (blind fold selection) and
placing your finger on a number in the table.
c) Select serial numbers across rows or down columns or diagonally from the
starting point.
d) Discard numbers that are not assigned to any population element and ignore
numbers that have already been selected.
e) Repeat the selection process until the required number of sample elements is
selected.
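When the numbers are generated by computer rather than read from a table, the steps above reduce to a few lines. The following is a minimal sketch using Python's standard `random` module; the frame of 100 numbered units, the sample size, and the seed are hypothetical values chosen only for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sampling frame, each equally likely.

    random.sample never returns the same unit twice, which plays the
    role of 'ignore numbers that have already been selected'.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: units with serial numbers 1..100.
frame = list(range(1, 101))
sample = simple_random_sample(frame, 10, seed=42)
print(sorted(sample))
```

Because every serial number in the frame belongs to a population element, there is nothing to discard, unlike with a printed random number table.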
Advantage of simple random sampling
• It ensures that the sample is unbiased.
Disadvantages of simple random sampling
• It requires a sampling frame, and this is sometimes impossible to obtain (the case
of a fish population).
• If the population is very large, it is tedious and time-consuming to number and
select the sample.
• Minority subgroups of the population may not be represented in the sample.
ii. Stratified Sampling
In stratified sampling, a population is first divided into subgroups, called strata (singular stratum),
and a sample is selected from each stratum based on simple random or systematic sampling
method. The strata are made according to various homogeneous characteristics such as sex, race,
region or institutional affiliation such as faculty. Stratified sampling is applied if the population is
heterogeneous.
Stratified random sampling is a three-step process:
• Step 1: Divide the population into homogeneous, mutually exclusive, and collectively
exhaustive groups or strata using some stratification variable (e.g. income level, sex,
education level, etc.);
• Step 2: Select an independent simple random sample from each stratum;
• Step 3: Form the final sample by combining all sample elements chosen in Step 2.

Stratified samples can be:

Proportionate: involving the selection of sample elements from each stratum, such that the ratio
of sample elements and total number of population elements (n/N) is constant/equal for all strata.
Disproportionate: the sample is disproportionate when the above mentioned ratio is unequal.

Example: Select a proportionate stratified sample of 20 households from Addis Ababa
households that belong to three income groups: low (50), middle (30), and high (20),
so that N = 50 + 30 + 20 = 100.

• Sub-divide the households into three homogeneous sub-groups or strata by
income group: low, middle, and high.
• Calculate the overall sampling fraction f = n/N = 20/100 = 0.2, where n = sample
size and N = population size.
• Apply f to each stratum: n1 = 0.2 x 50 = 10, n2 = 0.2 x 30 = 6, and n3 = 0.2 x 20 = 4.
Thus, n = n1 + n2 + n3 = 10 + 6 + 4 = 20.
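The proportionate allocation in this example can be reproduced with a short sketch. The function name `proportionate_allocation` is illustrative, and the rounding step is an assumption: with other stratum sizes the rounded allocations may not add up exactly to n and would need adjusting by hand.

```python
def proportionate_allocation(strata_sizes, n):
    """Allocate a total sample of n across strata using a common
    sampling fraction f = n / N."""
    N = sum(strata_sizes.values())
    f = n / N
    # Rounding is an assumption of this sketch; it is exact for the
    # example below but not guaranteed to sum to n in general.
    return {name: round(f * size) for name, size in strata_sizes.items()}

# Stratum sizes from the household example above.
strata = {"low": 50, "middle": 30, "high": 20}
allocation = proportionate_allocation(strata, 20)
print(allocation)
```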

Advantage of Stratified Sampling:

• The representativeness of the sample is improved.
Disadvantages of Stratified Sampling:
• If there are many variables of interest, dividing a large population into representative
subgroups requires a great deal of effort.
• If the variables are somewhat complex or ambiguous (such as beliefs, attitudes, etc.), it is
difficult to separate individuals into sub-groups according to these variables.
iii. Systematic Sampling
In systematic sampling only one random number is needed throughout the entire
sampling process. Elements of the population will be arranged in some order and the
elements to be included in the sample will be selected at a constant interval.
To use systematic sampling, a researcher needs:
a. A sampling frame of the population;
b. A skip interval (K), calculated as follows:

$$K = \frac{\text{population size } (N)}{\text{sample size } (n)} = \frac{N}{n}$$

The first element, numbered between 1 and K, is determined using simple
random sampling, and the subsequent items are selected using the skip interval. For
instance, if the $j$-th unit is selected first, then the $(j+K)$-th, $(j+2K)$-th, etc.
units are selected until the required sample size is obtained.

Example: Suppose there are 2,000 subjects in the population and a sample of 50
subjects is needed. The sampling interval (K) is 2000/50 = 40. Select the starting point,
say ‘x’, from 1 through 40 using simple random sampling, and then include every 40th
element starting from ‘x’.
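The example above can be sketched in Python as follows. The function name and the seed are illustrative, and the sketch assumes that N is an exact multiple of n so that the skip interval is a whole number.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the serial numbers chosen by systematic sampling.

    Assumes units are numbered 1..N and that N is a multiple of n.
    """
    k = N // n                      # skip interval K = N / n
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random start between 1 and K inclusive
    return [start + i * k for i in range(n)]

# 2,000 subjects, sample of 50, so K = 40.
selected = systematic_sample(2000, 50, seed=1)
print(selected[:5], "...", selected[-1])
```

Only one random number (the starting point) is drawn; every later unit is fixed by the interval.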

Advantages of Systematic Sampling:

• It is less time-consuming and easier to perform than simple random sampling (SRS),
• It is more convenient to use than SRS, and
• It provides a good approximation to SRS.

Disadvantages of Systematic Sampling:

If there is any sort of cyclic ordering of the subjects, the sample will not be
representative of the population. Example: suppose the subjects in the population are
arranged in an alternating pattern such as:
• Defective item
• Non-defective item
• Defective item
• Non-defective item, etc.
With such an ordering, any even skip interval selects only defective items or only
non-defective items, depending on the starting point.

iv. Cluster Sampling

Cluster sampling can be used if the population is homogeneous and very large in size. The
population is divided into non-overlapping, internally heterogeneous groups called clusters,
and the clusters, rather than individual elements, are the sampling units selected by simple
random sampling in the first phase (if it is two-phase cluster sampling). In other
words, the population is divided into groups (clusters), one or more clusters are chosen at
random, and the individuals within the chosen clusters are sampled.
A two-step process:
• Step 1: Divide the defined population into a number of mutually exclusive and
collectively exhaustive heterogeneous groups or clusters;
• Step 2: Select a simple random sample of clusters.
Advantages of Cluster Sampling:
• A list of all individual study units in the reference population is not required,
• It reduces cost, and
• It simplifies field work and is convenient.
Disadvantages:
• The members of a cluster are often more homogeneous than the members of the
whole population, and therefore the sample may not be representative.
• The elements in a cluster may not have the same variation in characteristics as
elements selected individually from the population.
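The two-step process of cluster sampling can be illustrated with a small sketch. The villages and household labels are hypothetical data invented for the illustration, and the sketch shows the one-stage case in which every member of a chosen cluster is included.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Choose m whole clusters by simple random sampling and return the
    chosen cluster names together with all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population: four villages (clusters) of households.
clusters = {
    "village_A": ["A1", "A2", "A3"],
    "village_B": ["B1", "B2"],
    "village_C": ["C1", "C2", "C3", "C4"],
    "village_D": ["D1", "D2"],
}
chosen, units = cluster_sample(clusters, m=2, seed=7)
print(chosen, units)
```

Note that only the list of clusters is needed to draw the sample, not a list of every household.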

2) Non-Probability Sampling
In the case of non-probability sampling, not every unit in the population has a chance of
being included in the sample. It involves at least some degree of personal subjectivity
instead of following predetermined, probabilistic rules for selection.

There are three basic types of non-probability sampling techniques.

i. Convenience sampling
ii. Judgmental sampling

iii. Quota sampling
i. Convenience Sampling
Convenience sampling means that the sample is drawn at the convenience of the researcher. It is
common in exploratory research, but it does not support conclusions about the population.

ii. Judgmental sampling

Sampling based on some judgment, gut-feelings or experience of the researcher. It is common in
commercial marketing research projects. If inference drawing is not necessary, these samples are
quite useful.

iii. Quota sampling

In this method, the decision maker requires the sample to contain a certain number of items with
a given characteristic. It is something like judgmental sampling.

Example 2.5: Suppose we know that 54% of the adults in a community are females, and the study
requires 100 respondents as a sample. In quota sampling, we might interview the first 54 females and
the first 46 males.

DETERMINANT FACTORS OF THE SAMPLE SIZE

Size of sample means the number of sampling units selected from the population for
investigation. If the size of sample is small it may not represent the population and the
inference drawn about the population may be misleading. On the other hand, if the size of
sample is very large, it may be too burdensome financially, requires a lot of time and may
have a serious problem of managing it. Hence the sample size should be neither too small
nor too large. Rather it should be optimal.

The following factors should be considered while deciding the sample size:

i. The size of the population: the larger the size of the population, the bigger should be
the sample size.
ii. The resource available: if the resources available are vast a large sample size could be
taken. However, in most cases resources constitute a big constraint on sample size.
iii. The degree of accuracy or precision desired: the greater the degree of accuracy
desired, the larger should be the sample size. However, it does not necessarily mean
that bigger samples always ensure greater accuracy.
iv. Homogeneity or heterogeneity of the population: If the population consists of
homogeneous units a small sample may serve the purpose, but if the population
consists of heterogeneous units a large sample may be inevitable.
v. Nature of study: For an intensive and continuous study a small sample may be
suitable. But for studies which are not likely to be repeated and are quite extensive in
nature, it may be necessary to take a large sample size.
vi. Method of sampling adopted: The size of sample is also influenced by the type of
sampling plan adopted. For example, a simple random sample may
necessitate a bigger sample size, whereas in a properly drawn stratified sampling
plan even a small sample may give better results.
vii. Nature of respondent: Where it is expected a large number of respondents will not co-
operate and send back the questionnaire, a large sample should be selected.

SAMPLING DISTRIBUTION
NOTE: The normal probability distribution is used to determine probabilities for
normally distributed individual measurements, given the mean and the standard
deviation. Symbolically, the variable is the measurement X, with population mean μ
and population standard deviation σ. In contrast to such distributions of individual
measurements, a sampling distribution is a probability distribution for the possible values
of a sample statistic.

Population distribution: The distribution of the measured values of the members of the
population. It has mean μ, variance σ², and standard deviation σ. The population standard
deviation describes the variation among the values of members of the population, whereas the
standard deviation of a sampling distribution measures the variability, due to sampling error,
among the values of a sample statistic such as the mean or the proportion.

Sample distribution: The distribution of the measured values in a random sample
drawn from a given population. The sample mean varies from sample to sample, and
this variability serves as the basis for the sampling distribution. A sampling
distribution is a probability distribution for the possible values of a sample statistic, such
as a sample mean.

SAMPLING DISTRIBUTION OF THE MEAN

Sampling distribution of the mean: The probability distribution of the sample means
$\bar{X}$ of all possible distinct simple random samples of a given size n that can be drawn
from a population or a process. The sampling distribution of the mean has its own
arithmetic mean, denoted by $\mu_{\bar{X}}$ (read as mu sub x bar), and its own standard
deviation, denoted by $\sigma_{\bar{X}}$ (sigma sub x bar).

NB: The sampling distribution of the mean is not the sample distribution, which is the
distribution of the measured values of X in one random sample. Rather, the sampling
distribution of the mean is the probability distribution for $\bar{X}$, the sample mean.

For any given sample size n taken from a population with mean μ and standard deviation
σ, the value of the sample mean $\bar{X}$ would vary from sample to sample if several
random samples were obtained from the population. This variability serves as the basis
for the sampling distribution.

The sampling distribution of the mean is described by two parameters: the expected value
$E(\bar{X}) = \mu_{\bar{X}}$, or mean of the sampling distribution of the mean, and the
standard deviation of the mean $\sigma_{\bar{X}}$, called the standard error of the mean.

PROPERTIES OF THE SAMPLING DISTRIBUTION OF MEANS

1. The arithmetic mean $\mu_{\bar{X}}$ of the sampling distribution of the mean is equal to the
population mean μ regardless of the form of the population distribution, i.e. $\mu_{\bar{X}} = \mu$.
2. The sampling distribution has a standard deviation (also called the standard error) equal to
the population standard deviation divided by the square root of the sample size, i.e.

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

This holds if N is very large and n < 0.05N. If N is finite and n > 0.05N,

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$

The expression $\sqrt{\frac{N-n}{N-1}}$ is called the finite population correction factor (finite
population multiplier). In the calculation of the standard error of the mean, if the population
standard deviation σ is unknown, the standard error of the mean $\sigma_{\bar{X}}$ can be estimated
by the sample standard error of the mean $S_{\bar{X}}$, which is calculated as follows:

$$S_{\bar{X}} = \frac{S}{\sqrt{n}} \quad \text{or} \quad S_{\bar{X}} = \frac{S}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$

3. A sample of size n ≥ 30 is generally considered a large sample for statistical
analysis, whereas a sample of size n < 30 is considered a small sample. The
sampling distribution of the mean is approximately normal for sufficiently large sample
sizes (n ≥ 30).
4. When the population standard deviation σ is not known, the sample standard deviation s,
which closely approximates σ, is used to compute the standard error, i.e.
$\sigma_{\bar{X}} \approx \frac{s}{\sqrt{n}}$.
Example 1. A population consists of the following ages: 10, 20, 30, 40, and 50.
A random sample of three is to be selected from this population and its mean
computed. Develop the sampling distribution of the mean.

Solution: The number of simple random samples of size n that can be drawn without
replacement from a population of size N is $\binom{N}{n} = \frac{N!}{n!(N-n)!}$. With N = 5
and n = 3, $\binom{5}{3} = 10$ samples can be drawn from the population:

Sampled items     Sample mean ($\bar{X}$)
10, 20, 30        20.00
10, 20, 40        23.33
10, 20, 50        26.67
10, 30, 40        26.67
10, 30, 50        30.00
10, 40, 50        33.33
20, 30, 40        30.00
20, 30, 50        33.33
20, 40, 50        36.67
30, 40, 50        40.00
Total             300.00

A systematic organization of the above figures gives the following:

Sample mean ($\bar{X}$)    Frequency    Probability (relative frequency) of $\bar{X}$
20.00                      1            0.1
23.33                      1            0.1
26.67                      2            0.2
30.00                      2            0.2
33.33                      2            0.2
36.67                      1            0.1
40.00                      1            0.1
TOTAL                      10           1.0
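Columns 1 and 2 show the frequency distribution of the sample means.
Columns 1 and 3 show the sampling distribution of the mean.

The population mean and the mean of the 10 sample means are

$$\mu = \frac{\sum X}{N} = \frac{150}{5} = 30, \qquad \mu_{\bar{X}} = \frac{\sum \bar{X}}{10} = \frac{300}{10} = 30$$

Regardless of the sample size, $\mu_{\bar{X}} = \mu$.

x (Observation)    x − μ    (x − μ)²
10                 -20      400
20                 -10      100
30                 0        0
40                 10       100
50                 20       400
Σ(x − μ)²                   1,000

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = \sqrt{\frac{1000}{5}} = 14.142$$

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}} = \frac{14.142}{\sqrt{3}} \cdot \sqrt{\frac{5-3}{5-1}} = 5.774$$

The same value is obtained directly from the 10 sample means:

$$\sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X}_i - \mu_{\bar{X}})^2}{10}} = \sqrt{\frac{333.4}{10}} = 5.774$$

Since averaging reduces variability, $\sigma_{\bar{X}} < \sigma$ except in the cases where
σ = 0 or n = 1.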
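Example 1 is small enough to verify by brute force. The sketch below, which uses only Python's standard library, enumerates every possible sample of size 3, computes the mean of each, and compares the standard deviation of those means with the value given by the finite population correction formula. The variable names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pstdev

population = [10, 20, 30, 40, 50]
N, n = len(population), 3

# All distinct samples of size n drawn without replacement.
sample_means = [mean(s) for s in combinations(population, n)]

mu = mean(population)            # population mean
sigma = pstdev(population)       # population standard deviation
mu_xbar = mean(sample_means)     # mean of the sampling distribution
se_direct = pstdev(sample_means) # standard error from the sample means
se_formula = sigma / sqrt(n) * sqrt((N - n) / (N - 1))

print(len(sample_means), mu, mu_xbar)
print(round(sigma, 3), round(se_direct, 3), round(se_formula, 3))
```

Both routes to the standard error agree, which is what property 2 of the sampling distribution of means asserts for a finite population.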

Central Limit Theorem and the Sampling Distribution of the Mean
The Central Limit Theorem (CLT) states that:

1. If the population is normally distributed, the distribution of sample means is
normal regardless of the sample size.
2. If the population from which samples are taken is not normal, the distribution of
sample means will be approximately normal if the sample size (n) is sufficiently
large (n ≥ 30). The larger the sample size is used, the closer the sampling
distribution is to the normal curve.

The relationship between the shape of the population distribution and the shape of the
sampling distribution of the mean is called the Central Limit Theorem.
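The second statement of the theorem can be explored with a small simulation. This is a sketch under stated assumptions, not part of the original notes: the population is taken to be exponential with mean 1 and standard deviation 1 (a clearly non-normal, skewed distribution), and the seed and number of repetitions are arbitrary.

```python
import random
from math import sqrt
from statistics import fmean, stdev

rng = random.Random(0)      # fixed seed, chosen arbitrarily
n = 30                      # sample size
num_samples = 5_000         # number of repeated samples

# Each sample mean is the average of n draws from a skewed
# exponential population with mean 1 and standard deviation 1.
sample_means = [
    fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

observed_mean = fmean(sample_means)
observed_se = stdev(sample_means)
theoretical_se = 1.0 / sqrt(n)   # sigma / sqrt(n) with sigma = 1

print(round(observed_mean, 3), round(observed_se, 3), round(theoretical_se, 3))
```

The mean of the simulated sample means should be close to the population mean, and their standard deviation close to σ/√n, even though the population itself is far from normal. Plotting a histogram of `sample_means` would show the roughly bell-shaped form.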

The significance of the Central Limit Theorem is that it permits us to use sample statistics
to make inferences about population parameters without knowing anything about the
shape of the frequency distribution of that population other than what we can get from the
sample. It also permits us to use the normal distribution curve for analyzing distributions
whose shape is unknown, and so creates the potential for applying the normal distribution to
many problems when the sample is sufficiently large. Given the properties stated earlier, a
value of the sample mean $\bar{X}$ is first converted into a value Z on the standard normal
distribution, which measures how far that value deviates from the mean of the sample
means ($\mu_{\bar{X}}$), by using the formula

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \quad \text{because } \mu_{\bar{X}} = \mu$$

If the population is finite and samples of fixed size n are drawn without replacement, the
standard error of the sampling distribution of the mean is adjusted by the finite population
correction factor, to account for the change in the remaining population with each draw:

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$

Example 2: The mean life of a certain tool is 41.5 hours with a standard deviation of 2.5
hours. What is the probability that a simple random sample of size 50 drawn from this
population will have a mean between 40.5 hours and 42 hours?

μ = 41.5, σ = 2.5, n = 50

$P(40.5 \le \bar{X} \le 42.0) = ?$

$$\mu_{\bar{X}} = \mu = 41.5, \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{50}} = \frac{2.5}{7.0711} = 0.3536$$

The population distribution is unknown, but the sample size n = 50 is large enough to apply the
central limit theorem. Hence the normal distribution can be used to find the required
probability.

$$P(40.5 \le \bar{X} \le 42) = P\left(\frac{\bar{X}_1 - \mu}{\sigma_{\bar{X}}} \le Z \le \frac{\bar{X}_2 - \mu}{\sigma_{\bar{X}}}\right) = P\left(\frac{40.5 - 41.5}{0.3536} \le Z \le \frac{42 - 41.5}{0.3536}\right)$$

= P(-2.8281 ≤ Z ≤ 1.4140)
= P(-2.8281 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 1.4140)
= 0.4977 + 0.4207 = 0.9184
Thus 0.9184 is the probability that the sample mean life of the tool lies between 40.5 and 42 hours.
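As a cross-check on the table lookup, the same probability can be computed with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). This is a sketch; small differences from 0.9184 are expected because the table values are rounded to two decimal places of Z.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 41.5, 2.5, 50
se = sigma / sqrt(n)                 # standard error of the mean

# Sampling distribution of the mean, approximated as normal by the CLT.
xbar_dist = NormalDist(mu, se)
prob = xbar_dist.cdf(42.0) - xbar_dist.cdf(40.5)

print(round(se, 4), round(prob, 4))
```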
[Figure: normal curve of the sample mean centered at μ = 41.5, with area 0.4977 between 40.5 and 41.5 and area 0.4207 between 41.5 and 42.]

Example 2. A continuous manufacturing process produces items whose weights
are normally distributed with a mean weight of 800 gms and a standard
deviation of 300 gms. A random sample of 16 items is to be selected from
the process.
A. What is the probability that the arithmetic mean of the sample exceeds 900gms? Interpret
the result.
B. Find the values of the sample arithmetic mean within which the middle 95% of all sample
means will fall.

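Solution:

A. $P(\bar{X} \ge 900) = ?$
$\mu_{\bar{X}} = \mu = 800$ gms, σ = 300 gms, n = 16

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{300}{\sqrt{16}} = \frac{300}{4} = 75$$

$$P(\bar{X} \ge 900) = P\left(Z \ge \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}}\right) = P\left(Z \ge \frac{900 - 800}{75}\right)$$

= P(Z ≥ 1.33)
= 0.5000 - 0.4082
= 0.0918
Interpretation: only about 9% of all possible samples of 16 items will have a mean weight above 900 gms.

[Figure: normal curve centered at $\mu_{\bar{X}}$ = 800 with upper-tail area 0.0918 beyond $\bar{X}$ = 900.]

B. Since Z = ±1.96 bounds the middle 95% of the area under the normal curve, solving the formula
for Z for the values of $\bar{X}$ gives:
$\bar{X}_1 = \mu_{\bar{X}} - Z\sigma_{\bar{X}}$ = 800 - 1.96(75) = 653 gms
$\bar{X}_2 = \mu_{\bar{X}} + Z\sigma_{\bar{X}}$ = 800 + 1.96(75) = 947 gms

[Figure: normal curve centered at 800 with middle area 0.95 between 653 and 947.]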
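Both parts of this example can be checked with `statistics.NormalDist`. The sketch below uses `cdf` for the tail probability in part A and `inv_cdf` for the limits in part B; the results differ slightly from the table-based figures because Z is not rounded to 1.33 here.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 300, 16
se = sigma / sqrt(n)                      # 300 / 4 = 75
xbar_dist = NormalDist(mu, se)

# Part A: probability that the sample mean exceeds 900 gms.
p_above_900 = 1 - xbar_dist.cdf(900)

# Part B: limits enclosing the middle 95% of all sample means.
lower = xbar_dist.inv_cdf(0.025)
upper = xbar_dist.inv_cdf(0.975)

print(round(p_above_900, 4), round(lower), round(upper))
```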

SAMPLING DISTRIBUTION OF SAMPLE PROPORTIONS

The sample proportion $\bar{p}$ of items having the characteristic of interest (success or
failure, accept or reject, head or tail) is the statistic best suited for statistical inference
about the population proportion P. The sample proportion is defined as:

$$\bar{p} = \frac{\text{number of successes, } X}{\text{sample size, } n}$$

By the same logic as for the sampling distribution of the mean, the sampling distribution of the
sample proportion has mean $\mu_{\bar{p}}$ and standard deviation (also called standard error)
$\sigma_{\bar{p}}$ given by:

$$\mu_{\bar{p}} = P \quad \text{and} \quad \sigma_{\bar{p}} = \sqrt{\frac{Pq}{n}} = \sqrt{\frac{P(1-P)}{n}}$$

If the sample size is large (n ≥ 30) and satisfies the following two conditions,

A. nP ≥ 5
B. nq ≥ 5
then the sampling distribution of the proportion is very closely approximated by a normal
distribution. It may be noted that the sampling distribution of the proportion actually follows a
binomial distribution, because the number of successes in the sample is binomially distributed;
the normal distribution is an approximation to it.
For a finite population in which sampling is done without replacement we have:

$$\mu_{\bar{p}} = P \quad \text{and} \quad \sigma_{\bar{p}} = \sqrt{\frac{Pq}{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$

Under the same guidelines as in the previous sections, for a large sample size (n ≥ 30) the
sampling distribution of the proportion is closely approximated by a normal distribution
with the mean and standard deviation stated above. Hence the sample proportion $\bar{p}$ is
standardized using the standard normal variable

$$Z = \frac{\bar{p} - \mu_{\bar{p}}}{\sigma_{\bar{p}}} = \frac{\bar{p} - P}{\sqrt{\frac{Pq}{n}}}$$

Example 3.
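A few years back, a policy was introduced to give loans to
unemployed engineers to start their own business. Out of 1,000,000
engineers, 600,000 accepted the policy and got the loan. A sample of 100
unemployed engineers is taken at the time of allotment of the loans. What
is the probability that the sample proportion would have exceeded 50%
acceptance?
Solution:

$\mu_{\bar{p}} = P = 0.60$, N = 1,000,000, n = 100, $P(\bar{p} \ge 0.5) = ?$

$$\sigma_{\bar{p}} = \sqrt{\frac{Pq}{n}} \cdot \sqrt{\frac{N-n}{N-1}} = \sqrt{\frac{(0.60)(0.40)}{100}} \cdot \sqrt{\frac{1,000,000 - 100}{1,000,000 - 1}} \approx 0.0490$$

$$P(\bar{p} \ge 0.5) = P\left(Z \ge \frac{\bar{p} - \mu_{\bar{p}}}{\sigma_{\bar{p}}}\right) = P\left(Z \ge \frac{0.50 - 0.60}{0.0490}\right) = P(Z \ge -2.04) = 0.4793 + 0.5000 = 0.9793$$

[Figure: normal curve centered at P = 0.60 with area 0.4793 between 0.50 and 0.60 and area 0.5000 above 0.60.]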
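The calculation in Example 3, including the finite population correction, can be reproduced as follows. This is a sketch using `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

P, N, n = 0.60, 1_000_000, 100
q = 1 - P

# Standard error of the proportion with the finite population correction.
fpc = sqrt((N - n) / (N - 1))
se = sqrt(P * q / n) * fpc

# Probability that the sample proportion is at least 0.50.
prob = 1 - NormalDist(P, se).cdf(0.50)

print(round(fpc, 5), round(se, 4), round(prob, 4))
```

Because the sample is a tiny fraction of the population, the correction factor is very close to 1 and has almost no effect on the standard error.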

Example 4. A population proportion is 0.40. A simple random sample of size
200 will be taken and the sample proportion will be used to estimate the
population proportion. What is the probability that the sample proportion
will be within ±0.03 of the population proportion?

Given:
$\mu_{\bar{p}} = P = 0.40$, n = 200

$$\sigma_{\bar{p}} = \sqrt{\frac{(0.4)(0.6)}{200}} = 0.0346$$

$$P(-0.03 \le \bar{p} - P \le 0.03) = 2P\left(0 \le Z \le \frac{0.03}{0.0346}\right)$$

= 2P(0 ≤ Z ≤ 0.87)
= 2 x 0.3078
= 0.6156

[Figure: normal curve centered at P = 0.40 with area 0.3078 on each side of the mean.]

Example 5. A manufacturer of watches has determined from past experience that
3% of the watches he produces are defective. If a random sample of 300
watches is examined, what is the probability that the proportion of defectives
is between 0.02 and 0.035?

$\mu_{\bar{p}} = P = 0.03$, $\bar{p}_1 = 0.02$, $\bar{p}_2 = 0.035$, n = 300

$$\sigma_{\bar{p}} = \sqrt{\frac{(0.03)(0.97)}{300}} = 0.0098$$

$$P(0.02 \le \bar{p} \le 0.035) = P\left(\frac{\bar{p}_1 - P}{\sigma_{\bar{p}}} \le Z \le \frac{\bar{p}_2 - P}{\sigma_{\bar{p}}}\right) = P\left(\frac{0.02 - 0.03}{0.0098} \le Z \le \frac{0.035 - 0.03}{0.0098}\right)$$

= P(-1.02 ≤ Z ≤ 0.51)
= P(-1.02 ≤ Z ≤ 0) + P(0 ≤ Z ≤ 0.51)
= 0.3461 + 0.1950
= 0.5411
Hence the probability that the proportion of defectives will lie between 0.02 and
0.035 is 0.5411.

[Figure: normal curve centered at P = 0.03 with area 0.3461 between 0.02 and 0.03 and area 0.1950 between 0.03 and 0.035.]
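Examples 4 and 5 follow the same pattern, so one helper covers both. The function name `proportion_between` is hypothetical, and the sketch omits the finite population correction because no population size is given in either example. The computed values differ from the table-based answers in the third decimal place, since Z is not rounded here.

```python
from math import sqrt
from statistics import NormalDist

def proportion_between(P, n, low, high):
    """Normal-approximation probability that the sample proportion
    falls between low and high, for population proportion P."""
    se = sqrt(P * (1 - P) / n)
    dist = NormalDist(P, se)
    return dist.cdf(high) - dist.cdf(low)

# Example 4: within +/- 0.03 of P = 0.40 with n = 200.
p_ex4 = proportion_between(0.40, 200, 0.37, 0.43)

# Example 5: between 0.02 and 0.035 when P = 0.03 and n = 300.
p_ex5 = proportion_between(0.03, 300, 0.02, 0.035)

print(round(p_ex4, 4), round(p_ex5, 4))
```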

Sampling Distribution of the Difference between Two Means

The concept of the sampling distribution of the sample mean introduced earlier can also be
used to compare a population of size $N_1$ having mean $\mu_1$ and standard deviation $\sigma_1$
with another similar population of size $N_2$ having mean $\mu_2$ and standard
deviation $\sigma_2$.

Let $\bar{X}_1$ and $\bar{X}_2$ be the sample means from the two populations,
respectively. Then the difference between the population means $\mu_1$ and $\mu_2$ can be
studied by generalizing the formula of the standard normal variable as follows:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_{\bar{X}_1} - \mu_{\bar{X}_2})}{\sigma_{\bar{X}_1 - \bar{X}_2}} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}$$

where $\mu_{\bar{X}_1} - \mu_{\bar{X}_2} = \mu_1 - \mu_2$ is the mean of the sampling distribution of
the difference of two means, and its standard error is

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma_{\bar{X}_1}^2 + \sigma_{\bar{X}_2}^2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Here $n_1$ and $n_2$ are the sizes of the independent random samples drawn from the first and
second populations, respectively.

Example: Car stereos of manufacturer A have a mean lifetime of 1,400 hours with a standard
deviation of 200 hours, while those of manufacturer B have a mean lifetime of 1,200 hours with a
standard deviation of 100 hours. If a random sample of 125 stereos from each manufacturer is tested,
what is the probability that manufacturer A's stereos will have a mean lifetime which is at least

A. 160 hours more than manufacturer B's stereos?
B. 250 hours more than manufacturer B's stereos?
Solution:

Manufacturer A: $\mu_1$ = 1,400 hours, $\sigma_1$ = 200 hours, $n_1$ = 125
Manufacturer B: $\mu_2$ = 1,200 hours, $\sigma_2$ = 100 hours, $n_2$ = 125
a)
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(200)^2}{125} + \frac{(100)^2}{125}} = \sqrt{320 + 80} = \sqrt{400} = 20$$

$$P(\bar{X}_1 - \bar{X}_2 \ge 160) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right) = P\left(Z \ge \frac{160 - 200}{20}\right)$$

= P(Z ≥ -2)
= 0.5000 + 0.4772
= 0.9772 (area under normal curve)

[Figure: normal curve centered at $\mu_{\bar{X}_1 - \bar{X}_2}$ = 200 with area 0.9772 to the right of $\bar{X}_1 - \bar{X}_2$ = 160.]

Hence, the probability is very high that the mean lifetime of A's stereos is at least 160 hours more
than that of B's.

b) Proceeding in the same manner as in part a):

$$P(\bar{X}_1 - \bar{X}_2 \ge 250) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right) = P\left(Z \ge \frac{250 - 200}{20}\right)$$

= P(Z ≥ 2.5)
= 0.5000 - 0.4938
= 0.0062 (area under normal curve)

[Figure: normal curve centered at 200 with upper-tail area 0.0062 beyond 250.]
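The stereo example can be verified numerically. The sketch below builds the sampling distribution of the difference of the two sample means as a normal distribution and evaluates both tail probabilities; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 1400, 200, 125   # manufacturer A
mu2, sigma2, n2 = 1200, 100, 125   # manufacturer B

# Standard error of the difference between two independent sample means.
se_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
diff_dist = NormalDist(mu1 - mu2, se_diff)

p_at_least_160 = 1 - diff_dist.cdf(160)
p_at_least_250 = 1 - diff_dist.cdf(250)

print(se_diff, round(p_at_least_160, 4), round(p_at_least_250, 4))
```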

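Example 6. The strength of a wire produced by company A has a mean of 4,500 kg
and a $\sigma_1$ of 200 kg. Company B has a mean of 4,000 kg and a $\sigma_2$ of 300 kg. If
50 wires of company A and 100 wires of company B are selected at random
and tested for strength, what is the probability that the sample mean strength
of A will be at least 600 kg more than that of B?

Given:
$\mu_1$ = 4,500, $\mu_2$ = 4,000
$\sigma_1$ = 200, $\sigma_2$ = 300
$n_1$ = 50, $n_2$ = 100

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{(200)^2}{50} + \frac{(300)^2}{100}} = \sqrt{800 + 900} = 41.23$$

$$P(\bar{X}_1 - \bar{X}_2 \ge 600) = P\left(Z \ge \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{X}_1 - \bar{X}_2}}\right) = P\left(Z \ge \frac{600 - 500}{41.23}\right)$$

= P(Z ≥ 2.43)
= 0.5000 - 0.4925 = 0.0075 (area under normal curve)

[Figure: normal curve centered at 500 with upper-tail area 0.0075 beyond 600.]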
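Example 6 differs from the stereo example in that the two sample sizes are unequal, which the same formula handles directly. A brief sketch, again using `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist

mu1, sigma1, n1 = 4500, 200, 50    # company A
mu2, sigma2, n2 = 4000, 300, 100   # company B

# Each variance is divided by its own sample size before adding.
se_wire = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
p_wire = 1 - NormalDist(mu1 - mu2, se_wire).cdf(600)

print(round(se_wire, 2), round(p_wire, 4))
```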
SAMPLING DISTRIBUTION OF THE DIFFERENCE OF TWO PROPORTIONS

Suppose two populations of size $N_1$ and $N_2$ are given. For each sample of size $n_1$ from the first
population, compute the sample proportion $\bar{p}_1$ and its standard deviation $\sigma_{\bar{p}_1}$.
Similarly, for each sample of size $n_2$ from the second population, compute the sample proportion
$\bar{p}_2$ and its standard deviation $\sigma_{\bar{p}_2}$.

For all combinations of these samples from these populations, we can obtain a sampling
distribution of the difference $\bar{p}_1 - \bar{p}_2$ of sample proportions. Such a distribution is
called the sampling distribution of the difference of two proportions. The mean and standard
deviation of this distribution are given by:

$$\mu_{\bar{p}_1 - \bar{p}_2} = \mu_{\bar{p}_1} - \mu_{\bar{p}_2} = P_1 - P_2$$

$$\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\sigma_{\bar{p}_1}^2 + \sigma_{\bar{p}_2}^2} = \sqrt{\frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}}$$

If the sample sizes $n_1$ and $n_2$ are large, i.e. $n_1 \ge 30$ and $n_2 \ge 30$, then the sampling
distribution of the difference of proportions is closely approximated by a normal distribution.

Example 7. 10% of the machines produced by company a are defective and 5% of these
produced by company B are defective. A random sample of250 machines is taken
from company A and a random sample of 300 machines is taken from company B.
what is the probability that the difference in sample proportion is less than or equal
to0.02?

μ Ṕ −μ Ṕ = P1−P2= 0.10−0.05=0.05
1 2

n1=250 n2 =300
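The standard error of the difference in sample proportions is given by

$$\delta_{\bar{P}_1 - \bar{P}_2} = \sqrt{\delta_{\bar{P}_1}^2 + \delta_{\bar{P}_2}^2} = \sqrt{\frac{P_1 q_1}{n_1} + \frac{P_2 q_2}{n_2}} = \sqrt{\frac{(0.10)(0.90)}{250} + \frac{(0.05)(0.95)}{300}} = \sqrt{0.00052} = 0.0228$$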

The desired probability of the difference in sample proportions is given by

$$P(\bar{P}_1 - \bar{P}_2 \le 0.02) = P\left(Z \le \frac{(\bar{P}_1 - \bar{P}_2) - (P_1 - P_2)}{\delta_{\bar{P}_1 - \bar{P}_2}}\right) = P\left(Z \le \frac{0.02 - 0.05}{0.0228}\right) = P(Z \le -1.32)$$

$$P(Z \le -1.32) = 0.5000 - 0.4066 = 0.0934 \quad \text{(area under normal curve)}$$

Hence the desired probability for the difference in sample proportions is 0.0934.
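
As a numerical cross-check of Example 7, the sketch below recomputes the standard error, the z value and the left-tail probability $P(\bar{P}_1 - \bar{P}_2 \le 0.02)$. It assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Values from Example 7: defect rates and sample sizes for companies A and B.
p1, n1 = 0.10, 250
p2, n2 = 0.05, 300
threshold = 0.02

# Standard error of the difference in sample proportions, with q = 1 - P.
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Standardise the threshold around the mean difference P1 - P2.
z = (threshold - (p1 - p2)) / se

# Left-tail area: probability that the difference is at most the threshold.
prob = NormalDist().cdf(z)

print(f"standard error = {se:.4f}, z = {z:.2f}, probability = {prob:.4f}")
```

The result is close to, but not exactly, the table-based 0.0934, since the hand calculation rounds the standard error to 0.0228 and z to -1.32 before reading the table.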

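[Figure: normal curve of $\bar{P}_1 - \bar{P}_2$ with mean $\mu_{\bar{P}_1 - \bar{P}_2} = 0.05$; the area to the left of $\bar{P}_1 - \bar{P}_2 = 0.02$ is 0.0934.]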
