Lecture 7.1 - Estimation of Parameters

The document discusses point estimation of population parameters from sample data. Specifically, it covers: 1) Estimating the population mean using the sample mean, with the standard error of the sample mean providing a measure of accuracy. 2) For a large sample, the 100(1-α)% error margin of the sample mean estimating the population mean is approximately zα/2σ/√n, where zα/2 is the normal critical value and σ is the population standard deviation. 3) Determining the required sample size needed to estimate the population mean within a desired level of precision and probability, using the equation n = (zα/2σ/d)2, where d is the desired

Uploaded by

Junior Lafena

7.0 Estimation of parameters
7.1 Introduction
The problem of statistical inference arises when we wish to make generalizations about a
population on the basis of a sample selected from it. We want to ensure, as accurately as possible,
that the conclusions drawn from the sample measurements are valid for the entire population.

Statistical inference deals with drawing conclusions about population parameters from
an analysis of the sample data.

The nature of the inference to be considered depends on the intent of the investigation. The two
most important types of inference are (1) estimation of parameter(s) and (2) testing of statistical
hypotheses. The true value of a parameter is an unknown constant that can be correctly ascertained
only by an exhaustive study of the population. We will study statistical hypotheses in more detail
in later lectures.

Example 1
To study the growth of pine trees at an early stage, a nursery worker records 40 measurements of
the heights of one-year-old red pine seedlings. This set of measurements is given in Table 1.

Table 1: Heights of one-year-old red pine seedlings, measured in cm

2.6 1.9 1.8 1.6 1.4 2.2 1.2 1.6
1.6 1.5 1.4 1.6 2.3 1.5 1.1 1.6
2.0 1.5 1.7 1.5 1.6 2.1 2.8 1.0
1.2 1.2 1.8 1.7 0.8 1.5 2.0 2.2
1.5 1.6 2.2 2.1 3.1 1.7 1.7 1.2

However, the target of our investigation is not just the particular set of measurements recorded,
but the vast (infinite) population of the heights of all possible one-year-old pine seedlings. The
population distribution of the heights is unknown to us, and so are the population mean $\mu$ and the
population standard deviation $\sigma$. Taking the view that the 40 observations represent a random
sample from the population distribution of heights, one goal of this study may be to 'learn about $\mu$'.
More specifically,

(a) Estimate a single value for the unknown $\mu$ (point estimation)
(b) Determine an interval of plausible values for $\mu$ (interval estimation)
(c) Decide whether or not the mean height $\mu$ is 1.9 cm, which was previously found to be
the mean height of a different stock of pine seedlings (testing a hypothesis)

Example 2:
A government agency wishes to assess the prevailing rate of unemployment in a particular
country. It is correctly felt that this assessment could be made quickly and effectively by
sampling a small number of persons and counting those currently unemployed. Suppose that 500
randomly selected persons are interviewed and 41 are found to be unemployed.

A descriptive summary of this finding is provided by

Sample proportion of unemployed: $\hat{p} = \frac{41}{500} = 0.082$

Here the target of our investigation is the proportion of unemployed, $p$, in the entire population.
The true value of the parameter $p$ is unknown to us. The sample proportion $\hat{p} = 0.082$ sheds
some light on $p$, but it is subject to some error since it is drawn from only part of the population.
We would like to evaluate its margin of error and provide an interval of plausible (reasonable)
values of $p$. Further, we may also wish to test the hypothesis that the unemployment rate in the
country is not higher than the rate quoted in the federal report.

7.2. Point estimation of population mean.


The objective of point estimation is to calculate, from the sample data, a single number that is
likely to be close to the unknown value of the parameter.

If we have a sample of size n, then the estimator of the population mean is the sample mean

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Without an assessment of accuracy, a single number quoted as an estimate may not serve a very
useful purpose. The standard deviation of the estimator, alternatively called its standard error
(S.E.), provides information about its variability. In order to study the sample mean $\bar{X}$ as an
estimator of the population mean $\mu$, let us review the following results:

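i) $E(\bar{X}) = \mu$
ii) $\mathrm{sd}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$, so $\mathrm{S.E.}(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$
iii) With large n, $\bar{X}$ is nearly normally distributed with mean $\mu$ and standard deviation
$\dfrac{\sigma}{\sqrt{n}}$.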

The first two results show that the distribution of $\bar{X}$ is centered around $\mu$ and that its
standard error is $\sigma/\sqrt{n}$, where $\sigma$ is the population standard deviation and n is the
sample size.
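The first two results can be checked informally by simulation. The sketch below is only an illustration, not part of the lecture: it assumes a hypothetical Uniform(0, 1) population (so $\mu = 0.5$ and $\sigma = \sqrt{1/12}$), and the sample size, number of replications and seed are arbitrary choices.

```python
import random
import statistics

# Assumed population for illustration only: Uniform(0, 1),
# for which mu = 0.5 and sigma = sqrt(1/12).
MU = 0.5
SIGMA = (1 / 12) ** 0.5


def simulate_sample_means(n, replications, seed=0):
    """Draw `replications` random samples of size n and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.random() for _ in range(n))
        for _ in range(replications)
    ]


n = 36
means = simulate_sample_means(n, replications=5000)

mean_of_means = statistics.fmean(means)   # should be close to mu
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
theoretical_se = SIGMA / n ** 0.5

print(f"average of sample means: {mean_of_means:.4f}  (mu = {MU})")
print(f"sd of sample means:      {sd_of_means:.4f}  (sigma/sqrt(n) = {theoretical_se:.4f})")
```

With these settings the simulated centre and spread of the sample means should sit close to $\mu$ and $\sigma/\sqrt{n}$; the agreement is approximate because it rests on a finite number of replications.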

To understand how closely $\bar{X}$ is expected to estimate $\mu$, we examine the third result, shown in
Figure 1. A normal random variable falls within two standard deviations of its mean with
probability 0.954. Thus, prior to sampling, the probability is 0.954 that the estimator $\bar{X}$ will be
within a distance $2\sigma/\sqrt{n}$ of the true parameter value $\mu$. The probability statement can be
rephrased as: 'when we are estimating $\mu$ by $\bar{X}$, the 95.4% error margin is $2\sigma/\sqrt{n}$'.

Fig 1: Approximate distribution of $\bar{X}$
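The figure 0.954 quoted above is the standard normal probability of falling within two standard deviations of the mean. A minimal check, using Python's standard-library `statistics.NormalDist` (the helper name `prob_within` is an illustrative choice):

```python
from statistics import NormalDist


def prob_within(k):
    """Return P(|Z| <= k) for a standard normal variable Z."""
    z = NormalDist()  # mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)


print(round(prob_within(2), 3))  # probability of being within 2 standard errors
```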

The following notation will facilitate our writing of an expression for the $100(1-\alpha)\%$ error
margin, where $1-\alpha$ denotes the desired high probability, such as 0.99, 0.95 or 0.90.

$z_{\alpha/2}$ = the upper $\alpha/2$ point of the standard normal distribution. That is, the area to
the right of $z_{\alpha/2}$ is $\alpha/2$, and the area between $-z_{\alpha/2}$ and $z_{\alpha/2}$ is $1-\alpha$; see Figure 2.

Fig 2: The notation $z_{\alpha/2}$
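In place of a normal table, the upper $\alpha/2$ point can be computed from the inverse of the standard normal distribution function. The sketch below uses the standard-library `statistics.NormalDist`; the function name `z_upper` is an illustrative choice.

```python
from statistics import NormalDist


def z_upper(alpha):
    """Upper alpha/2 point of N(0, 1): the area to its right is alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Critical values for the three probabilities mentioned in the text.
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"1 - alpha = {confidence:.2f}:  z = {z_upper(alpha):.3f}")
```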

To illustrate the notation, suppose we want to determine the 90% error margin. We set
$1-\alpha = 0.90$, so $\alpha/2 = 0.05$ and we have $z_{0.05} = 1.645$. Therefore, we can say that when
estimating $\mu$ by $\bar{X}$, the 90% error margin is $1.645\,\sigma/\sqrt{n}$. A minor difficulty in computing
the standard error of $\bar{X}$ remains, because the expression involves the unknown population standard
deviation $\sigma$. We can estimate $\sigma$ by the sample standard deviation

$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$

When n is large, the effect of estimating the standard error $\sigma/\sqrt{n}$ by $s/\sqrt{n}$ can be neglected.
The summary is given in the box.

Point Estimation of the Mean

Parameter: Population mean, $\mu$

Data: $X_1, X_2, \ldots, X_n$ (a random sample of size n)

Estimator: $\bar{X}$ (sample mean)

$\mathrm{S.E.}(\bar{X}) = \sigma/\sqrt{n}$, estimated $\mathrm{S.E.}(\bar{X}) = s/\sqrt{n}$

For large n, the $100(1-\alpha)\%$ error margin is $z_{\alpha/2}\,\sigma/\sqrt{n}$ (if $\sigma$ is not known, use $s$ in place of $\sigma$).

Note: The estimate of the population mean $\mu$ tends to miss by an amount of the order of the S.E. ($s/\sqrt{n}$).

Example 3
From the data of Example 1, consisting of 40 measurements of the heights of one-year-old red
pine seedlings, give a point estimate of the population mean height and state a 95% error margin.

Solution:
The sample mean and the standard deviation computed from the 40 measurements in Table 1 are

$$\bar{x} = \frac{\sum_{i=1}^{40} x_i}{n} = 1.715, \qquad s = \sqrt{\frac{\sum_{i=1}^{40} (x_i - \bar{x})^2}{39}} = \sqrt{0.2254} = 0.475$$

To calculate the 95% error margin, we set $1-\alpha = 0.95$ so that $\alpha/2 = 0.025$ and $z_{\alpha/2} = 1.96$.

Therefore, the 95% error margin is

$$\frac{1.96\,s}{\sqrt{n}} = \frac{1.96 \times 0.475}{\sqrt{40}} = 0.15 \text{ cm}$$
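The calculation in Example 3 can be reproduced from the raw data in Table 1. This is a sketch using only Python's standard library; the function name `error_margin` is an illustrative choice, and `statistics.stdev` uses the $n-1$ divisor, matching the definition of $s$ used in this lecture.

```python
from statistics import NormalDist, fmean, stdev

# Heights (cm) of the 40 one-year-old red pine seedlings from Table 1.
heights = [
    2.6, 1.9, 1.8, 1.6, 1.4, 2.2, 1.2, 1.6,
    1.6, 1.5, 1.4, 1.6, 2.3, 1.5, 1.1, 1.6,
    2.0, 1.5, 1.7, 1.5, 1.6, 2.1, 2.8, 1.0,
    1.2, 1.2, 1.8, 1.7, 0.8, 1.5, 2.0, 2.2,
    1.5, 1.6, 2.2, 2.1, 3.1, 1.7, 1.7, 1.2,
]


def error_margin(data, confidence):
    """Large-sample 100(1 - alpha)% error margin z * s / sqrt(n)."""
    n = len(data)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * stdev(data) / n ** 0.5


x_bar = fmean(heights)                 # point estimate of mu
s = stdev(heights)                     # sample standard deviation
margin = error_margin(heights, 0.95)   # 95% error margin

print(f"x_bar = {x_bar:.3f} cm, s = {s:.3f} cm, 95% error margin = {margin:.2f} cm")
```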

Caution: (important)

a) Standard error should not be interpreted as the 'typical' error in a problem of estimation,
as the word 'standard' may suggest. For instance, when $\mathrm{S.E.}(\bar{X}) = 0.3$, we should not think
that the error $(\bar{X} - \mu)$ is likely to be 0.3, but rather that, prior to observing the data, the
probability is approximately 0.954 that the error will be within $\pm 2(\mathrm{S.E.}) = \pm 0.6$.
b) An estimate and its variability are often reported in either of the forms: estimate $\pm$ S.E.
or estimate $\pm$ 2(S.E.). In reporting numerical results such as $53.4 \pm 4.6$, we must
specify whether 4.6 represents S.E., 2(S.E.), or some other multiple of the standard error.

7.3 Determining the sample size (Large samples)

During the planning stage of an investigation it is important to address the question of sample
size. Because sampling is costly and time consuming, the investigator wants to know beforehand
the sample size required to give the desired precision.

In order to determine how large a sample is needed for estimating a population mean, we must
specify

d = the desired error margin
and
$1-\alpha$ = the probability associated with the error margin.

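Referring to the expression for the $100(1-\alpha)\%$ error margin, we equate

$$z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = d$$

This gives an equation in which n is the unknown. Solving for n,

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{d}\right)^2$$

which determines the required sample size (rounded up to the next whole number). This sample size
is valid provided $n \geq 30$, so that the normal approximation to the distribution of $\bar{X}$ is
satisfactory.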

To be $100(1-\alpha)\%$ sure that the error of estimation $|\bar{X} - \mu|$ does not exceed d, the required
sample size is

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{d}\right)^2$$

If $\sigma$ is completely unknown, a small-scale preliminary sampling is necessary to obtain an
estimate of $\sigma$ to be used in the formula to compute n.

Example 4
A limnologist wishes to estimate the mean phosphate content per unit volume of lake water. It is
known from studies in previous years that the standard deviation has a fairly stable value of
$\sigma = 4$. How many water samples must the limnologist analyze to be 90% certain that the error of
estimation does not exceed 0.8?

Solution:
Here $\sigma = 4$ and $1-\alpha = 0.90$, so $\alpha/2 = 0.05$. The upper 0.05 point of the $N(0, 1)$ distribution is
$z_{0.05} = 1.645$. The tolerable error is d = 0.8. Computing

$$n = \left(\frac{1.645 \times 4}{0.8}\right)^2 = 67.65$$

and rounding up, the required sample size is n = 68.
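The sample-size formula can be wrapped in a small helper. This is a sketch under the large-sample assumptions stated above; the function name `required_sample_size` is an illustrative choice. Because it uses the exact normal quantile rather than the rounded table value 1.645, the unrounded n differs slightly in the second decimal place from the hand calculation, but the rounded-up answer is the same.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, d, confidence):
    """Return (unrounded n, n rounded up) from n = (z_{alpha/2} * sigma / d)^2."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_exact = (z * sigma / d) ** 2
    # A sample size cannot be fractional, so round up to guarantee the margin.
    return n_exact, math.ceil(n_exact)


# Example 4: sigma = 4, tolerable error d = 0.8, 90% certainty.
n_exact, n = required_sample_size(sigma=4, d=0.8, confidence=0.90)
print(f"unrounded n = {n_exact:.2f}, required sample size = {n}")
```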

Tutorial Exercise
1. For estimating a population mean with the sample mean $\bar{X}$, find (i) the standard error of $\bar{X}$ and (ii)
the $100(1-\alpha)\%$ error margin in each case:
a) n = 138, $\sigma$ = 22, $1-\alpha$ = 0.95
b) n = 320, $\sigma$ = 56, $1-\alpha$ = 0.92
2. Determine the point estimate of the population mean and its $100(1-\alpha)\%$ error margin in each case:
a) n = 150, $\bar{x}$ = 86.2, s = 9.56, $1-\alpha$ = 0.975
b) n = 1100, $\bar{x}$ = 0.728, s = 0.085, $1-\alpha$ = 0.90
3. Consider the problem of estimating a population mean $\mu$ based on a random sample of size n from the
population. Compute a point estimate of $\mu$ and the estimated standard error in each of the following
cases:
a) n = 70, $\sum x_i = 852$, $\sum (x_i - \bar{x})^2 = 215$
b) n = 160, $\sum x_i = 1985$, $\sum (x_i - \bar{x})^2 = 475$
4. Data on the average weekly earnings were obtained from a survey of 50 nonsupervisory production
workers in a mining industry. The sample mean and standard deviation were found to be $650 and
$35, respectively. Estimate the true mean weekly earnings and determine the 95% error margin.
5. When estimating $\mu$ from a large sample, suppose that one has found the 95% error margin of $\bar{X}$ to
be 3.26. From this information, determine the:
a) Estimated S.E. of $\bar{X}$
b) 90% error margin
6. When estimating the mean of a population, how large a sample is required in order that the 95% error
margin be:
a) 1/8 of the population standard deviation?
b) 15% of the population standard deviation?
7. Referring to Exercise 4, suppose that the survey of 50 workers was, in fact, a pilot study intended to
give an idea of the population standard deviation. Assuming $\sigma$ = $35, determine the sample size that
is needed for estimating the population mean weekly earnings with a 98% error margin of $3.50.

References:

Johnson, R. A. and Bhattacharyya, G. K. (1992). Statistics: Principles and Methods (2nd Edition).
John Wiley & Sons, New York.

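Model Solutions:

1. (a) S.E. = 1.873, 95% error margin = 3.67 (b) S.E. = 3.130, 92% error margin = 5.48

2. (a) Point estimate $\bar{x} = 86.2$; estimated S.E. $= 9.56/\sqrt{150} = 0.781$; error margin $= 2.24 \times (9.56/\sqrt{150}) = 1.75$

(b) Point estimate $\bar{x} = 0.728$; estimated S.E. $= 0.085/\sqrt{1100} = 0.00256$; error margin $= 1.645 \times 0.00256 = 4.22 \times 10^{-3}$

3. (a) $\bar{x} = 12.17$, estimated S.E. = 0.211 (b) $\bar{x} = 12.41$, estimated S.E. = 0.136

4. (a) Point estimate $\bar{x}$ = $650; estimated S.E. $= 35/\sqrt{50} = 4.95$ (b) Error margin $= z_{0.025} \times 35/\sqrt{50} = 1.96 \times 4.95$ = $9.70

5. (a) $3.26 = 1.96 \times \sigma/\sqrt{n}$, so $\sigma/\sqrt{n} = 1.6633$ (b) Error margin $= z_{0.05} \times \sigma/\sqrt{n} = 1.645 \times 1.6633 = 2.736$

6. (a) $n = \left(\dfrac{1.96\,\sigma}{\sigma/8}\right)^2 = 245.9$, so n = 246 (b) $n = \left(\dfrac{1.96\,\sigma}{0.15\,\sigma}\right)^2 = 170.7$, so n = 171

7. With $z_{0.01} = 2.33$: $n = \left(\dfrac{2.33 \times 35}{3.50}\right)^2 = 542.89$, so n = 543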
