
Homework Questions

This document contains the homework problems for a statistics course. The problems cover topics like descriptive statistics, sampling distributions, maximum likelihood estimation, consistency, unbiasedness, the Cramer-Rao inequality, and uniformly minimum variance unbiased estimation. Some of the problems involve finding estimators for parameters of distributions like the Poisson, uniform, exponential, and normal distributions. Other problems involve determining whether estimators are complete or sufficient statistics, properties of order statistics, and deriving distributions of test statistics.

Uploaded by

polar neckson

M 305: Statistics Odd 2020

Homework 1
(descriptive statistics, sampling distributions)

1. The average particulate concentration, in micrograms per cubic meter, was measured in a petrochemical
complex at 36 randomly chosen times, with the following concentrations resulting:

5, 18, 15, 7, 23, 220, 130, 85, 103, 25, 80, 7, 24, 6, 13, 65, 37, 25,
24, 65, 82, 95, 77, 15, 70, 110, 44, 28, 33, 81, 29, 14, 45, 92, 17, 53

(a) Obtain a grouped frequency distribution with number of classes = 5.


(b) Draw a histogram corresponding to the grouped frequency distribution.
(c) What can you say about the skewness of the histogram?
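
One way to start part (a) is sketched below. It assumes five equal-width classes spanning the observed minimum to the observed maximum; the problem does not fix the class boundaries, so other choices (for example, rounded boundaries) are equally valid. The function name `grouped_frequencies` is illustrative only.

```python
# Particulate concentrations from Problem 1 (36 readings).
data = [5, 18, 15, 7, 23, 220, 130, 85, 103, 25, 80, 7, 24, 6, 13, 65, 37, 25,
        24, 65, 82, 95, 77, 15, 70, 110, 44, 28, 33, 81, 29, 14, 45, 92, 17, 53]


def grouped_frequencies(values, k):
    """Count values in k equal-width classes spanning [min, max].

    Assumption: classes are left-closed, and the maximum value is placed
    in the last class rather than opening a (k+1)-th class.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        idx = min(int((v - lo) / width), k - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts


edges, counts = grouped_frequencies(data, 5)

# A crude text histogram: one asterisk per observation.
for left, right, c in zip(edges, edges[1:], counts):
    print(f"[{left:6.1f}, {right:6.1f})  {c:2d}  {'*' * c}")
```

The text histogram printed at the end gives a quick visual impression of the shape asked about in part (c).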

2. Suppose data values {y1, . . . , yn} are obtained from the data values {x1, . . . , xn} using a linear
transformation yi = α + βxi where i = 1, . . . , n and α and β are constants.

(a) Find the relation between the sample means x̄ and ȳ, i.e., express ȳ in terms of x̄.
(b) What is the relation between the sample medians x̃ and ỹ? Express ỹ in terms of x̃.
(c) Determine the relation between the sample variances $S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ and
$S_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$.
(d) If β < 0, show all steps to find the sample correlation coefficient r of the paired observations
(xi , yi ), i = 1, . . . , n.

3. (a) Find the sample correlation coefficient for the paired data

    x   -3   -2   -1    1    2    3
    y    9    4    1    1    4    9

(b) Draw a scatter diagram. Do you see any pattern that the points are following?
(c) Show that the sample correlation coefficient can be alternatively expressed as

$$r = \frac{\sum_{i=1}^{n} x_i y_i - n\bar{x}\bar{y}}{\sqrt{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}\,\sqrt{\sum_{i=1}^{n} y_i^2 - n\bar{y}^2}}.$$

(d) Suppose a survey finds that each woman marries a man who is 4 years older than she is.
What would be the correlation coefficient between the ages of husbands and wives?
(e) State true or false, and explain briefly: “If y is usually less than x, the correlation between x and
y will be negative.”
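
The computational formula in part (c) can be tried out numerically. The sketch below applies it to the data of part (a) and to a made-up set of ages for part (d); the ages are hypothetical and serve only to illustrate an exact linear relation with positive slope.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient via the computational formula of part (c)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar
    sxx = sum(x * x for x in xs) - n * x_bar ** 2
    syy = sum(y * y for y in ys) - n * y_bar ** 2
    return sxy / (math.sqrt(sxx) * math.sqrt(syy))


# Part (a): the points lie on the parabola y = x^2.
x = [-3, -2, -1, 1, 2, 3]
y = [9, 4, 1, 1, 4, 9]
r_parabola = sample_correlation(x, y)

# Part (d): hypothetical wives' ages; each husband is exactly 4 years older.
wives = [22, 25, 31, 38, 44]
husbands = [w + 4 for w in wives]
r_ages = sample_correlation(wives, husbands)

print(r_parabola, r_ages)
```

The first value shows that a correlation of zero does not rule out a strong nonlinear pattern, which is worth keeping in mind for part (b).
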
4. Sample variance involves the sum of all squared deviations from the mean. For a data set x1, . . . , xn,
a new measure of spread based on the “sum of all squared deviations from the median” is defined as

$$\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \tilde{x})^2, \quad \text{where } \tilde{x} \text{ is the sample median.}$$

Prove that the new measure is never smaller than the usual sample variance. In other
words, prove that

$$\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \tilde{x})^2 \;\ge\; \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \quad \text{where } \bar{x} \text{ is the sample mean.}$$
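
A numerical check is not a proof, but it can make the inequality concrete. The sketch below uses the first ten readings from Problem 1 of this homework as an arbitrary test sample and compares the two sums of squared deviations; it also checks the standard decomposition of a sum of squares about an arbitrary centre c, namely Σ(xi − c)² = Σ(xi − x̄)² + n(x̄ − c)².

```python
import statistics


def sum_sq_dev(values, center):
    """Sum of squared deviations of the values from a given centre."""
    return sum((v - center) ** 2 for v in values)


# First ten readings from Problem 1, used here only as a test sample.
sample = [5, 18, 15, 7, 23, 220, 130, 85, 103, 25]
n = len(sample)

mean = statistics.mean(sample)
median = statistics.median(sample)

about_mean = sum_sq_dev(sample, mean)
about_median = sum_sq_dev(sample, median)

# The gap between the two sums should equal n * (mean - median)^2.
gap = about_median - about_mean
print(mean, median, gap, n * (mean - median) ** 2)
```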

5. A die is rolled. Let X be the face value that turns up, and let X1, X2 be two independent observations
on X. Find the probability mass function of the sample mean X̄ = (X1 + X2)/2.
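
An answer to Problem 5 can be checked by brute-force enumeration. The sketch below assumes a fair six-sided die (the problem does not say so explicitly) and lists all 36 equally likely outcomes, using exact fractions to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: the die is fair, so all 36 ordered pairs are equally likely.
faces = range(1, 7)
counts = Counter(Fraction(a + b, 2) for a, b in product(faces, repeat=2))

# Probability mass function of the sample mean of two rolls.
pmf = {mean: Fraction(c, 36) for mean, c in sorted(counts.items())}

for mean, p in pmf.items():
    print(f"P(mean = {float(mean):3.1f}) = {p}")
```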

6. Let X1, . . . , Xn be a random sample from N(µ, σ²) and let X̄ and S², respectively, be the sample mean
and the sample variance. Let Xn+1 ∼ N(µ, σ²), and assume that X1, . . . , Xn, Xn+1 are independent.
Find the sampling distribution of $\left[(X_{n+1} - \bar{X})/S\right]\sqrt{n/(n+1)}$.

7. Let X1, . . . , Xn be a random sample from a distribution F with mean E(X1) = µ, variance V(X1) = σ²
and fourth central moment E[(X1 − µ)⁴] = µ4 < ∞. If S² denotes the sample variance of the above
random sample, then show that

$$E(S^2) = \sigma^2 \quad \text{and} \quad V(S^2) = \frac{\mu_4}{n} - \frac{n-3}{n(n-1)}\,\sigma^4.$$

8. Let X1, X2, . . . , X10 be a random sample from a standard normal distribution. Find the numbers a
and b such that

$$P\left(a \le \sum_{i=1}^{10} X_i^2 \le b\right) = 0.95.$$

9. A particular type of vacuum-packed coffee packet contains an average of 16 ounces. It has been
observed that the number of ounces of coffee in these packets is normally distributed with σ = 1.41
ounce. A random sample of 15 of these coffee packets is selected, and the observations are used to
calculate s. Find the numbers a and b such that $P(a \le S^2 \le b) = 0.90$.

10. A psychologist claims that the mean age at which female children start walking is 11.4 months. If 20
randomly selected female children are found to have started walking at a mean age of 11.5 months
with standard deviation of 2 months, would you agree with the psychologist’s claim? Assume that the
sample came from a normal population.

11. Let X and Y be independent random variables from an exponential distribution with common parameter
θ = 1. Show that X/Y has an F(2, 2) distribution.

12. Let X1 , . . . , Xn be a random sample from exponential distribution with a mean of θ. Show that
Y1 = min (X1 , X2 , . . . , Xn ) has an exponential distribution with mean θ/n. Also, find the pdf of
Yn = max (X1 , X2 , . . . , Xn )
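
The first claim in Problem 12 can be sanity-checked by simulation before attempting the proof. The sketch below is a Monte Carlo estimate of E(Y1); the values θ = 2 and n = 5, the number of replications, and the seed are arbitrary choices made for illustration.

```python
import random


def simulate_min_mean(theta, n, reps, seed=0):
    """Monte Carlo estimate of E[min(X1, ..., Xn)] for exponential data with mean theta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        # random.expovariate takes the rate, i.e. 1 / mean.
        total += min(rng.expovariate(1.0 / theta) for _ in range(n))
    return total / reps


theta, n = 2.0, 5  # arbitrary illustrative values
estimate = simulate_min_mean(theta, n, reps=20000)
print(f"simulated mean of Y1: {estimate:.4f}, claimed mean theta/n: {theta / n:.4f}")
```

Agreement of the two printed numbers supports the claimed mean only; showing that Y1 is exponentially distributed still requires the argument through P(Y1 > y).
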
M 305: Statistics Odd 2021

Homework 2
(MME, MLE, consistency, unbiasedness, Cramer-Rao inequality, UMVUE)

1. Let X ∼ Pois(λ). Show that an unbiased estimator of τ(λ) = 1/λ does not exist. Based on X1, . . . , Xn,
an i.i.d. random sample from Pois(λ), find a consistent estimator of τ(λ) based on a sufficient statistic
for λ.

2. Let X1 , . . . , Xn be i.i.d. observations having common density function

$$f(x \mid \beta) = \alpha_0\, \beta^{-\alpha_0}\, x^{\alpha_0 - 1} \ \text{ if } 0 < x < \beta, \text{ and } 0 \text{ otherwise,}$$

where α0 is a known positive constant and β > 0 is the unknown parameter.

(a) Show that ((α0 + 1)X̄)/α0 is both an unbiased and a consistent estimator of β, where X̄ is the sample mean.
(b) Find the UMVUE of β.
(c) Is the UMVUE a consistent estimator of β? Which estimator of β (among the above two) should
we prefer?

3. Let X be one observation from the pdf

$$f(x \mid \theta) = \left(\frac{\theta}{2}\right)^{|x|} (1 - \theta)^{1 - |x|}\, I_{\{x \in \{-1, 0, 1\}\}}, \quad 0 < \theta < 1.$$

(a) Prove or disprove whether X is a complete sufficient statistic for θ.


(b) Prove or disprove whether |X| is a complete sufficient statistic for θ.
(c) Does f (x|θ) belong to an exponential family?

4. Let X1, . . . , Xn be i.i.d. random variables from Uniform(θ − 1/2, θ + 1/2) where −∞ < θ < ∞. Show
that T(X) = (X(1), X(n)) is a minimal sufficient statistic for θ but is not complete.

5. Let X1, . . . , Xn be i.i.d. observations from an exponential distribution with pdf

$$f(x \mid \mu, \sigma) = \frac{1}{\sigma} \exp\left(-\frac{x - \mu}{\sigma}\right) I_{\{x \ge \mu\}}, \quad \text{where } -\infty < \mu < \infty,\ \sigma > 0.$$

(a) Show that $Y_1 = \frac{2}{\sigma}(X_1 - \mu)$ follows a $\chi^2_2$ distribution.


(b) Find a complete and sufficient statistic for µ when σ is known.
(c) Find $E\left[X_{(1)} \sum_{i=1}^{n}\left(X_i - X_{(1)}\right)\right]$.

(d) Show that $\frac{2}{\sigma}\sum_{i=1}^{n}\left(X_i - X_{(1)}\right)$ follows a $\chi^2_{2n-2}$ distribution.

6. Suppose X1, . . . , Xn are i.i.d. double exponential random variables with density function
$f(x \mid \sigma) = \frac{1}{2\sigma} e^{-|x|/\sigma}$ where σ > 0. Find $E\left[\dfrac{|X_1|}{\sum_{i=1}^{n} |X_i|}\right]$.

7. For each of the following pdfs, let X1, . . . , Xn be a sample from that distribution. In each case, find
the best estimator (i.e., UMVUE) of $\tau(\theta) = \theta^r$ for r > 0.
(a) $f(x \mid \theta) = \frac{1}{2\theta}$, −θ < x < θ, θ > 0
(b) $f(x \mid \theta) = \theta^x (1 - \theta)^{1 - x}$, x = 0, 1, 0 < θ < 1, and r is a positive integer less than n.

8. For n ≥ 2, let X1, . . . , Xn be i.i.d. N(µ, σ²). Find the UMVUE of $\tau(\mu, \sigma) = \mu\sigma^p$ where p is a known
positive constant. (Hint: use Basu’s theorem.)

9. Suppose X1, . . . , Xn are i.i.d. observations with pdf $f(x \mid \theta) = (\theta + 1)\, x^{\theta}\, I_{\{0 < x < 1\}}$ where θ > 0.

(a) Show that − log X1 follows an exponential distribution with rate parameter θ + 1.
(b) Show that $T = \frac{1}{n}\sum_{i=1}^{n} \log\left(\frac{1}{X_i}\right)$ is the UMVUE of $\frac{1}{\theta + 1}$.
(c) Show that if a UMVUE exists, it must be unique.
(d) Compute E [log(X1 ) | T ].

10. Let X1 , . . . , Xn be a random sample (i.i.d.) from N (µ0 , σ 2 ) where µ0 is assumed known.

(a) Find the Cramer-Rao lower bound (CRLB) for unbiased estimators of σ 2 .
(b) Show that the variance of the UMVUE of σ 2 attains the lower bound (CRLB).

11. Let X1, . . . , Xn be i.i.d. random variables with common density f(x|θ). Find a maximum likelihood
estimator (MLE) and a method of moments estimator (MME) for θ in each of the following cases:

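(a) $f(x \mid \theta) = \theta (1 - x)^{\theta - 1}\, I_{\{0 < x < 1\}}$ where θ > 1.

(b) $f(x \mid \mu, \sigma^2) = (\sigma x \sqrt{2\pi})^{-1} \exp\left\{-\frac{1}{2\sigma^2}(\log x - \mu)^2\right\} I_{\{x > 0\}}$ where $\theta = (\mu, \sigma^2) \in \mathbb{R} \times \mathbb{R}^{+}$.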

12. Suppose that X1 , . . . , Xn are i.i.d. copies of a random variable X having N (µ, 4) distribution. But,
instead of recording all the observations X1 , . . . , Xn , one notes only whether or not the observations
are greater than 0. If {Xi > 0} occurs m(< n) times, find the MLE of µ.

13. Let X1, . . . , Xn be i.i.d. observations from the Uniform(−θ, θ) distribution where θ > 0. Find the MLE
of θ. Is it a function of a sufficient statistic for θ?

14. Let X1, . . . , Xn be a random sample from an exponential distribution with location and scale parameters θ
and σ respectively. The pdf of Xi is given by

$$f(x \mid \theta, \sigma) = \frac{1}{\sigma} \exp\left(-\frac{x - \theta}{\sigma}\right) I_{\{x \ge \theta\}},$$

where −∞ < θ < ∞ and σ > 0.
(a) Find the MLE of (θ, σ) and g(θ, σ) = θ + σ.
(b) Show that the MLE is a consistent estimator of (θ, σ).
(c) Find the method of moments estimator (MME) of (θ, σ) and show that it is also consistent.

15. Let X1, . . . , Xn be i.i.d. random variables from N(µ, σ²), where µ is known to be restricted to [0, ∞)
and σ > 0. Find the MLE of (µ, σ²).
M 305: Statistics Odd 2021

Homework 3
(Interval estimation)

1. A random sample of size 50 from a particular brand of 16-ounce tea packets produced a mean weight
of 15.65 ounces. Assume that the weights of this brand of tea packets are normally distributed with a
standard deviation of 0.59 ounce. Find a 95% confidence interval for the true mean µ.
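
Because σ is given in Problem 1, the usual known-variance interval x̄ ± z·σ/√n applies. The sketch below evaluates it with the numbers from the problem; the helper name `z_interval` is illustrative, and the normal quantile comes from Python's standard `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def z_interval(mean, sigma, n, level):
    """Two-sided confidence interval for a normal mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for level = 0.95
    half_width = z * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


lo, hi = z_interval(mean=15.65, sigma=0.59, n=50, level=0.95)
print(f"95% CI for the mean weight: ({lo:.4f}, {hi:.4f})")
```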

2. Let X1 , . . . , Xn be a random sample from a Poisson distribution with parameter λ.

(a) Construct a 90% confidence interval for λ.


(b) Suppose that the number of raisins in a bowl of a particular brand of cereal is observed to be
25. Assuming that the number of raisins in a bowl is Poisson distributed, estimate the expected
number of raisins per bowl with a 90% confidence interval.
(c) How many bowls of cereal need to be sampled in order to estimate the expected number of raisins
per bowl with a standard error of less than 0.2?

3. Many mutual funds use an investment approach involving owning stocks whose price/earnings
multiples (P/Es) are less than the P/E of the S&P 500. The following data give P/Es of 49
companies a randomly selected mutual fund owns in a particular year.

6.8 5.6 8.5 8.5 8.4 7.5 9.3 9.4 7.8 7.1
9.9 9.6 9.0 9.4 13.7 16.6 9.1 10.1 10.6 11.1
8.9 11.7 12.8 11.5 12.0 10.6 11.1 6.4 12.3 12.3
11.4 9.9 14.3 11.5 11.8 13.3 12.8 13.7 13.9 12.9
14.2 14.0 15.5 16.9 18.0 17.9 21.8 18.4 34.3

Find a 98% confidence interval for the mean P/E multiples. Interpret the result and state any
assumptions you have made.

4. An opinion poll conducted in March of 1996 by a newspaper (Tampa Tribune) among eligible voters
with a sample size 425 showed that the president, who was seeking reelection, had 45% support. Give
a 95% and a 98% confidence interval for the proportion of support for the president.
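
For Problem 4, one common choice is the large-sample (Wald) interval p̂ ± z·√(p̂(1 − p̂)/n). The sketch below assumes that this approximation is the intended method; other intervals for a proportion exist and give slightly different endpoints.

```python
import math
from statistics import NormalDist


def wald_interval(p_hat, n, level):
    """Large-sample (Wald) confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


ci95 = wald_interval(0.45, 425, 0.95)
ci98 = wald_interval(0.45, 425, 0.98)
print(f"95% CI: ({ci95[0]:.4f}, {ci95[1]:.4f})")
print(f"98% CI: ({ci98[0]:.4f}, {ci98[1]:.4f})")
```

The 98% interval is wider than the 95% interval, as expected when more confidence is demanded from the same sample.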

5. In a large university, the following are the ages of 20 randomly chosen employees:

24 31 28 43 28 56 48 39 52 32
38 49 51 49 62 33 41 58 63 56
Assuming that the data come from a normal population, construct a 95% confidence interval for the
population mean µ of the ages of the employees of this university. Interpret your answer.
6. Let a random sample of size 17 from a normal population for which both mean µ and variance σ² are
unknown yield x̄ = 3.12 and s² = 1.04. Determine a 99% confidence interval for µ.

7. Air pollution in large U.S. cities is monitored to see whether it conforms to requirements set by the
Environmental Protection Agency. The following data, expressed as an air pollution index, give the
air quality of a city for 10 randomly selected days.

56.23 57.12 57.7 65.80 59.40


62.90 58.00 64.56 63.92 63.45

Assuming that the data may be viewed as a random sample from a normal population, construct a
99% confidence interval for the actual variance of the air pollution index for this city and interpret.

8. A pharmaceutical company tested a new drug to be marketed for the treatment of a particular type of
virus. In order to obtain an estimate on the mean recovery time, this drug was tested on 15 volunteer
patients, and the recovery time (in days) was recorded. The following data were obtained.

8 17 10 6 34 11 13 6 9 8
0 17 10 0 54 19 4 12 17 7

(a) Obtain a 95% confidence interval estimate of the mean recovery.


(b) What assumptions do we need to make? Test for these assumptions.

9. Let the random variables X1 and X2 follow binomial distributions that have parameters n1 = 100 and
n2 = 75. Let x1 = 35 and x2 = 27 be observed values of X1 and X2. Let p1 and p2 be the true proportions.
Determine an appropriate 95% confidence interval for p1 − p2 .

10. The following information is obtained from two independent samples selected from two populations.
n1 = 40 x̄1 = 28.4 s1 = 4.1 n2 = 32 x̄2 = 25.6 s2 = 4.5

(a) What is the maximum likelihood estimator of µ1 − µ2 ?


(b) Construct a 99% confidence interval for µ1 − µ2 .

11. The management of a supermarket wanted to study the spending habits of its male and female
customers. A random sample of 16 male customers who shopped at this supermarket showed that they
spent an average of $55 with a standard deviation of $12. Another random sample of 25 female
customers showed that they spent an average of $85 with a standard deviation of $20.50. Assuming that
the amounts spent at this supermarket by all its male and female customers were approximately normally
distributed, construct a 90% confidence interval for the ratio of variances in spending for males and
females, $\sigma_1^2 / \sigma_2^2$.
M 305: Statistics Odd 2021

Homework 4
(Testing of hypothesis)

1. It is suspected that a coin is not balanced (not fair). Let p be the probability of tossing a head. To
test H0 : p = 0.5 against the alternative hypothesis Ha : p > 0.5, a coin is tossed 15 times. Let Y equal
the number of times a head is observed in the 15 tosses of this coin. Assume the rejection region to
be {Y ≥ 10}.
(a) Find type I error α.
(b) Find type II error β for p = 0.7.
(c) Find type II error β for p = 0.6.
(d) Find the rejection region of the form {Y ≥ K} for α = 0.01, and α = 0.03.
(e) For the alternative Ha : p = 0.7, find β for the values of α given in (d).
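
The error probabilities in Problem 1 are binomial tail sums and can be evaluated exactly. The sketch below uses only `math.comb`. For part (d) it assumes that "rejection region for α" means the largest region {Y ≥ K} whose size does not exceed α, since a discrete test generally cannot hit α exactly; the helper names are illustrative.

```python
from math import comb


def upper_tail(n, k, p):
    """P(Y >= k) for Y ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


n = 15

# (a) Type I error of the region {Y >= 10} under H0: p = 0.5.
alpha = upper_tail(n, 10, 0.5)

# (b), (c) Type II errors: probability of NOT rejecting when p = 0.7 or p = 0.6.
beta_07 = 1 - upper_tail(n, 10, 0.7)
beta_06 = 1 - upper_tail(n, 10, 0.6)


def smallest_cutoff(n, p0, alpha_max):
    """Smallest K with P(Y >= K | p0) <= alpha_max (assumed reading of part (d))."""
    for k in range(n + 2):
        if upper_tail(n, k, p0) <= alpha_max:
            return k


k_001 = smallest_cutoff(n, 0.5, 0.01)
k_003 = smallest_cutoff(n, 0.5, 0.03)

print(f"alpha = {alpha:.4f}, beta(0.7) = {beta_07:.4f}, beta(0.6) = {beta_06:.4f}")
print(f"K for alpha <= 0.01: {k_001}, K for alpha <= 0.03: {k_003}")
```

Part (e) then follows by calling `1 - upper_tail(n, K, 0.7)` with each cutoff K found in part (d).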

2. Let X1 , . . . , Xn be a random sample from a Poisson distribution with mean λ. Find a best critical
region for testing H0 : λ = 3 against Ha : λ = 6

3. A clinical oncology program developed a set of guidelines for their cancer patients to follow. It is
believed that the proportion of patients who are still living after 24 months is greater for those who
follow the guidelines. Of the 40 patients who followed the guidelines, 30 are still living after 24
months, whereas of 32 patients who did not follow the guidelines, 21 are living after 24 months. Find
a likelihood ratio test at α = 0.01 to decide whether the program is effective.

4. A check-cashing service company found that approximately 7% of all checks submitted to the service
were without sufficient funds. After instituting a random check verification system to reduce its losses,
the service company found that only 70 were rejected in a random sample of 1125 that were cashed.
Is there sufficient evidence that the check verification system reduced the proportion of bad checks at
α = 0.01? What is the p-value associated with the test? What would you conclude at the α = 0.05
level?
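
One way to approach Problem 4 is a lower-tailed large-sample z-test of H0 : p = 0.07 against Ha : p < 0.07. The sketch below assumes that this normal approximation is acceptable for n = 1125 and computes the test statistic and p-value from the numbers in the problem.

```python
import math
from statistics import NormalDist

p0 = 0.07          # historical proportion of bad checks
n, bad = 1125, 70  # sample size and number rejected after the new system

p_hat = bad / n
# Standard error under H0, as is usual for a one-sample proportion z-test.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Lower-tailed test: small values of z support Ha: p < 0.07.
p_value = NormalDist().cdf(z)

print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, p-value = {p_value:.4f}")
for level in (0.01, 0.05):
    decision = "reject H0" if p_value < level else "do not reject H0"
    print(f"alpha = {level}: {decision}")
```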

5. A manufacturer of washers provides a particular model in one of three colors, white, black, or ivory.
Of the first 1500 washers sold, it is noticed that 550 were of ivory color. Would you conclude that
customers have a preference for the ivory color? Justify your answer. Use α = 0.01.

6. A company that manufactures precision special-alloy steel shafts claims that the variance in the
diameters of shafts is no more than 0.0003. A random sample of 10 shafts gave a sample variance of
0.00027. At the 5% level of significance, test whether the company’s claim can be substantiated.

7. The following information was obtained from two independent samples selected from two normally
distributed populations with unknown but equal variances.
Sample 1 14 15 11 14 10 8 13 10 12 16 15
Sample 2 17 16 21 12 20 18 16 14 21 20 13 20 13
Test at the 2% significance level whether µ1 is lower than µ2 .

8. Suppose we want to know the effect on driving of a drug for cold and allergy, in a study in which
the same people were tested twice, once after 1 hour of taking the drug and once when no drug is
taken. Suppose we obtain the following data, which represent the number of cones (placed in a certain
pattern) knocked down by each of the nine individuals before taking the drug and after an hour of
taking the drug.
No drug 0 0 3 2 0 0 3 3 1
After drug 1 5 6 5 5 5 6 1 6
Assuming that the difference of each pair is coming from an approximately normal distribution, test
if there is any difference in the individuals’ driving ability under the two conditions. Use α = 0.05.

9. The IQs of 17 students from one area of a city showed a mean of 106 with a standard deviation of 10,
whereas the IQs of 14 students from another area showed a mean of 109 with a standard deviation of
7. Test for equality of variances between the IQs of the two groups at α = 0.02

10. A random sample was taken of 300 undergraduate students from a university. The students in the
sample were classified according to their gender and according to the choice of their major. The result
is given in the following table.

                              College
Gender    Arts and sciences   Engineering   Business   Other   Total
Male             75               40           24        66     205
Female           45               12           15        23      95
Total           120               52           39        89     300

Test the hypothesis that the choice of the major by undergraduate students in this university is
independent of their gender. Use α = 0.01.
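
The Pearson chi-square statistic for Problem 10 can be computed directly from the table. The sketch below builds the expected counts from the row and column totals; the critical value 11.345 is the tabulated upper 1% point of the chi-square distribution with 3 degrees of freedom, entered by hand because the Python standard library has no chi-square quantile function.

```python
# Observed counts: rows are Male, Female; columns are the four colleges.
observed = [
    [75, 40, 24, 66],
    [45, 12, 15, 23],
]


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


stat, df = chi_square_independence(observed)

# Tabulated upper 1% point of chi-square with 3 degrees of freedom.
CRITICAL_001_DF3 = 11.345
reject = stat > CRITICAL_001_DF3

print(f"chi-square = {stat:.3f} on {df} df; reject H0 at 0.01: {reject}")
```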

11. The speeds of vehicles (in mph) passing through a section of Highway 75 are recorded for a random
sample of 150 vehicles and are given below. Test the hypothesis that the speeds are normally distributed
with a mean of 70 and a standard deviation of 4. Use α = 0.01.

Range     40 − 55   56 − 65   66 − 75   76 − 85   > 85
Number       12        14        78        40        6

12. A survey of footwear preferences of a random sample of 100 undergraduate students (50 females and 50
males) from a large university resulted in the following data.

          Boots   Leather shoes   Sneakers   Sandals   Other
Female      12          9            12         10        7
Male        10         12            17          7        4

(a) Let pi, i = 1, 2, 3, 4, 5, represent the respective true proportions of students with a particular
footwear preference, and let H0 : p1 = 0.20; p2 = 0.20; p3 = 0.30; p4 = 0.20; p5 = 0.10 versus Ha :
at least one of the probabilities is different from the hypothesized value. Test this hypothesis using
α = 0.05.
(b) Test the hypothesis that the choice of footwear by undergraduate students in this
university is independent of their gender, using α = 0.05.
