Assignments

The document outlines an assignment for a statistics class at the Institute of Technology of Cambodia, covering various statistical analyses and hypothesis testing scenarios. It includes tasks related to soccer team age data, textile fiber elongation tests, sodium content in cornflakes, tire life studies, and comparisons of weights in baby boys and girls. Each section requires calculations of means, variances, hypothesis testing, and confidence intervals based on provided data sets.

Uploaded by

speiizeth11

Institute of Technology of Cambodia Engineering Class

Assignment for I3
Subject: Statistics

1. The Rochester Raging Rhinos Professional Soccer Team is hoping for a good 2010 season. The
blend of experienced and young, energetic players should make for a solid team. The current
ages for the team are:

23 24 25 32 30 20 31 24 30 24
33 36 30 20 25 26 30 31 23 24

Table 1: Data Table

(a) Find the mean, median, mode, variance, standard deviation, Q1, Q2, Q3 and IQR.
(b) Construct a grouped frequency histogram using classes 19-21, 21-23, and so on.
(c) Describe the distribution shown in the histogram.
(d) Based on the histogram and its shape, what would you predict for the mean and the
median? Which would be higher? Why?
(e) Compute the mean and median. Compare answers to your predicted values in part (d).
(f) Which measure of central tendency provides the best measure of the center? Why?
(g) Draw a box plot and comment on the shape of the distribution.
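
The summary statistics asked for in Exercise 1(a) can be checked with a short script. This is a minimal sketch using only Python's standard library. The quartile rule (`method="exclusive"`) is an assumption: textbooks differ on how quartiles are interpolated, so Q1 and Q3 may differ slightly from a hand calculation.

```python
import statistics

# Ages of the 20 players, copied from Table 1.
ages = [23, 24, 25, 32, 30, 20, 31, 24, 30, 24,
        33, 36, 30, 20, 25, 26, 30, 31, 23, 24]

mean = statistics.mean(ages)
median = statistics.median(ages)
modes = statistics.multimode(ages)      # every value tied for most frequent
variance = statistics.variance(ages)    # sample variance (divisor n - 1)
sd = statistics.stdev(ages)             # sample standard deviation

# Quartiles under the "exclusive" interpolation rule (an assumption;
# other conventions give slightly different Q1 and Q3 on small samples).
q1, q2, q3 = statistics.quantiles(ages, n=4, method="exclusive")
iqr = q3 - q1

print(f"mean={mean:.2f} median={median} modes={modes}")
print(f"variance={variance:.4f} sd={sd:.4f}")
print(f"Q1={q1} Q2={q2} Q3={q3} IQR={iqr}")
```

The data have two modes, which is worth keeping in mind when describing the histogram in parts (c) and (d).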

2. A textile fiber manufacturer is investigating a new drapery yarn, which the company claims
has a mean thread elongation of 12 kilograms with a standard deviation of 0.5 kilograms. The
company wishes to test the hypothesis

H0 : µ = 12 against H1 : µ < 12

(a) What is the type I error probability if the critical region is defined as x̄ < 11.5 kilograms?
(b) Find β for the case in which the true mean elongation is 11.25 kilograms.
(c) Find β for the case in which the true mean is 11.5 kilograms.
(d) Repeat (a)-(c) using a sample size of n = 16 and the same critical region.
(e) Find the boundary of the critical region if the type I error probability is
– α = 0.01 and n = 4
– α = 0.05 and n = 4
– α = 0.01 and n = 16
– α = 0.05 and n = 16
(f) Calculate the probability of a type II error if the true mean elongation is 11.5 kilograms
and
– α = 0.05 and n = 4
– α = 0.05 and n = 16
– Compare the values of β calculated in the previous parts. What conclusion can you
draw?
(g) Calculate the P-value if the observed statistic is
– x̄ = 11.25
– x̄ = 11.0
– x̄ = 11.75
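
For Exercise 2, parts (a) to (c), the error probabilities follow from the normal distribution of the sample mean. The sketch below assumes a sample size of n = 4, which the exercise text does not state for these parts (it is the smaller of the two sizes used in parts (e) and (f)). The helper name `reject_probability` is purely illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma = 12.0, 0.5
n = 4            # ASSUMED sample size; not stated in parts (a)-(c)
cutoff = 11.5    # critical region: reject H0 when the sample mean < cutoff

def reject_probability(true_mean, n):
    """P(sample mean < cutoff) when the population mean is true_mean."""
    return NormalDist(true_mean, sigma / sqrt(n)).cdf(cutoff)

# (a) type I error: rejecting H0 although mu = 12
alpha = reject_probability(mu0, n)

# (b), (c) type II error: failing to reject H0 although mu < 12
beta_1125 = 1 - reject_probability(11.25, n)
beta_1150 = 1 - reject_probability(11.5, n)

print(f"alpha={alpha:.5f} beta(11.25)={beta_1125:.5f} beta(11.5)={beta_1150:.5f}")
```

Calling `reject_probability(mu0, 16)` and the corresponding β expressions with `n = 16` covers part (d).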

3. A manufacturer produces crankshafts for an automobile engine. The wear of the crankshaft after
100,000 miles (in units of 0.0001 inch) is of interest because it is likely to have an impact on
warranty claims. A random sample of n = 15 shafts is tested and x̄ = 2.78. It is known that σ = 0.9
and that wear is normally distributed.

(a) Test H0 : µ = 3 versus H1 : µ ≠ 3 using α = 0.05.
(b) What is the power of this test if µ = 3.25?
(c) What sample size would be required to detect a true mean of 3.75 if we wanted the power
to be at least 0.9?
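
The three parts of Exercise 3 can be sketched numerically as follows. The sample-size step uses the common approximation n ≈ ((z_{α/2} + z_β) σ / δ)², which ignores the far tail of the two-sided test; that choice is an assumption, not something the exercise prescribes.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()                       # standard normal distribution
mu0, sigma, n, xbar, alpha = 3.0, 0.9, 15, 2.78, 0.05
z_crit = Z.inv_cdf(1 - alpha / 2)      # two-sided critical value

# (a) z statistic and two-sided P-value
z0 = (xbar - mu0) / (sigma / sqrt(n))
p_value = 2 * (1 - Z.cdf(abs(z0)))

# (b) power when the true mean is 3.25
shift = (3.25 - mu0) * sqrt(n) / sigma
beta = Z.cdf(z_crit - shift) - Z.cdf(-z_crit - shift)
power = 1 - beta

# (c) approximate sample size to detect mu = 3.75 with power 0.9
delta = 3.75 - mu0
n_required = ceil(((z_crit + Z.inv_cdf(0.9)) * sigma / delta) ** 2)

print(f"z0={z0:.4f} P-value={p_value:.4f} power={power:.4f} n={n_required}")
```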

4. The sodium content of twenty 300-gram boxes of organic cornflakes was determined. The data
(in milligrams) are as follows: 131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33,
128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42, 129.53, 130.12, 129.78, 130.92.
The following tasks are to be completed:

(a) Check that sodium content is normally distributed.
(b) Can you support a claim that the mean sodium content of this brand of cornflakes differs
from 130 milligrams? Use α = 0.05. Find the P -value.
(c) Explain how the question in part (b) could be answered by constructing a two-sided
confidence interval on the mean sodium content.
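
The t statistic and the matching two-sided confidence interval for Exercise 4 can be sketched as below. Python's standard library has no t distribution, so the critical value 2.093 (the upper 2.5% point with 19 degrees of freedom) is hard-coded from a t table; an exact P-value would need a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

sodium = [131.15, 130.69, 130.91, 129.54, 129.64, 128.77, 130.72, 128.33,
          128.24, 129.65, 130.14, 129.29, 128.71, 129.00, 129.39, 130.42,
          129.53, 130.12, 129.78, 130.92]

n = len(sodium)
xbar = mean(sodium)
s = stdev(sodium)                    # sample standard deviation
mu0 = 130.0

# One-sample t statistic with n - 1 = 19 degrees of freedom
t0 = (xbar - mu0) / (s / sqrt(n))

t_crit = 2.093                       # t_{0.025,19}, taken from a t table
reject = abs(t0) > t_crit

# 95% two-sided confidence interval on the mean
half_width = t_crit * s / sqrt(n)
ci = (xbar - half_width, xbar + half_width)

print(f"xbar={xbar:.3f} s={s:.3f} t0={t0:.3f} reject={reject}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The test rejects H0 exactly when 130 falls outside the interval, which is the link between the test and the confidence interval.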

5. A research engineer for a tire manufacturer is investigating tire life for a new rubber compound
and has built 16 tires and tested them to end-of-life in a road test. The sample mean and
standard deviation are x̄ = 60,139.7 kilometers and s = 3,645.94 kilometers.

(a) Can you conclude, using α = 0.05, that the standard deviation of tire life is less than
4000 kilometers? State any necessary assumptions about the underlying distribution of
the data. Find the P -value for this test.
(b) Explain how you could answer the question in part (a) by constructing a 95% one-sided
confidence interval for σ.
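
For Exercise 5(a), the test statistic is (n − 1)s²/σ0², compared with a lower-tail chi-square critical value. A minimal sketch follows; the critical value 7.26 (lower 5% point with 15 degrees of freedom) is hard-coded from a chi-square table because the standard library has no chi-square distribution.

```python
n = 16
s = 3645.94          # sample standard deviation, kilometers
sigma0 = 4000.0      # hypothesized standard deviation, kilometers

# Test statistic for H0: sigma = 4000 against H1: sigma < 4000
chi2_0 = (n - 1) * s**2 / sigma0**2

chi2_crit = 7.26     # chi-square lower 5% point, 15 df, from a table
reject = chi2_0 < chi2_crit

print(f"chi2_0={chi2_0:.3f} critical={chi2_crit} reject={reject}")
```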

6. In a random sample of 85 automobile engine crankshaft bearings, 10 have a surface finish
roughness that exceeds the specifications. Do these data present strong evidence that the
proportion of crankshaft bearings exhibiting excess surface roughness exceeds 0.10?

(a) State and test the appropriate hypotheses using α = 0.05.
(b) If it is really the situation that p = 0.15, how likely is it that the test procedure in part
(a) will not reject the null hypothesis?
(c) If p = 0.15, how large would the sample size have to be for us to have a probability of
correctly rejecting the null hypothesis of 0.9?
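
Exercise 6 can be sketched with the large-sample normal approximation to the binomial. The approximation itself is an assumption (an exact binomial test is also possible), and the formulas for β and n are the usual one-sided approximations.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()
x, n, p0, alpha = 10, 85, 0.10, 0.05
z_alpha = Z.inv_cdf(1 - alpha)

# (a) H0: p = 0.10 against H1: p > 0.10
p_hat = x / n
z0 = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_value = 1 - Z.cdf(z0)

# (b) probability of NOT rejecting H0 when the true proportion is 0.15
p_true = 0.15
beta = Z.cdf((p0 - p_true + z_alpha * sqrt(p0 * (1 - p0) / n))
             / sqrt(p_true * (1 - p_true) / n))

# (c) sample size giving power 0.9 at p = 0.15
z_beta = Z.inv_cdf(0.9)
n_required = ceil(((z_alpha * sqrt(p0 * (1 - p0))
                    + z_beta * sqrt(p_true * (1 - p_true)))
                   / (p_true - p0)) ** 2)

print(f"z0={z0:.4f} P-value={p_value:.4f} beta={beta:.4f} n={n_required}")
```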

7. Consider the hypothesis test H0 : µ1 = µ2 against H1 : µ1 < µ2 with known standard deviations
σ1 = 10 and σ2 = 5. Suppose that the sample sizes are n1 = 10 and n2 = 15 and that x̄1 = 14.2 and
x̄2 = 19.7. Use α = 0.05.

(a) Test the hypothesis and find the P -value.
(b) Explain how the test could be conducted with a confidence interval.
(c) What is the power of the test in part (a) if µ1 is 4 units less than µ2 ?
(d) Assume that sample sizes are equal. What sample size should be used to obtain β = 0.05
if µ1 is 4 units less than µ2 ? Assume that α = 0.05.
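
A sketch of Exercise 7 as a two-sample z-test, reading σ1 = 10 and σ2 = 5 as standard deviations (so the variances are 100 and 25). That reading is an assumption about the intended meaning of the symbols.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()
sigma1, sigma2 = 10.0, 5.0      # read as standard deviations (assumption)
n1, n2 = 10, 15
xbar1, xbar2 = 14.2, 19.7
alpha = 0.05
z_alpha = Z.inv_cdf(1 - alpha)

# (a) H0: mu1 = mu2 against H1: mu1 < mu2 (lower-tailed test)
se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z0 = (xbar1 - xbar2) / se
p_value = Z.cdf(z0)
reject = z0 < -z_alpha

# (c) power when mu1 - mu2 = -4: Z0 is then N(-4/se, 1)
power = Z.cdf(-z_alpha + 4 / se)

# (d) equal sample sizes giving beta = 0.05 at a difference of 4
z_beta = Z.inv_cdf(0.95)
n_required = ceil((z_alpha + z_beta) ** 2 * (sigma1**2 + sigma2**2) / 4**2)

print(f"z0={z0:.4f} P-value={p_value:.4f} reject={reject}")
print(f"power={power:.4f} n per group={n_required}")
```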

8. Consider the hypothesis test H0 : µ1 = µ2 against H1 : µ1 ≠ µ2 . Suppose that sample sizes
n1 = 10 and n2 = 10, that x̄1 = 7.8 and x̄2 = 5.6, and that $s_1^2 = 4$ and $s_2^2 = 9$. Assume that
$\sigma_1^2 = \sigma_2^2$ and that the data are drawn from normal distributions. Use α = 0.05.

(a) Test the hypothesis and find the P -value.
(b) Explain how the test could be conducted with a confidence interval.
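
The pooled-variance t-test of Exercise 8 and its matching confidence interval can be sketched as follows. The critical value 2.101 (upper 2.5% point with 18 degrees of freedom) is hard-coded from a t table.

```python
from math import sqrt

n1, n2 = 10, 10
xbar1, xbar2 = 7.8, 5.6
s1_sq, s2_sq = 4.0, 9.0              # sample variances

# Pooled variance under the equal-variance assumption
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
se = sqrt(sp_sq * (1 / n1 + 1 / n2))

t0 = (xbar1 - xbar2) / se            # df = n1 + n2 - 2 = 18
t_crit = 2.101                       # t_{0.025,18}, from a t table
reject = abs(t0) > t_crit

# 95% CI on mu1 - mu2; H0 is rejected exactly when 0 lies outside it
diff = xbar1 - xbar2
ci = (diff - t_crit * se, diff + t_crit * se)

print(f"sp^2={sp_sq} t0={t0:.4f} reject={reject}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```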

9. Two suppliers manufacture a plastic gear used in a laser printer. The impact strength of these
gears measured in foot-pounds is an important characteristic. A random sample of 10 gears
from supplier 1 results in x̄1 = 290 and s1 = 12, and another random sample of 16 gears from
the second supplier results in x̄2 = 321 and s2 = 22.

(a) Is there evidence to support the claim that supplier 2 provides gears with higher mean
impact strength? Use α = 0.05, and assume that both populations are normally distributed
but the variances are not equal. What is the P -value for this test?
(b) Do the data support the claim that the mean impact strength of gears from supplier 2 is
at least 25 foot-pounds higher than that of supplier 1? Make the same assumptions as in
part (a).
(c) Construct a confidence interval estimate for the difference in mean impact strength, and
explain how this interval could be interpreted.
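
For Exercise 9, where the variances are not assumed equal, one common approach is the Welch statistic with Satterthwaite's approximate degrees of freedom. The sketch below uses that approach as an assumption about the intended method, and only computes the statistics; critical values and P-values would come from a t table with the degrees of freedom rounded down.

```python
from math import floor, sqrt

n1, xbar1, s1 = 10, 290.0, 12.0      # supplier 1
n2, xbar2, s2 = 16, 321.0, 22.0      # supplier 2

v1, v2 = s1**2 / n1, s2**2 / n2
se = sqrt(v1 + v2)

# (a) H0: mu1 = mu2 against H1: mu1 < mu2
t_a = (xbar1 - xbar2) / se

# Satterthwaite approximation to the degrees of freedom
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
df_rounded = floor(df)

# (b) H0: mu2 - mu1 = 25 against H1: mu2 - mu1 > 25
t_b = ((xbar2 - xbar1) - 25) / se

print(f"t(a)={t_a:.3f} df={df:.2f} (use {df_rounded}) t(b)={t_b:.3f}")
```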

10. Assume that the birth weight in grams of a baby born in the United States is $N(3315, 525^2)$
for boys and girls combined. Let X equal the weight of a baby girl who is born at home in
Ottawa County, and assume that the distribution of X is $N(\mu_X, \sigma_X^2)$.

(a) Using 11 observations of X, define the critical region for testing $H_0 : \sigma_X^2 = 525^2$ against
the alternative hypothesis $H_a : \sigma_X^2 < 525^2$ (less variation of weights of home-born babies)
if α = 0.05.
(b) Calculate the value of the test statistic and state your conclusion, using the following
weights:

3119 2657 3459 3629 3345 3629
3515 3856 3629 3345 3062

Table 2: Weights of Baby Girls (grams)
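
The test statistic for Exercise 10(b) can be computed directly from Table 2. In this sketch the critical value 3.940 (lower 5% point of the chi-square distribution with 10 degrees of freedom) is hard-coded from a table.

```python
from statistics import variance

girls = [3119, 2657, 3459, 3629, 3345, 3629,
         3515, 3856, 3629, 3345, 3062]

n = len(girls)
s_sq = variance(girls)               # sample variance (divisor n - 1)
sigma0_sq = 525**2

# Reject H0 when (n - 1) s^2 / sigma0^2 falls below the lower 5% point
chi2_0 = (n - 1) * s_sq / sigma0_sq
chi2_crit = 3.940                    # chi-square lower 5% point, 10 df
reject = chi2_0 < chi2_crit

print(f"s^2={s_sq:.2f} chi2_0={chi2_0:.3f} reject={reject}")
```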

11. Let Y equal the weight in grams of a baby boy who is born at home in Ottawa County, and
assume that the distribution of Y is $N(\mu_Y, \sigma_Y^2)$. The following weights are for 11 boys:

4082 3686 4111 3686 3175 4139
3660 3380 3298 3657 4082

Table 3: Weights of Baby Boys (grams)

(a) Test $H_0 : \sigma_Y^2 = 525^2$ against the alternative hypothesis $H_a : \sigma_Y^2 < 525^2$. Use α = 0.05.
(b) Test the equality of the variances of the weights of girls (Exercise 10) and the weights of
boys born at home in Ottawa County against a two-sided alternative.
(c) Test the equality of the means of the weights of girls and boys born at home in Ottawa
County against a two-sided alternative.

12. Let X and Y equal the forces required to pull stud No. 3 and stud No. 4 of a window that
has been manufactured for an automobile. Assume that the distributions of X and Y are
$N(\mu_X, \sigma_X^2)$ and $N(\mu_Y, \sigma_Y^2)$, respectively.

(a) If m = n = 10 observations are selected randomly, define a test statistic and a critical
region for testing H0 : µX − µY = 0 against a two-sided alternative. Let α = 0.05. Assume
that the variances are equal.
(b) Given m = 10 observations of X, namely,

111, 120, 139, 136, 138, 149, 143, 145, 111, 123

and n = 10 observations of Y , namely,

152, 155, 133, 134, 119, 155, 142, 146, 157, 149

calculate the value of the test statistic and state your conclusion clearly.
(c) What is the approximate p-value of this test?
(d) Construct box plots for each of the two sets of data. Do the box plots confirm your
decision in part (b)?
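
The pooled t statistic for Exercise 12(b) can be sketched as follows. The critical value 2.101 (upper 2.5% point with 18 degrees of freedom) is hard-coded from a t table, so only an approximate P-value bound can be read off without a statistics package.

```python
from math import sqrt
from statistics import mean, variance

x = [111, 120, 139, 136, 138, 149, 143, 145, 111, 123]   # stud No. 3
y = [152, 155, 133, 134, 119, 155, 142, 146, 157, 149]   # stud No. 4

m, n = len(x), len(y)

# Pooled variance under the equal-variance assumption of part (a)
sp_sq = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)

t0 = (mean(x) - mean(y)) / sqrt(sp_sq * (1 / m + 1 / n))
t_crit = 2.101                       # t_{0.025,18}, from a t table
reject = abs(t0) > t_crit

print(f"xbar={mean(x)} ybar={mean(y)} t0={t0:.3f} reject={reject}")
```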

13. A box of Corn Flakes that is labeled “NET WT. 14 OZ.” should have 14 oz or more of cereal
inside. Twenty of these boxes were randomly selected and the weight of the contents (in ounces)
determined.

Table 4: Weights of Cereal Boxes (in ounces)

14.52 14.47 14.80 14.60 14.45 14.25 14.15 14.12 14.36 14.39
14.50 14.29 14.28 14.60 13.85 14.18 14.39 14.45 14.69 14.38

(a) Draw a histogram of the weight of cereal per box.
(b) Find the sample statistics mean and standard deviation.
(c) What percent of the sample is below the 14.0 oz weight?

The plant manager is studying the filling process and needs to estimate the mean weight
of all boxes being filled.

(d) Determine whether an assumption of normality is reasonable. Explain.
(e) Find the 95% confidence interval for the mean weight.
(f) The filling process is believed to be running with a standard deviation of fill of no more
than 0.2 oz. Test this hypothesis at the 0.01 level.

14. A food manufacturer claims that eating its new cereal as part of a daily diet lowers total blood
cholesterol levels. Table 5 shows the total blood cholesterol levels (in milligrams per deciliter
of blood) of seven patients before eating the cereal and after one year of eating the cereal as
part of their diets. At α = 0.05, can you conclude that the new cereal lowers total blood
cholesterol levels?

Table 5: Cholesterol Levels of Patients

Patient                                   1    2    3    4    5    6    7
Total blood cholesterol level (before)  210  225  240  255  270  260  235
Total blood cholesterol level (after)   200  220  245  248  265  250  230

15. Ten individuals have participated in a diet-modification program to stimulate weight loss. Their
weight both before and after participation in the program is shown in Table 6.

Table 6: Subject Measurements Before and After

Subject  Before  After
1        195     187
2        213     195
3        247     221
4        201     190
5        187     175
6        210     197
7        215     199
8        246     221
9        294     278
10       310     285

(a) Is there evidence to support the claim that this particular diet-modification program is
effective in producing a mean weight reduction? Use α = 0.05.
(b) Is there evidence to support the claim that this particular diet-modification program will
result in a mean weight loss of at least 10 pounds? Use α = 0.05.
(c) Suppose that, if the diet-modification program results in a mean weight loss of at least 10
pounds, it is important to detect this with a probability of at least 0.90. Was the use of
10 subjects an adequate sample size? If not, how many subjects should have been used?
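
Exercises 14 and 15 are both paired comparisons, so one helper covers them. In this sketch the function name `paired_t` is illustrative, and the one-sided critical values 1.943 (6 degrees of freedom) and 1.833 (9 degrees of freedom) are hard-coded from a t table.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after, mu_d0=0.0):
    """t statistic for H0: mean(before - after) = mu_d0, with n - 1 df."""
    d = [b - a for b, a in zip(before, after)]
    return (mean(d) - mu_d0) / (stdev(d) / sqrt(len(d)))

# Exercise 14: cholesterol levels from Table 5
chol_before = [210, 225, 240, 255, 270, 260, 235]
chol_after = [200, 220, 245, 248, 265, 250, 230]
t_chol = paired_t(chol_before, chol_after)
reject_chol = t_chol > 1.943          # t_{0.05,6}, from a t table

# Exercise 15: weights from Table 6
w_before = [195, 213, 247, 201, 187, 210, 215, 246, 294, 310]
w_after = [187, 195, 221, 190, 175, 197, 199, 221, 278, 285]
t_loss_any = paired_t(w_before, w_after)           # (a) H1: mean loss > 0
t_loss_10 = paired_t(w_before, w_after, 10.0)      # (b) H1: mean loss > 10
reject_any = t_loss_any > 1.833       # t_{0.05,9}, from a t table
reject_10 = t_loss_10 > 1.833

print(f"cholesterol: t={t_chol:.3f} reject={reject_chol}")
print(f"weight (a): t={t_loss_any:.3f} reject={reject_any}")
print(f"weight (b): t={t_loss_10:.3f} reject={reject_10}")
```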

16. An article in Neurology (1998, Vol. 50, pp. 1246–1252) discussed that monozygotic twins
share numerous physical, psychological, and pathological traits. The investigators measured
an intelligence score of 10 pairs of twins, as the following table shows:

Table 7: Intelligence Scores of Pairs of Twins

Pair  Birth order: 1  Birth order: 2
1     6.08            5.73
2     6.22            5.80
3     7.99            8.42
4     7.44            6.84
5     6.48            6.43
6     7.99            8.76
7     6.32            6.32
8     7.60            7.62
9     6.03            6.59
10    7.52            7.67

(a) Is the assumption that the difference in score is normally distributed reasonable? Show
results to support your answer.
(b) Find a 95% confidence interval on the difference in mean score. Is there any evidence that
mean score depends on birth order?
(c) It is important to detect a mean difference of at least 0.90. Was the use of 10 pairs an
adequate sample size? If not, how many pairs should have been used?

17. Two different types of injection-molding machines are used to form plastic parts. A part is
considered defective if it has excessive shrinkage or is discolored. Two random samples, each
of size 300, are selected, and 15 defective parts are found in the sample from machine 1, and 8
defective parts are found in the sample from machine 2.

(a) Is it reasonable to conclude that both machines produce the same fraction of defective
parts, using α = 0.05? Find the P-value for this test.
(b) Construct a 95% confidence interval on the difference in the two fractions defective.
(c) Suppose that p1 = 0.05 and p2 = 0.01. With the sample sizes given here, what is the
power of the test for this two-sided alternative?
(d) Suppose that p1 = 0.05 and p2 = 0.01. Determine the sample size needed to detect this
difference with a probability of at least 0.9.
(e) Suppose that p1 = 0.05 and p2 = 0.02. With the sample sizes given here, what is the
power of the test for this two-sided alternative?
(f) Suppose that p1 = 0.05 and p2 = 0.02. Determine the sample size needed to detect this
difference with a probability of at least 0.9.
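
Parts (a) and (b) of Exercise 17 can be sketched with the usual large-sample formulas: a pooled standard error for the test and an unpooled one for the confidence interval. Using the normal approximation here is an assumption about the intended method.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()
x1, n1 = 15, 300       # defectives and sample size, machine 1
x2, n2 = 8, 300        # defectives and sample size, machine 2

p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat

# (a) H0: p1 = p2, two-sided, using the pooled proportion
pooled = (x1 + x2) / (n1 + n2)
se_pooled = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z0 = diff / se_pooled
p_value = 2 * (1 - Z.cdf(abs(z0)))

# (b) 95% CI on p1 - p2, using the unpooled standard error
se_unpooled = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
z_crit = Z.inv_cdf(0.975)
ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

print(f"z0={z0:.3f} P-value={p_value:.4f}")
print(f"95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```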

18. In a survey of 450 adults 18 to 29 years of age, 419 said they use the Internet. In a survey of
400 adults 30 to 49 years of age, 324 said they use the Internet. At significance level α, can you reject the claim
that the proportion of Internet users is the same for the two age groups?

19. Let Y1 , ..., Yn be independent identically distributed N (µ = 0, σ²) random variables with pdf

$$f(y) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{y^2}{2\sigma^2}\right)$$

where y is real and $\sigma^2 > 0$.

(a) Show that $Y_i^2/\sigma^2 \sim \chi^2(1)$ for each i. Deduce that $\sum_{i=1}^{n} Y_i^2/\sigma^2 \sim \chi^2(n)$.
[Recall that if $Z \sim N(0, 1)$, then $Z^2 \sim \chi^2(1)$.]
(b) What is the UMP (uniformly most powerful) level α test for $H_0 : \sigma^2 = 1$ vs. $H_a : \sigma^2 > 1$?
(c) If n = 20 and α = 0.05, then find β(3.8027) of the above UMP test if $\sigma^2 = 3.8027$.

20. Let X have a Pareto distribution with parameter θ > 0; that is, the pdf of X is

$$f(x;\theta) = \begin{cases} \dfrac{1}{\theta}\, x^{-\frac{1}{\theta}-1}, & x > 1, \\ 0, & \text{otherwise.} \end{cases}$$

Let X1 , X2 , . . . , Xn be a random sample from this distribution.

(a) Let $Y_n = \frac{2}{\theta}\sum_{i=1}^{n} \ln X_i$. Show that $Y_n$ has a chi-squared distribution with 2n degrees
of freedom (that is, $Y_n \sim \chi^2(2n)$).
(Recall that if $V \sim \chi^2(\nu)$, then the moment generating function (mgf) of V is
$G_V(t) = (1 - 2t)^{-\nu/2}$, $t < \frac{1}{2}$.)
(b) Using the Neyman-Pearson lemma, show that the best critical region for testing H0 : θ = θ0
against Ha : θ = θa , θa > θ0 > 0, at level of test α, is

$$RR = \left\{ (x_1, \ldots, x_n) : \sum_{i=1}^{n} \ln x_i \ge c \right\},$$

where c satisfies $P(Y_n \ge 2c/\theta_0) = \alpha$.

(c) Is the above critical region RR uniformly most powerful for testing H0 : θ = θ0 against
Ha : θ > θ0 at level of test α? Justify your answer.
(d) Suppose that n = 12, α = 0.10, H0 : θ = 3 and Ha : θ = 5. Determine the critical region RR.

21. Let X1 , X2 , . . . , Xn be a random sample from a population X with pdf

$$f(x;\theta) = \begin{cases} \dfrac{1}{2\theta\sqrt{x}}\, e^{-\sqrt{x}/\theta} & \text{if } x > 0, \\ 0 & \text{otherwise,} \end{cases}$$

where θ > 0 is an unknown parameter.

(a) Let $Y = \sqrt{X}$. Find the cdf of Y and then deduce the pdf of Y . Show that Y ∼ Exp(θ).
(b) Find the MLE $\hat{\theta}_n$ for θ. Is $\hat{\theta}_n$ efficient?
(c) Let $U = \dfrac{2n\hat{\theta}_n}{\theta}$. Find the mgf of U and deduce that $U \sim \chi^2(2n)$.
(d) Derive a 90% CI for θ when $\sum_{i=1}^{20} \sqrt{x_i} = 47.4$.

(e) Find the best critical region for testing H0 : θ = 1 versus Ha : θ = θa , where θa > 1 when
α = 0.01 and n = 15.
(f) Is the test in (e) a UMP test for testing H0 : θ = 1 vs Ha : θ > 1? Justify your answer.

22. A mathematics drop-in help centre keeps track of the number of students it helps on a given
day with the following results:

Monday  Tuesday  Wednesday  Thursday  Friday
4       3        6          1         4
7       6        5          3         2
8       7        8          3         1
4       3        0          5         1
7       4        3          3         1
-       6        0          2         3

Assuming that the underlying populations are normal with common standard deviation, can
one conclude that the average number of students helped in the centre on a given day differs
from any other at a level of significance of α = 0.01 ? Use the following partially completed
ANOVA table in your analysis.

Source      df  Sum of Squares  Mean Squares  F
Treatments      55.7586
Error           99.0
Total           154.7586

Given: $F_{0.01,4,24} = 4.218$
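
The ANOVA table in Exercise 22 can be completed from the raw counts. The sketch below assumes that the short last data row (6, 0, 2, 3) belongs to Tuesday through Friday; that alignment is not marked in the extracted table, but it is the one that reproduces the given sums of squares. The function name `one_way_anova` is illustrative.

```python
from statistics import mean

# Daily counts; the last row is ASSUMED to have no Monday entry.
groups = {
    "Monday":    [4, 7, 8, 4, 7],
    "Tuesday":   [3, 6, 7, 3, 4, 6],
    "Wednesday": [6, 5, 8, 0, 3, 0],
    "Thursday":  [1, 3, 3, 5, 3, 2],
    "Friday":    [4, 2, 1, 1, 1, 3],
}

def one_way_anova(samples):
    """Return (SS_treat, SS_error, df_treat, df_error, F) for a list of samples."""
    all_values = [v for sample in samples for v in sample]
    grand = mean(all_values)
    ss_treat = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
    ss_error = sum((v - mean(s)) ** 2 for s in samples for v in s)
    df_treat = len(samples) - 1
    df_error = len(all_values) - len(samples)
    f = (ss_treat / df_treat) / (ss_error / df_error)
    return ss_treat, ss_error, df_treat, df_error, f

ss_treat, ss_error, df_treat, df_error, f0 = one_way_anova(list(groups.values()))
reject = f0 > 4.218                  # given critical value F_{0.01,4,24}

print(f"SS_treat={ss_treat:.4f} SS_error={ss_error:.4f}")
print(f"df=({df_treat}, {df_error}) F={f0:.3f} reject={reject}")
```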

23. Consider the following computer output for an experiment.

Table 8: ANOVA Table

Source  DF  SS     MS  F  P-value
Factor  5   ?      ?   ?  ?
Error   ?   27.38  ?   ?  ?
Total   29  66.34

(a) How many replicates did the experimenter use?
(b) Fill in the missing information in the ANOVA table. Use bounds for the P-value.
(c) What conclusions can you draw about differences in the factor-level means?
(d) Compute an estimate for σ².
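
The missing entries of Table 8 follow from the additivity of degrees of freedom and sums of squares. A short sketch, assuming a balanced design (the same number of replicates at every factor level):

```python
total_df, factor_df = 29, 5
ss_total, ss_error = 66.34, 27.38

n_obs = total_df + 1                 # total number of observations
levels = factor_df + 1               # number of factor levels
replicates = n_obs // levels         # valid only for a balanced design

error_df = total_df - factor_df
ss_factor = ss_total - ss_error
ms_factor = ss_factor / factor_df
ms_error = ss_error / error_df       # also the estimate of sigma^2 in part (d)
f0 = ms_factor / ms_error

print(f"replicates={replicates} error df={error_df}")
print(f"SS_factor={ss_factor:.2f} MS_factor={ms_factor:.3f} "
      f"MS_error={ms_error:.4f} F={f0:.3f}")
```

The P-value bounds asked for in part (b) come from comparing F with tabulated percentage points of the F distribution with 5 and 24 degrees of freedom.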


24. In Design and Analysis of Experiments, 8th edition (John Wiley & Sons, 2012), D. C.
Montgomery described an experiment in which the tensile strength of a synthetic fiber was of interest
to the manufacturer. It is suspected that strength is related to the percentage of cotton in the
fiber. Five levels of cotton percentage were used, and five replicates were run in random order,
resulting in the following data:

Cotton Percentage  Observations
                   1   2   3   4   5
15                 7   7   15  11  9
20                 12  17  12  18  18
25                 14  18  18  19  19
30                 19  25  22  19  23
35                 7   10  11  15  11

Table 9: Cotton Percentage Observations
(a) Does cotton percentage affect breaking strength? Draw comparative box plots and perform
an analysis of variance. Use α = 0.05.
(b) Plot average tensile strength against cotton percentage and interpret the results.
(c) Analyze the residuals and comment on model adequacy.
(d) Use Tukey’s method to identify significant differences among the µi ’s.
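
Parts (a) and (d) of Exercise 24 can be sketched together: the F statistic from the one-way ANOVA, then Tukey's comparison of every pair of means. The studentized-range value 4.23 (α = 0.05, 5 means, 20 error degrees of freedom) is hard-coded from a table and should be checked against the table used in class.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

strength = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

n = 5                                            # replicates per level
a = len(strength)                                # number of levels
means = {pct: mean(obs) for pct, obs in strength.items()}
grand = mean(v for obs in strength.values() for v in obs)

ss_treat = n * sum((m - grand) ** 2 for m in means.values())
ss_error = sum((v - means[pct]) ** 2
               for pct, obs in strength.items() for v in obs)
ms_error = ss_error / (a * (n - 1))
f0 = (ss_treat / (a - 1)) / ms_error

# Tukey's method: two means differ if |difference| exceeds the HSD
q_crit = 4.23                                    # q_{0.05}(5, 20), from a table
hsd = q_crit * sqrt(ms_error / n)
significant = [(p, q) for p, q in combinations(strength, 2)
               if abs(means[p] - means[q]) > hsd]

print(f"F={f0:.2f} MS_error={ms_error:.2f} HSD={hsd:.2f}")
print("significantly different pairs:", significant)
```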

25. An experiment was run to determine whether four specific firing temperatures affect the density
of a certain type of brick. The experiment led to the following data.

Temperature (°F)  Density
100               21.8  21.9  21.7  21.6  21.7  21.5  21.8
125               21.7  21.4  21.5  21.5  -     -     -
150               21.9  21.8  21.8  21.6  21.5  -     -
175               21.9  21.7  21.8  21.7  21.6  21.8  -

Table 10: Temperature vs. Density

(a) Does the firing temperature affect the density of the bricks? Use α = 0.05.
(b) Find the P -value for the F -statistic computed in part (a).
(c) Analyze the residuals from the experiment.

Prepared by Mr. PHOK Ponna
