data analysis
Name___________________________________
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
The owners of a coffee shop conducted a taste test to determine whether its customers preferred a new coffee brand to
the current one sold by the shop. Customers who were willing to participate were given small samples of each of the
two brands in random order and were asked to select which one they preferred without knowing the brand. Of the
100 participating customers, 90% chose the new brand. Based on these results, the owners determined that a majority
of their customers preferred the new brand and therefore switched their coffee supplier.
2) Predicting the preference of all of the coffee shop customers based on the taste test results refers
to which aspect of statistics?
A) Investigation
B) Inference
C) Design
D) None of these
E) Description
                      Opinion
Party         Approve  Disapprove  No Opinion
Republican       42        20          14
Democrat         50        24          18
Independent      10        16           6
Find the rejection region to test the claim of independence. Use α = 0.05.
A) χ² > 11.14
B) χ² > 7.78
C) χ² > 9.49
D) χ² > 7.81
E) χ² > 16.92
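Check: the sketch below finds the critical value for the 3 × 3 table above, assuming the usual rule df = (rows − 1)(columns − 1) for a test of independence. It uses only the standard library, relying on the closed form of the chi-square upper tail for even degrees of freedom. The helper names `chi2_sf_even_df` and `chi2_critical` are illustrative and do not come from any package.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = x / 2.0
    # For df = 2k: P(X > x) = exp(-x/2) * sum over i < k of (x/2)^i / i!
    return math.exp(-half) * sum(half**i / math.factorial(i) for i in range(df // 2))

def chi2_critical(alpha, df, lo=0.0, hi=100.0):
    """Bisection search for the value whose upper-tail probability is alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid  # tail still too large, move right
        else:
            hi = mid  # tail too small, move left
    return (lo + hi) / 2.0

rows, cols = 3, 3  # three parties by three opinions
df = (rows - 1) * (cols - 1)
critical = chi2_critical(0.05, df)
print(df, round(critical, 2))
```

With 4 degrees of freedom and α = 0.05 this gives a critical value of about 9.49, so the rejection region is the upper tail beyond that value.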
4) A medical researcher is interested in determining if there is a relationship between adults over
50 who walk regularly and low, moderate, and high blood pressure. A random sample of 236
adults over 50 is selected and the results are given below.
Calculate the chi-square test statistic χ² to test the claim that walking and blood pressure level
are not related.
A) 16.183 B) 18.112 C) 2.778 D) 6.003 E) 3.473
H0: No linear relationship between the number of years and test score, β = 0.
Ha: There is a linear relationship between the number of years and test score, β ≠ 0.
Do the data provide sufficient evidence to conclude that the number of years applicants have
studied Spanish and their score on the test are independent? Assume α = 0.05.
A) Since the P-value = 0.00111 < 0.05, we reject H0 and conclude that the number of years
applicants have studied Spanish and their score on the test are independent.
B) Since the P-value = 0.00024 < 0.05, we reject H0 and conclude that the number of years
applicants have studied Spanish and their score on the test are independent.
C) Since the P-value = 0.00111 < 0.05, we reject H0 and conclude that there is an association
between the number of years applicants have studied Spanish and their score on the test.
D) Since the P-value = 0.00024 < 0.05, we reject H0 and conclude that there is an association
between the number of years applicants have studied Spanish and their score on the test.
E) No conclusions can be drawn without a chi-squared test of independence.
6) The index of exposure to radioactive waste, x, and the cancer mortality rates, y, (deaths per
100,000) were recorded for nine different Oregon counties. Use the regression analysis provided
below to perform the hypothesis test to determine if the index of exposure is useful as a
predictor of cancer mortality rate.
The regression equation is ŷ = 114.7156 + 9.231456x
R-sq = 85.8%
9 - 2 = 7 degrees of freedom
H0: No linear relationship between the index of exposure and cancer mortality rate, β = 0.
Ha: There is a linear relationship between the index of exposure and cancer mortality rate, β ≠ 0.
Fill in the missing table entries.
7) Fill in the missing entries in the following partially completed one-way ANOVA table.
Source   DF   SS    MS    F-statistic
Group         24
Error    29         4.8
Total    33
A)
Source DF SS MS F-statistic
Group 4 24 6.00 1.25
Error 29 139.2 4.8
Total 33 163.2
B)
Source DF SS MS F-statistic
Group 4 24 6.00 0.80
Error 29 139.2 4.8
Total 33 163.2
C)
Source DF SS MS F-statistic
Group 62 24 0.39 0.08
Error 29 139.2 4.8
Total 33 163.2
D)
Source DF SS MS F-statistic
Group 4 24 6.00 1.25
Error 29 139.2 4.8
Total 33 5.8
E) cannot be determined from the information given
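Check: the missing entries follow from the standard one-way ANOVA identities (degrees of freedom and sums of squares add up, MS = SS / DF, F = MS group / MS error). The sketch below applies them to the given entries, reading 24 as the Group SS and 4.8 as the Error MS, as the answer choices do. The variable names are illustrative.

```python
# Known entries from the partially completed table
ss_group = 24.0
df_error = 29
ms_error = 4.8
df_total = 33

df_group = df_total - df_error   # degrees of freedom add up
ms_group = ss_group / df_group   # MS = SS / DF
ss_error = ms_error * df_error   # SS = MS * DF
ss_total = ss_group + ss_error   # sums of squares add up
f_stat = ms_group / ms_error     # F = MS(group) / MS(error)

print(df_group, round(ms_group, 2), round(ss_error, 1),
      round(ss_total, 1), round(f_stat, 2))
```

This yields Group DF = 4, Group MS = 6.00, Error SS = 139.2, Total SS = 163.2, and F = 1.25.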
Given below are the analysis of variance results from a Minitab display. Assume that you want to use a 0.05
significance level in testing the null hypothesis that the different samples come from populations with the same
mean.
8)
Source DF SS MS F p
Factor 3 13.500 4.500 5.17 0.011
Error 16 13.925 0.870
Total 19 27.425
What can you conclude about the equality of the population means?
A) Do not reject the null hypothesis since the p-value is greater than the significance level.
We conclude that the factor means are equal.
B) Reject the null hypothesis since the p-value is less than the significance level. We
conclude that all of the factor means differ.
C) Reject the null hypothesis since the p-value is less than the significance level. We
conclude that at least two of the factor means differ.
D) Do not reject the null hypothesis since the p-value is less than the significance level.
There is not enough evidence to show that the factor means are unequal.
Use the Minitab display.
9) A manager records the production output of three employees who each work on three different
machines for three different days. The sample results are given below and the Minitab results
follow.
                        Employee
                  A            B            C
Machine I     23, 27, 29   30, 27, 25   18, 20, 22
Machine II    25, 26, 24   24, 29, 26   19, 16, 14
Machine III   28, 25, 26   25, 27, 23   15, 11, 17
SOURCE DF SS MS
MACHINE 2 34.67 17.33
EMPLOYEE 2 504.67 252.33
INTERACTION 4 26.67 6.67
ERROR 18 98.00 5.44
TOTAL 26 664.00
Assume that the number of items produced is not affected by an interaction between employee
and machine. Using a 0.05 significance level, test the claim that the machine has no effect on the
number of items produced. State the null hypothesis.
A) Ha: There is a machine effect.
B) H0: There is no machine effect.
C) Ha: There is no machine effect.
D) None of these.
E) H0: There is a machine effect.
10) A manager records the production output of three employees who each work on three different
machines for three different days. The sample results are given below and the Minitab results
follow.
                        Employee
                  A            B            C
Machine I     16, 18, 19   15, 17, 20   14, 18, 16
Machine II    20, 27, 29   25, 28, 27   29, 28, 26
Machine III   15, 18, 17   16, 16, 19   13, 17, 16
SOURCE DF SS MS
MACHINE 2 588.74 294.37
EMPLOYEE 2 2.07 1.04
INTERACTION 4 15.48 3.87
ERROR 18 98.67 5.48
TOTAL 26 704.96
Using a 0.05 significance level, test the claim that the interaction between employee and machine
has no effect on the number of items produced. Calculate the F-statistic for this test. Round
your answer to four decimal places.
A) 53.7172 B) 0.7062 C) 0.1569 D) 53.9069 E) 2.9300
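Check: in a two-way ANOVA the F-statistic for the interaction is the interaction mean square divided by the error mean square. The sketch below takes both values from the Minitab display above. The variable names are illustrative.

```python
ms_interaction = 3.87  # MS for INTERACTION from the display
ms_error = 5.48        # MS for ERROR from the display

f_interaction = ms_interaction / ms_error
print(round(f_interaction, 4))
```

The ratio rounds to 0.7062.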
Program I Program II
60 75 61 63 66 89 68 77
86 69 64 70 84 80 81 87
72 82 59 78 73 91 93
94 95
A) 143 B) 95.5 C) 90 D) 235 E) 8.61
12) Eleven female employees and eleven male employees are randomly selected from one company and their
weekly salaries are recorded. The salaries (in dollars) are shown below. Software reports a
small-sample one-sided P-value of 0.008. Interpret.
Female           Male
350 420 470      410 460 650
385 675 520      545 720 810
540 400 550      660 500 880
450 640          700 750
A) If there were no difference in weekly salaries among men and women, the probability of
obtaining a sample as extreme as that observed is 0.008. There is strong evidence that the
average male salary is higher.
B) If there were no difference in weekly salaries among men and women, the probability of
obtaining a sample as extreme as that observed is 0.008. There is strong evidence that the
average salaries differ.
C) The probability that a randomly selected male earns more than a randomly selected
female is 0.008. There is strong evidence that the average male salary is higher.
D) The probability of obtaining a sample where the average female salary is greater than or
equal to the average male salary is 0.008. There is strong evidence that the average male
salary is higher.
E) The probability that a randomly selected female earns the same as a randomly selected
male is 0.008. There is strong evidence that the average male salary is higher.
13) A medical researcher wishes to try three different techniques to lower blood pressure of patients
with high blood pressure. The subjects are randomly selected and assigned to one of the three
groups. Group 1 is given medication, Group 2 is given an exercise program, and Group 3 is
assigned a diet program. At the end of six weeks, the reduction in each subject's blood pressure
is recorded. The Kruskal-Wallis test was used to test the claim that there is no difference in the
distributions of the populations. The resulting H statistic is given below. Provide bounds for
the P-value and interpret. Assume α = 0.05.
H = 10.29
A) 0.010 < P-value < 0.025. Reject H0 and conclude that the population distributions for the
three groups are not all equal.
B) 0.010 < P-value < 0.025. Reject H0 and conclude that the population distributions for the
three groups are all equal.
C) 0.005 < P-value < 0.010. Reject H0 and conclude that the population distributions for the
three groups are not all equal.
D) 0.010 < P-value < 0.025. Reject H0 and conclude that the population distributions for the
three groups are all different.
E) 0.005 < P-value < 0.010. Reject H0 and conclude that the population distributions for the
three groups are all different.
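Check: the Kruskal-Wallis H statistic is usually compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of groups. This is a large-sample approximation, so the sketch below is only a rough guide to where the P-value falls. With 2 degrees of freedom the chi-square upper tail has the exact form exp(−x/2), so only the standard library is needed. The variable names are illustrative.

```python
import math

h_stat = 10.29
groups = 3
df = groups - 1  # chi-square approximation uses k - 1 degrees of freedom

# With 2 degrees of freedom, P(X > x) = exp(-x / 2) exactly.
p_value = math.exp(-h_stat / 2.0)
print(df, round(p_value, 4))
```

The approximate P-value is about 0.0058, which lies between 0.005 and 0.010 and is below α = 0.05.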
14) A researcher wishes to determine whether there is a difference in the average age of elementary
school, high school, and community college teachers. Teachers are randomly selected. Their
ages are recorded below. Find the mean rank for high school teachers that is used in the
Kruskal-Wallis test.
Answer Key
Testname: DATA ANALYSIS
1) E
2) B
3) C
4) E
5) D
6) E
7) A
8) C
9) B
10) B
11) C
12) A
13) C
14) B