2023_Fall_DAP_COMM215
Group Members
Nouran Ismail, 402443244
Tiffany Chee, 40249188
Kristen Podwalski, 40229066
3. Calculate the variance for each indicator and comment on the dispersion and
outliers of each indicator.
4. Construct a histogram for each indicator using the number of classes and class
length explained in Chapter 2.
5. Determine for each indicator whether it is skewed to the right, skewed to the left, or
not skewed.
Indicator 1: not skewed
Indicator 2: skewed to the right
Indicator 3: not skewed
Indicator 4: not skewed
Indicator 5: skewed to the left
Indicator 6: skewed to the right
Indicator 7: skewed to the right
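The classifications above can be cross-checked numerically with a moment-based skewness coefficient. The following is a minimal sketch only: the sample values are hypothetical (the indicator data is not reproduced here), and the 0.5 cutoff for calling a distribution "not skewed" is an assumed rule of thumb, not a value taken from the course material.

```python
from statistics import mean, pstdev


def moment_skewness(values):
    """Population moment coefficient of skewness: m3 / s^3."""
    m = mean(values)
    s = pstdev(values)
    n = len(values)
    return sum((x - m) ** 3 for x in values) / (n * s ** 3)


def classify_skew(values, tol=0.5):
    """Label a sample using an assumed tolerance around zero skewness."""
    g = moment_skewness(values)
    if g > tol:
        return "skewed to the right"
    if g < -tol:
        return "skewed to the left"
    return "not skewed"


# Hypothetical data: one large value creates a long right tail.
right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 12]
symmetric = [1, 2, 3, 4, 5]

print(classify_skew(right_tailed))
print(classify_skew(symmetric))
```

A histogram remains the primary tool for this question; the coefficient only supports the visual judgment.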
6. For the Blue Team, calculate the expected probability of someone being in Category
A.
7 in category A
40 in all categories
7/40=0.175
7. For the Blue Team, given that someone is female, what is the probability that she
belongs to Category C?
The probability that someone is in category C given they are female is equal to the
probability they are in category C and they are female divided by the probability they are
female.
7 females in category C
18 females
7/18 = 0.3889
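The conditional probability rule stated above can be checked with exact fractions. This is a small sketch using the counts from this answer (7 females in Category C, 18 females, 40 Blue Team members); the function name is an illustrative choice.

```python
from fractions import Fraction


def conditional_probability(joint_count, condition_count):
    """P(A | B) computed from counts: n(A and B) / n(B)."""
    if condition_count <= 0:
        raise ValueError("condition_count must be positive")
    return Fraction(joint_count, condition_count)


# Counts taken from the answer above.
team_size = 40
females = 18
females_in_c = 7

# Direct count ratio.
p_c_given_female = conditional_probability(females_in_c, females)

# Same result through the definition P(C and F) / P(F).
p_joint = Fraction(females_in_c, team_size)
p_female = Fraction(females, team_size)
p_from_definition = p_joint / p_female

print(float(p_c_given_female))
```

Both routes give the same fraction because the team size cancels out of the ratio.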
8. For the Red Team, calculate the expected probability of someone being in Category
D.
10 in category D
40 in all categories
10/40 = 0.25
9. For the Red Team, given that someone is male, what is the probability that he
belongs to Category B?
4 males in category B
22 males
4/22 = 0.182
10. Construct a pie chart to illustrate the proportion of each category within the Red
Team.
11. Construct a pie chart to illustrate the proportion of each category within the Blue
Team.
12. Joylandia advocates that the number of people in each of the four categories is
equal. Test this claim based on the Red Team and Blue Team samples at a 1% level
of significance.
Observed:
        Red Team   Blue Team   Total
A       13         7           20
B       8          11          19
C       9          11          20
D       10         11          21
Total   40         40          80
Expected:
        Red Team   Blue Team   Total
A       10         10          20
B       9.5        9.5         19
C       10         10          20
D       10.5       10.5        21
Total   40         40          80
H0: The distribution of people across the four categories is the same for the Red Team and the Blue Team.
HA: The distribution of people across the four categories differs between the Red Team and the Blue Team.
Chi-Square Test:
χ² = (13-10)²/10 + (8-9.5)²/9.5 + (9-10)²/10 + (10-10.5)²/10.5 + (7-10)²/10 + (11-9.5)²/9.5 + (11-10)²/10 + (11-10.5)²/10.5 = 2.521303256
With (4-1)(2-1) = 3 degrees of freedom, the critical value at α = 0.01 is 11.345. Since 2.5213 < 11.345, we fail to reject H0.
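The arithmetic of the chi-square statistic can be reproduced from the observed counts alone. The sketch below rebuilds the expected counts from the row and column totals and then computes a p-value using the closed-form survival function that holds for exactly 3 degrees of freedom. The variable and function names are illustrative, and the closed form would not apply to a table of a different size.

```python
import math

# Observed counts from the table above: category -> (Red Team, Blue Team).
observed = {"A": (13, 7), "B": (8, 11), "C": (9, 11), "D": (10, 11)}

n_cols = 2
col_totals = [sum(row[j] for row in observed.values()) for j in range(n_cols)]
grand_total = sum(col_totals)

chi_square = 0.0
for row in observed.values():
    row_total = sum(row)
    for j, obs in enumerate(row):
        # Expected count = row total * column total / grand total.
        expected = row_total * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

degrees_of_freedom = (len(observed) - 1) * (n_cols - 1)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_df3(chi_square)
alpha = 0.01

print(round(chi_square, 6), degrees_of_freedom)
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Comparing the p-value with α is equivalent to comparing the statistic with the tabulated critical value for the same degrees of freedom.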
13. Using the original data, test the overall significance of this multiple regression
model. The independent variables are the seven indicators and the dependent
variable is the performance index. Use a significance level of α=.01.
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.99975207
R Square            0.9995042
Adjusted R Square   0.9995007
Standard Error      0.01668781
Observations        1000
ANOVA
             df    SS           MS           F            Significance F
Regression   7     556.912787   79.5589695   285687.043   0
Residual     992   0.27625508   0.00027848
Significance F is 0, which is below α = 0.01; therefore, the multiple regression model is significant overall.
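The F statistic in the ANOVA table for question 13 follows directly from the sums of squares and degrees of freedom. As a minimal check, assuming only the values printed in that output (which are rounded, so the result agrees only approximately):

```python
# Values copied from the ANOVA table for the original data (question 13).
ss_regression = 556.912787
df_regression = 7
ss_residual = 0.27625508
df_residual = 992

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# The overall F statistic is the ratio of the two mean squares.
f_statistic = ms_regression / ms_residual

print(ms_regression, ms_residual, f_statistic)
```

A ratio this large means the explained variation per degree of freedom dwarfs the unexplained variation, which is why the reported Significance F rounds to 0.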
14. Joylandia calculates the performance index using the indicators as building blocks.
However, during the games, the performance index was computed based on other
criteria. Test whether the multiple regression model is still significant for both the
Red and Blue teams at a 1% level of significance. Note: The independent variables
remain the seven indicators, but the dependent variable is now the performance
index during the games.
BLUE:
SUMMARY OUTPUT
Regression Statistics
Multiple R          1
R Square            1
Adjusted R Square   1
Standard Error      2.153E-16
Observations        40
ANOVA
             df   SS           MS           F           Significance F
Regression   1    35.8129499   35.8129499   7.726E+32   0
Residual     38   1.7615E-30   4.6354E-32
Total        39   35.8129499
Significance F is 0, which is below 0.01; therefore, the regression model is significant for predicting the
performance index during the games for the Blue Team at a 1% significance level.
RED:
SUMMARY OUTPUT
Regression Statistics
Multiple R          0.40478252
R Square            0.16384889
Adjusted R Square   -0.0190592
Standard Error      0.91864054
Observations        40
ANOVA
             df   SS           MS           F            Significance F
Regression   7    5.29175737   0.75596534   0.89579919   0.52151571
Residual     32   27.0048143   0.84390045
Total        39   32.2965717
The p-value (Significance F = 0.5215) is above 0.01; therefore, the regression model is not significant for predicting the
performance index during the games for the Red Team at a 1% significance level.
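The weak fit for the Red Team also shows in the R Square and Adjusted R Square values, which can be recomputed from the ANOVA sums of squares. The sketch below assumes the figures printed in the Red Team output (40 observations, 7 indicators) and uses the standard adjusted R² formula.

```python
# Values copied from the Red Team ANOVA table.
ss_regression = 5.29175737
ss_total = 32.2965717
n_observations = 40
n_predictors = 7

# R Square: share of total variation explained by the model.
r_square = ss_regression / ss_total

# Adjusted R Square penalizes the number of predictors.
adjusted_r_square = 1 - (1 - r_square) * (n_observations - 1) / (
    n_observations - n_predictors - 1
)

print(r_square, adjusted_r_square)
```

A negative adjusted value indicates that, after accounting for seven predictors, the model explains no more variation than would be expected by chance, which is consistent with the non-significant F test.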
15. Reflect on the limitations of this project. If you were to expand this project, what
improvements would you make; what other questions would you ask; would you request
more data; and what other comments would you provide? Justify your answers.
This project covers a good variety of topics that are helpful preparation for the final exam. In order to extract
valuable and applicable insights, expanding the project would require more thorough data, an
improved model, and a better understanding of the context. Enhanced project outcomes could
be achieved through professional collaboration, ongoing iteration, and a comprehensive approach to
data collection and analysis.