Assignment 3 P Value

Uploaded by

jana.mansour33
Copyright © All Rights Reserved

Assignment 3 – p value

Please look at the video: https://www.youtube.com/watch?v=WXPBoFDqNVk

When we deal with a large amount of data and want to check a claim about all of it, it is often not
possible to look at every item, so a sample is taken. The p value helps us decide whether the
differences between the sample and the claim can be explained by chance alone.

The null hypothesis: there is no statistically significant difference between the expected values (the
claim) and the observed values (my observations); the recorded differences are just fluctuations
due to random chance.

The following steps are followed to get the p value: the expected and observed values are stated, the chi
square number is calculated as the sum of (observed - expected)^2 / expected over all components, and
the p value is read from the table in the row for the correct number of degrees of freedom (number of
components - 1). If the p value is more than 0.05, the null hypothesis is kept. If the p value is less
than 0.05, the null hypothesis is rejected.
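
The steps above can be sketched in Python. This is only a minimal illustration, not part of the assignment: the function name `chi_square` and the coin-toss numbers are assumptions made up for the example.

```python
def chi_square(observed, expected):
    """Return the chi-square statistic: the sum of (O - E)^2 / E over all components."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example (not from the assignment): a coin tossed 100 times.
observed = [58, 42]   # heads and tails actually counted
expected = [50, 50]   # what a fair coin predicts

chi2 = chi_square(observed, expected)
df = len(observed) - 1  # degrees of freedom = number of components - 1

# 3.84 is the table entry for 1 degree of freedom in the p = 0.05 column.
decision = "reject" if chi2 > 3.84 else "keep"
print(f"chi^2 = {chi2:.2f}, df = {df}: {decision} the null hypothesis")
```

A chi-square value above the 0.05 column entry corresponds to a p value below 0.05, which is why the comparison can be made directly against that one table entry.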

Chi square table (rows: degrees of freedom; columns: p value):

df    0.99    0.95    0.90    0.75    0.5     0.25    0.1     0.05    0.025   0.01    0.005   0.001
1     0       0.004   0.016   0.102   0.455   1.32    2.71    3.84    5.024   6.63    7.88    10.83
2     0.02    0.103   0.211   0.575   1.386   2.77    4.61    5.99    7.378   9.21    10.59   13.81
3     0.115   0.352   0.584   1.212   2.366   4.11    6.25    7.81    9.348   11.34   12.83   16.26
4     0.297   0.711   1.064   1.923   3.357   5.39    7.78    9.49    11.143  13.28   14.86   18.46
5     0.554   1.145   1.610   2.675   4.351   6.63    9.24    11.07   12.832  15.09   16.75   20.51
6     0.872   1.635   2.204   3.455   5.348   7.84    10.64   12.59   14.449  16.81   18.55   22.45
7     1.239   2.167   2.833   4.255   6.346   9.04    12.02   14.07   16.013  18.48   20.28   24.32
8     1.647   2.733   3.490   5.071   7.344   10.22   13.36   15.51   17.534  20.09   21.95   26.12
9     2.088   3.325   4.168   5.899   8.343   11.39   14.68   16.92   19.022  21.67   23.59   27.87
10    2.558   3.940   4.865   6.737   9.342   12.55   15.99   18.31   20.483  23.21   25.19   29.58
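
Reading a p-value range off the table can be sketched as a small lookup. This is an illustrative sketch only: the names `P_VALUES`, `TABLE` and `p_value_range` are assumptions, and only the rows for 1, 2, 3, 5 and 6 degrees of freedom are copied from the table to keep it short.

```python
# Column headings of the table (p values), from left to right.
P_VALUES = [0.99, 0.95, 0.90, 0.75, 0.5, 0.25, 0.1, 0.05, 0.025, 0.01, 0.005, 0.001]

# Chi-square entries copied from the table, keyed by degrees of freedom.
TABLE = {
    1: [0, 0.004, 0.016, 0.102, 0.455, 1.32, 2.71, 3.84, 5.024, 6.63, 7.88, 10.83],
    2: [0.02, 0.103, 0.211, 0.575, 1.386, 2.77, 4.61, 5.99, 7.378, 9.21, 10.59, 13.81],
    3: [0.115, 0.352, 0.584, 1.212, 2.366, 4.11, 6.25, 7.81, 9.348, 11.34, 12.83, 16.26],
    5: [0.554, 1.145, 1.610, 2.675, 4.351, 6.63, 9.24, 11.07, 12.832, 15.09, 16.75, 20.51],
    6: [0.872, 1.635, 2.204, 3.455, 5.348, 7.84, 10.64, 12.59, 14.449, 16.81, 18.55, 22.45],
}


def p_value_range(chi2, df):
    """Return (low, high) with low < p < high; None marks an end beyond the table."""
    low, high = None, None
    for p, entry in zip(P_VALUES, TABLE[df]):
        if chi2 > entry:
            high = p      # chi2 is past this entry, so p is below this column's value
        else:
            low = p       # first entry not exceeded: p is above this column's value
            break
    return low, high


print(p_value_range(5.967, 6))   # a value that falls between two columns
print(p_value_range(104.17, 1))  # a value beyond the last column
```

A result such as `(None, 0.001)` means the chi-square value is larger than every entry in the row, so all the table can say is that p is below 0.001.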

1) I am thinking of buying a restaurant, so I go and ask the current owner: “What is the distribution
of the number of customers you get each day?” And he says: “Oh, I’ve already figured that out”
and he gives me this distribution: 5% on Mondays, 10% on Tuesdays, 10% on
Wednesdays, 15% on Thursdays, 20% on Fridays, 20% on Saturdays, and 20% on Sundays. I was
a little bit suspicious, so I decided to see how well this distribution fits observed data. I
recorded the number of customers as they came in during the week and this is what I
got: 8% on Mondays, 11% on Tuesdays, 15% on Wednesdays, 14% on Thursdays, 15% on Fridays,
18% on Saturdays and 19% on Sundays. The following questions / steps can help you decide
whether to reject or accept the owner’s distribution. What are my expected and observed
values? State the null hypothesis. How many components are there and what are they? How
many degrees of freedom? Calculate the value of chi square. What is the range of the p-value?
Do you keep or reject the null hypothesis? What does this mean?
My expected values are Mondays 5%, Tuesdays 10%, Wednesdays 10%, Thursdays
15%, Fridays 20%, Saturdays 20%, and Sundays 20%. My observed values are Mondays
8%, Tuesdays 11%, Wednesdays 15%, Thursdays 14%, Fridays 15%, Saturdays 18%,
and Sundays 19%.
Null hypothesis: there is no statistically significant difference between the observed and
the expected values; the differences observed are due to chance.
Components: 7, the days of the week.
Degrees of freedom = n-1 = 7-1 = 6
chi^2 = ((8-5)^2/5) + ((11-10)^2/10) + ((15-10)^2/10) + ((14-15)^2/15) + ((15-20)^2/20) +
((18-20)^2/20) + ((19-20)^2/20)
chi^2 = 1.8 + 0.1 + 2.5 + 0.067 + 1.25 + 0.2 + 0.05
chi^2 = 5.967
Range of the p-value: between 0.25 and 0.5 (p > 0.05).
Keep the null hypothesis. The recorded differences can be explained by mere chance and are
not statistically significant, so my data do not contradict the owner’s distribution.
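
The restaurant calculation can be checked with a short sketch that prints the contribution of each day. It uses the percentages exactly as the answer does; the variable names are illustrative.

```python
days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
expected = [5, 10, 10, 15, 20, 20, 20]   # the owner's claim, in percent
observed = [8, 11, 15, 14, 15, 18, 19]   # my own record, in percent

# One (O - E)^2 / E term per day of the week.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
for day, term in zip(days, contributions):
    print(f"{day}: {term:.3f}")

chi2 = sum(contributions)
df = len(days) - 1
print(f"chi^2 = {chi2:.3f} with {df} degrees of freedom")

# Table row for 6 degrees of freedom: 5.348 at p = 0.5 and 7.84 at p = 0.25.
between_columns = 5.348 < chi2 < 7.84
```

Printing the individual terms shows that Wednesday and Monday account for most of the total, while the other days contribute very little.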
2) For a certain celebration, balloons of 6 colors have been used: red, yellow, blue, green, orange
and violet. My friend who organized the event (and bought thousands of balloons) claims that
the colors were evenly distributed. I look up at the sky and see 36 balloons (a small sample): 2
red, 8 yellow, 6 blue, 1 green, 10 orange and 9 violet. Was my friend right? What are your
EXPECTED VALUES (for 36 balloons) according to her claim? What are your OBSERVED VALUES?
What is the null hypothesis? What are your components and how many are they? How many
degrees of freedom are there? Calculate the chi square value. Get the corresponding p-value
from the table. Would you reject the null hypothesis or not? Reflect on your results.
My expected values are 36/6 = 6 of each color. Thus, 6 red, 6 yellow, 6 blue, 6 green, 6
orange, 6 violet. My observed values are 2 red, 8 yellow, 6 blue, 1 green, 10 orange and 9
violet.
Null hypothesis: the colors of the balloons are equally distributed.
Amount of red = amount of yellow = amount of blue = amount of green = amount of
orange = amount of violet.
Components are the colors of the balloons; there are six of them.
Degrees of freedom = n-1 = 6-1 = 5
Chi^2 = ((2-6)^2/6) + ((8-6)^2/6) + ((6-6)^2/6) + ((1-6)^2/6) + ((10-6)^2/6) + ((9-6)^2/6)
Chi^2 = (16 + 4 + 0 + 25 + 16 + 9)/6 = 70/6
Chi^2 = 11.67
The range of the p-value is 0.025 to 0.05 (p < 0.05).
I would reject the null hypothesis, as the recorded differences are statistically significant
and not due to mere luck.
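
Because every term here is a fraction with denominator 6, the sum can be computed exactly. The sketch below uses Python's `fractions` module to avoid rounding each term before adding; the variable names are illustrative.

```python
from fractions import Fraction

observed = [2, 8, 6, 1, 10, 9]  # red, yellow, blue, green, orange, violet

# An even split of the 36 balloons over 6 colors gives 6 per color.
per_color = Fraction(sum(observed), len(observed))
expected = [per_color] * len(observed)

# Exact sum of (O - E)^2 / E, kept as a fraction.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi2, "=", round(float(chi2), 2))

# Table row for 5 degrees of freedom: 11.07 at p = 0.05 and 12.832 at p = 0.025.
between_columns = 11.07 < chi2 < 12.832
```

Keeping the exact value 35/3 until the end matters little for the decision in this problem, but it removes the small drift that appears when each term is cut to two decimals first.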
3) I need to check the claim that at AUC 60% of the students are female and 40% are male, so I count
the number of females and males in an electronic engineering class and find out that there are
10% females and 90% males. What are your EXPECTED VALUES? What are your OBSERVED
VALUES? (Do not convert). What is the null hypothesis? What are your components and how
many are they? How many degrees of freedom are there? Calculate the chi square value. Get
the corresponding p-value from the table. Would you reject or keep the null hypothesis?
Reflect on your results.
Expected: females 60%, males 40%
Observed: females 10%, males 90%
The null hypothesis is that there is no difference between the expected and the observed
values, and that the recorded difference is due to mere luck.
Components: the female and the male electronic engineering students. There are two components.
Degrees of freedom = n-1 = 2-1 = 1
Chi^2 = ((10-60)^2/60) + ((90-40)^2/40)
Chi^2 = 41.67 + 62.5
Chi^2 = 104.17
p < 0.001 (so p < 0.05)
Reject the null hypothesis. The recorded differences are not due to mere coincidence;
they are statistically significant.
4) Quality control: M&M candies are produced evenly for each color (red, orange, yellow, green,
blue, brown), the manufacturer claims. As I can’t buy and eat all their candies, all I can do is take
a sample: out of 18 candies, I find 9 yellow, 1 red, 1 brown, 3 blue, 1 orange and 3 green. To find
out if the manufacturer is right, let's answer the questions: What are your EXPECTED VALUES
(for the 18 candies)? What are your OBSERVED VALUES? What is the null hypothesis? What are
your components and how many are they? How many degrees of freedom are there? Calculate
the chi square value. Get the corresponding p-value from the table. Would you reject or keep
the null hypothesis? Reflect on your results.
Expected: 3 yellow, 3 red, 3 brown, 3 blue, 3 orange, 3 green
Observed: 9 yellow, 1 red, 1 brown, 3 blue, 1 orange, 3 green
Null hypothesis: M&M candies are produced evenly for each color.
Components are the colors of the M&M candies: 6 components, the 6 colors.
Degrees of freedom = n-1 = 6-1 = 5
Chi^2 = ((9-3)^2/3) + ((1-3)^2/3) + ((1-3)^2/3) + ((3-3)^2/3) + ((1-3)^2/3) + ((3-3)^2/3)
Chi^2 = 12 + 1.33 + 1.33 + 0 + 1.33 + 0
Chi^2 = 16
p-value between 0.005 and 0.01, p-value = 0.0068 (p-value < 0.05)
Reject the null hypothesis. The differences between the expected and the observed values
are statistically significant and are not due to mere luck. There is a meaningful,
real difference.
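
The exact p-value quoted above cannot be read from the table, which only gives a range. One way to compute it with the standard library alone is sketched below; the function name `chi_square_p_value` is an assumption, and the recurrence it relies on holds for whole-number degrees of freedom.

```python
import math


def chi_square_p_value(chi2, df):
    """Upper-tail probability of a chi-square value for an integer df >= 1."""
    if df < 1 or chi2 < 0:
        raise ValueError("df must be >= 1 and chi2 must be >= 0")
    half = chi2 / 2
    if df % 2 == 1:
        q, k = math.erfc(math.sqrt(half)), 1   # closed form for df = 1
    else:
        q, k = math.exp(-half), 2              # closed form for df = 2
    # Recurrence: Q(x | k + 2) = Q(x | k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        q += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return q


# M&M problem: chi^2 = 16 with 5 degrees of freedom.
p = chi_square_p_value(16, 5)
print(f"p = {p:.4f}")
```

The result lies between the 0.005 and 0.01 columns of the row for 5 degrees of freedom, in agreement with the range read from the table.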
5) Many years ago, anthrax, a deadly disease, decimated the sheep population in France. When a
vaccine was discovered, a public experiment was performed in order to validate it. Of 20
sheep, 10 got the vaccination (the experimental group: vaccinated) and 10 did not
get it (the control group: unvaccinated). (The independent variable: the vaccine.) The sheep
were randomly selected and kept in the same conditions. Then they were all given the same
deadly dose of anthrax (constant). The number of dead sheep (the dependent variable) was
counted in both groups. The expected values in case the vaccine is not effective are 10 dead
sheep in both groups. (TOTAL FAILURE - Vaccine skeptics believed that this technique doesn’t
work at all and that vaccinating animals would make no difference to whether they live or die. They
thought that in both groups (experimental and control) all 10 sheep would die, since they were
given a deadly dose of anthrax, and that there would be no difference between the groups. They adopted the
idea that there is no meaningful difference between the two groups, that the vaccine didn’t have
any effect.) However, after performing the experiment, the observed values are 10 dead sheep
in the unvaccinated group and 5 dead sheep in the vaccinated group. Would the government
and pharmaceutical industries confidently spend a very large amount of money on a technique
that can save only 5 animals out of 10? To help them, answer the questions below: What are
your EXPECTED VALUES (for the dead sheep in the 2 groups)? What are your OBSERVED VALUES
for the dead sheep in the 2 groups? What is the null hypothesis? What are your components
and how many are they? How many degrees of freedom are there? Calculate the chi square
value, get the corresponding p-value from the table and decide if you keep or reject the null
hypothesis.
O (observed): 10 dead unvaccinated and 5 dead vaccinated
E (expected): 10 dead unvaccinated and 10 dead vaccinated
Components are the vaccinated and the unvaccinated sheep. 2 components.
Degrees of freedom = n-1 = 2-1 = 1
Chi^2 = sum of (o-e)^2/e = (10-10)^2/10 + (5-10)^2/10 = 0 + 25/10 = 2.5
p-value between 0.1 and 0.25 (p-value > 0.05)
Keep the null hypothesis. The recorded differences can be due to mere coincidence and
are not statistically significant.
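
For 1 degree of freedom the p-value has a simple closed form, so the sheep result can be checked without the table. This is a minimal sketch with illustrative variable names.

```python
import math

observed = [10, 5]    # dead sheep: unvaccinated, vaccinated
expected = [10, 10]   # the skeptics' prediction: every sheep dies

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With 1 degree of freedom, the upper-tail p-value is erfc(sqrt(chi2 / 2)).
p = math.erfc(math.sqrt(chi2 / 2))
print(f"chi^2 = {chi2}, p = {p:.3f}")

# Table row for 1 degree of freedom: 1.32 at p = 0.25 and 2.71 at p = 0.1.
in_table_range = 1.32 < chi2 < 2.71
```

The computed p-value falls inside the range read from the table and stays above 0.05, so with only 10 sheep per group the test does not reject the null hypothesis.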
6) The Election Day has finally arrived, and 3 candidates A, B and C are all hoping to become the
president of the association. A small survey 2 weeks earlier showed that A was the favorite at 60%,
followed by B at 30% and C at 10%. On the election evening, a sample of the votes yielded 55%
for A, 31% for B and 14% for C. (Do not convert). To find out if the sample of votes is
representative for the entire population and if A indeed won or not, the following questions /
steps can help you decide: What are your EXPECTED VALUES? What are your OBSERVED
VALUES? What is the null hypothesis? What are your components and how many are they? How
many degrees of freedom are there? Calculate the chi square value. Get the corresponding p-
value from the table. Would you reject or keep the null hypothesis? Reflect on your results.
Expected values = 60%, 30%, and 10%.
Observed values = 55%, 31%, and 14%.
Components are the three candidates, A, B and C. Three components.
Degrees of freedom = n-1 = 3-1 = 2.
Chi^2 = (55-60)^2/60 + (31-30)^2/30 + (14-10)^2/10 = 5/12 + 1/30 + 8/5 = 41/20 = 2.05.
P-value between 0.25 and 0.5 (p-value > 0.05)
Keep the null hypothesis. The recorded differences can be due to mere coincidence and
are not statistically significant.
7) The widespread claim that asbestos causes lung cancer was put to the test decades ago by
examining the lung health of workers in a factory that produces asbestos and of other people at a
construction site working with pipes covered with asbestos.
If asbestos were that dangerous, the incidence of lung cancer among those people
should have been much higher than in the rest of the population; however, the study indicated
no difference.
But in the following decades too many people in the industry died, with an incidence of lung
cancer much higher than in the average population. Where is the contradiction? Scientists
recently looked into the cohort of 100 people studied: 80 of them had just been
employed, 10 had worked for 10 years, 8 had worked for 20 years and 2 for 30 years. But
there had been people working in the factory for over 30 years! A fair random distribution
should have been 25 from each category.
What are your EXPECTED VALUES? What are your OBSERVED VALUES? What is the null
hypothesis? What are your components and how many are they? How many degrees of
freedom are there? Calculate the chi square value. Get the corresponding p-value from the
table. Would you reject or keep the null hypothesis? Reflect on your results in light of the
reliability of the sample used.
Expected values = 25, 25, 25, and 25.
Observed values = 80, 10, 8, and 2.
Components are the categories of time worked: just employed, 10 years, 20 years and
30 years. 4 components.
Degrees of freedom = n-1 = 4-1 = 3.
Chi^2 = (80-25)^2/25 + (10-25)^2/25 + (8-25)^2/25 + (2-25)^2/25 = 4068/25 = 162.72.
p-value < 0.001 (so p-value < 0.05)
Null hypothesis is rejected.
The recorded differences are not due to mere coincidence; they are statistically
significant. The cohort was dominated by people who had just been employed, so the sample
was not a fair random one.
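
To see which category drives the asbestos result, the individual terms of the sum can be listed. This is a sketch with illustrative names; the category labels are shorthand for the groups described in the question.

```python
categories = ["just employed", "10 years", "20 years", "30 years"]
observed = [80, 10, 8, 2]
expected = [25, 25, 25, 25]   # a fair random cohort of 100 people

# One (O - E)^2 / E term per category.
contributions = {
    name: (o - e) ** 2 / e
    for name, o, e in zip(categories, observed, expected)
}
chi2 = sum(contributions.values())
largest = max(contributions, key=contributions.get)

for name, term in contributions.items():
    print(f"{name}: {term:.2f}")
print(f"chi^2 = {chi2:.2f}, largest term: {largest}")
```

The newly employed workers alone contribute far more than the 16.26 listed for 3 degrees of freedom at p = 0.001, which points to the over-representation of new employees as the main problem with the cohort.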
8) A factory manager keeps bragging about the precision of various processes in the factory. For
instance, a container full of nails (900 nails) had been divided equally in 3 parts, he says. The
quality control counted the nails and found 297, 305 and 298. In order to check the reliability of
the manager's claim: What are your EXPECTED VALUES? What are your OBSERVED VALUES?
What is the null hypothesis? What are your components and how many are they? How many
degrees of freedom are there? Calculate the chi square value. Get the corresponding p-value
from the table. Would you reject or keep the null hypothesis? Reflect on your results.
Expected values = 300, 300 and 300.
Observed values = 297, 305 and 298.
Components are the three parts into which the nails were divided. 3 components.
Degrees of freedom = n-1 = 3-1 = 2.
Chi^2 = (297-300)^2/300 + (305-300)^2/300 + (298-300)^2/300 = 38/300 = 19/150 = 0.1267.
P-value between 0.90 and 0.95 (p-value > 0.05)
Keep the null hypothesis.
The recorded differences can be due to mere coincidence and are not statistically
significant.
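
All eight problems follow the same recipe, so they can be run together as a cross-check. The sketch below is illustrative: the names `PROBLEMS`, `CRITICAL_05` and `decide` are assumptions, the data are taken from the questions, and the critical values come from the 0.05 column of the table.

```python
# (observed, expected) for each problem, in the units used in the answers.
PROBLEMS = {
    "1 restaurant": ([8, 11, 15, 14, 15, 18, 19], [5, 10, 10, 15, 20, 20, 20]),
    "2 balloons": ([2, 8, 6, 1, 10, 9], [6] * 6),
    "3 students": ([10, 90], [60, 40]),
    "4 candies": ([9, 1, 1, 3, 1, 3], [3] * 6),
    "5 sheep": ([10, 5], [10, 10]),
    "6 election": ([55, 31, 14], [60, 30, 10]),
    "7 asbestos": ([80, 10, 8, 2], [25] * 4),
    "8 nails": ([297, 305, 298], [300] * 3),
}

# Entries of the p = 0.05 column for the degrees of freedom that occur here.
CRITICAL_05 = {1: 3.84, 2: 5.99, 3: 7.81, 5: 11.07, 6: 12.59}


def decide(observed, expected):
    """Return (chi-square, degrees of freedom, 'keep' or 'reject')."""
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    verdict = "reject" if chi2 > CRITICAL_05[df] else "keep"
    return chi2, df, verdict


results = {name: decide(obs, exp) for name, (obs, exp) in PROBLEMS.items()}
for name, (chi2, df, verdict) in results.items():
    print(f"{name}: chi^2 = {chi2:.3f}, df = {df}, {verdict} the null hypothesis")
```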
