Data Lab
We Have Candy
Ben Dobozy
Kitchener-Waterloo Collegiate & Vocational School
MDM4UI
Mrs. Ford
Dobozy 2
Contents
1 Introduction
3 Theoretical Predictions
4 Experimental Results
5 Analysis
7 Connections
8 Conclusion
1 Introduction
The annual KCI games fair is a mass experiment in statistics. Students enrolled in MDM4UI form small
groups and design games of chance. Students calculate in advance the odds of each outcome in their game
and the game’s overall expected value. On event day, the students set up their games in the cafeteria
and interested students are given “Data Dollars” to spend on the games. The cost to play a game is
determined by its expected value. The number of occurrences of each outcome at the fair is recorded by
the students and compared to the predicted theoretical results. As a result, it would be ideal to attract as
After taking an informal survey of KCI students, my partner Dan Duran determined that the most desired
quality in a games fair game is free candy. We designed a minimalist marketing scheme focused on making
it as clear as possible that there would be free candy available. A picture of our final setup is available in
the appendix. We titled our game “We Have Candy feat ben and dan” (sic). We designed a simple dice game
that ensured people didn’t feel cheated and would want to keep playing while at the same time stacking the
odds in our favour. It was clearly a successful marketing scheme, as we were able to run 85 trials.
The game went through several iterations before we settled on the final version. Initially, the game
involved someone playing indefinitely until they lost. There would be a prize pool available that doubled
every round and they could either take the money or continue doubling it. However, the ability to either
“take it or leave it” meant that the player had a choice, so it was not purely a game of chance. Also, it would
be impossible to calculate the expected value of such a game using the techniques we had been taught up to
that point.
As a result, we adjusted our design. We still wanted to keep it simple as we thought that a simple game
would be much more likely to both attract and retain players. The central ideas of playing until you lost and
rolling dice were retained, but everything else was thrown out. We settled on a five-round game where players
were rewarded based on the round they reached, up to a grand prize from winning round five. The specifics
of the game are described below. All that was left was to pick (more or less arbitrarily) the prize values that
would place the expected value within the acceptable range. We selected payout values such that our game
cost 2 data dollars to play. We decided that the values should get larger until the last round, where a win
awarded less money than a loss. The intent was to create tension in the last round of the game.
• 2 6-sided dice
2. They roll the dice in the Beyblade Metal Fusion Bolt Blast Stadium™
3. If either of the numbers declared appear on the dice, the player advances to the next round (if none
5. If the number appears on either die, the player advances to the next round.
If the player loses at any given round (or wins round 5), consult Table 1 for the prize they win. For a diagram
of standard game progression, see Figure 1. A photo of the game setup is provided in the appendix.
3 Theoretical Predictions
Calculating the probability of each outcome was fairly straightforward. For the first round, the player
rolls two dice and declares two numbers. They win when either number declared came up on the dice. The
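\[
\begin{aligned}
P(\text{Winning Round 1}) &= \frac{2}{6} + \frac{2}{6} \\
&= \frac{4}{6} \\
&= \frac{2}{3}
\end{aligned}
\]
\[
\begin{aligned}
P(\text{Losing Round 1}) &= 1 - \frac{2}{3} \\
&= \frac{1}{3}
\end{aligned}
\]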
For all subsequent rounds, only one number is declared, so the probability of winning is given by:
\[
\begin{aligned}
P(\text{Winning Rounds 2-5}) &= \frac{1}{6} + \frac{1}{6} \\
&= \frac{1}{3}
\end{aligned}
\]
And:
\[
\begin{aligned}
P(\text{Losing Rounds 2-5}) &= 1 - \frac{1}{3} \\
&= \frac{2}{3}
\end{aligned}
\]
These are the probabilities for each round in isolation. However, in order to win (or lose) a given round,
one must first have reached that round, meaning they must have won the previous rounds. In other words,
P(Winning Round 2) ≠ P(Winning on Round 2). For example, the probability of losing on round 3 is given
by:
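\[
\begin{aligned}
P(\text{Losing on Round 3}) &= P(\text{Winning Round 1}) \times P(\text{Winning Round 2}) \times P(\text{Losing Round 3}) \\
&= \left(\frac{2}{3}\right)\left(\frac{1}{3}\right)\left(\frac{2}{3}\right) \\
&= \frac{4}{27} \\
&\approx 0.148
\end{aligned}
\]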
Note that since the probability of winning the first round is 2/3, the probability of winning any subsequent
round is 1/3, the probability of losing any subsequent round is 2/3, and the probability of losing on a given round
is the product of the probabilities of winning each previous round times the probability of losing the current
round, it is simple to derive an expression for the probability of losing on any round after the initial round.
First, since:
And:
\[
P(\text{Winning Every Previous Round}) = \frac{2}{3}\left(\frac{1}{3}\right)^{n-2}
\]
Therefore:
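\[
\begin{aligned}
L(n) &= \frac{2}{3}\left(\frac{1}{3}\right)^{n-2}\frac{2}{3} \\
L(n) &= \left(\frac{2}{3}\right)^{2}\left(\frac{1}{3}\right)^{n-2} \\
L(n) &= \frac{4}{9}\left(\frac{1}{3}\right)^{n-2}
\end{aligned}
\tag{1}
\]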
Where n is the round in question and L(n) is the probability of losing on the nth round.
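As a quick check on equation (1), the short Python sketch below rebuilds L(n) step by step and compares it with the closed form. It takes the per-round probabilities stated above (2/3 for winning round 1, 1/3 for winning a later round) as given rather than re-deriving them, and the function names are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

# Per-round probabilities as stated in the text (assumed here, not re-derived).
P_WIN_FIRST = Fraction(2, 3)   # winning round 1
P_WIN_LATER = Fraction(1, 3)   # winning any one of rounds 2-5
P_LOSE_LATER = Fraction(2, 3)  # losing any one of rounds 2-5


def lose_on_round(n: int) -> Fraction:
    """Probability of losing on round n (2 <= n <= 5), built step by step."""
    if not 2 <= n <= 5:
        raise ValueError("equation (1) only covers rounds 2 to 5")
    # Probability of having won every previous round (rounds 1 to n-1).
    reach = P_WIN_FIRST * P_WIN_LATER ** (n - 2)
    return reach * P_LOSE_LATER


def closed_form(n: int) -> Fraction:
    """Equation (1): L(n) = (4/9) * (1/3)^(n-2)."""
    return Fraction(4, 9) * Fraction(1, 3) ** (n - 2)


for n in range(2, 6):
    # The step-by-step product and the closed form should agree exactly.
    assert lose_on_round(n) == closed_form(n)
    print(n, lose_on_round(n), round(float(lose_on_round(n)), 3))
```

Exact fractions are used instead of floats so that the comparison with the closed form is not affected by rounding; for n = 3 the result is 4/27, matching the worked example.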
Note that since there are two possible outcomes (a win or a loss), the probabilities don’t change from one
round to another, and (1) describes the probability of a loss after a given number of rounds, (1) describes a
geometric distribution. This formula applies only to the middle rounds (loss on 2 to loss on 5). It doesn’t
apply for the special cases of a loss on round 1 or a win on round 5. The probability of a loss on round 1 is simply 1/3 and, while
the probability of winning on round 5 is slightly harder to find, it is given by the equation:
Using equations (1) and (2), we generated a table in Microsoft Excel of possible outcomes and their
respective probabilities. Using the formula for expected value, $E(n) = \sum_{i=1}^{n} x_i P(i)$, where $P(i)$ is the
probability of losing on the $i$th round, and $x_i$ is the payout for losing on the $i$th round, we calculated the
expected value for the game. We summarized the information in a table (rounded to 2 decimal places for
simplicity):
The expected value was found by simply summing up the items in the fourth column of Table 2. It was
found to be 1.73. Using this result, we also predicted our expected profit. We used the formula:
Where n is the number of trials and E(x) is the expected value. Using this formula, we predicted that after
85 trials we would have made a profit of 22.95 data dollars. We could expect our profit to be somewhere
around 23 dollars.
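The predicted profit can be reproduced with a few lines of Python. This is a sketch under an assumption: it takes the profit to be the number of trials times the difference between the cost to play (2 data dollars) and the expected value (1.73), which is consistent with the 22.95 figure quoted above. The function name `expected_profit` is hypothetical.

```python
from decimal import Decimal


def expected_profit(trials: int, cost: Decimal, expected_value: Decimal) -> Decimal:
    """Assumed profit model: trials * (cost to play - expected payout per play)."""
    return trials * (cost - expected_value)


# Values taken from the text: 85 trials, cost of 2, expected value of 1.73.
profit = expected_profit(85, Decimal("2"), Decimal("1.73"))
print(profit)  # 22.95
```

`Decimal` is used rather than `float` so that the two-decimal currency values are handled exactly.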
4 Experimental Results
Outcome | Payout ($) | Frequency (f_i) | Experimental Probability (f_i / total) | Expected Value (P(i) × Payout)
The calculation of each part of this table is fairly straightforward. The total number of trials was found by
simply summing the number of occurrences of each outcome. The experimental probability of each outcome
was then determined by dividing the frequency of that outcome by the total number of trials, and the expected
value was found by multiplying the experimental probability and payout for each outcome and summing them
\[
P(\text{experimental}) = \frac{f_i}{\text{total}}
\]
and:
\[
E(x) = \sum_{i=1}^{6} P(\text{experimental}) \times \text{Payout}
\]
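The same calculation can be sketched in Python. The function below divides each frequency by the total number of trials and sums the probability-weighted payouts. The function name is hypothetical, and the frequencies and payouts in the usage example are placeholder values chosen only to exercise the function; they are not the data recorded at the fair.

```python
from fractions import Fraction


def experimental_expected_value(frequencies: dict, payouts: dict) -> Fraction:
    """Sum of (f_i / total) * payout_i over every recorded outcome."""
    total = sum(frequencies.values())
    if total == 0:
        raise ValueError("at least one trial is required")
    return sum(
        Fraction(count, total) * payouts[outcome]
        for outcome, count in frequencies.items()
    )


# Placeholder inputs for illustration only (NOT the fair's recorded results).
example_frequencies = {"outcome A": 2, "outcome B": 1, "outcome C": 1}
example_payouts = {"outcome A": 0, "outcome B": 2, "outcome C": 4}

print(experimental_expected_value(example_frequencies, example_payouts))  # 3/2
```

With the placeholder inputs, the experimental probabilities are 2/4, 1/4 and 1/4, so the result is 0 + 1/2 + 1 = 3/2.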
We found our experimental expected value to be 1.67. We made a profit of 28 data dollars. However, we
had to ask for extra money to cover debts at multiple points during the games fair. Once that is taken into
account, our final profit was 18 data dollars. Though I will cover this in more detail in the analysis section,
our experimental expected value is remarkably close to our theoretically predicted value. This is possibly
because we had a very large sample size thanks to Gabriel Meissner who simply stood there and played our
game for the entire lunch period. It was strange that there was not a single occurrence of someone winning
round 5. The theoretical probability of someone winning round 5 (P ≈ 0.01) suggests that we probably
5 Analysis
Our theoretical and experimental expected values differed by only 0.06. To calculate percent difference,
\[
\text{Percent Difference} = \frac{|V_1 - V_2|}{\frac{V_1 + V_2}{2}} \times 100\%
\]
I let V1 equal 1.73 and V2 equal 1.67. The percent difference in our expected value is 3.53%. This is a remarkable
similarity, which probably exists because of Gabriel Meissner’s help. Since the AP lunch happened to coincide
with the games fair, we had access to a surplus of Chinese food. In exchange for the Chinese food, Gabriel
Meissner agreed to play our game over and over again. This is probably why our experimental results match
There are slight variations in the data due to our sample size. While it was relatively large, it would
ideally be in the hundreds. In terms of profit, we expected that our profit would be somewhere around 23 dollars,
and ended up with a net profit of 18 dollars. This is slightly lower than expected, but can also be attributed
to low sample size and random statistical fluctuations. It is also possible that we simply missed taking money
from Gabe one of the times he played our game, or failed to record his results once or twice, which could
more than account for the discrepancy between our predicted and experimental profits.
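The percent difference between the theoretical and experimental expected values quoted earlier in this section can be checked with a short Python sketch. It assumes the usual definition, the absolute difference divided by the mean of the two values; the function name `percent_difference` is hypothetical.

```python
def percent_difference(v1: float, v2: float) -> float:
    """Absolute difference divided by the mean of the two values, as a percentage."""
    mean = (v1 + v2) / 2
    if mean == 0:
        raise ValueError("percent difference is undefined when the mean is zero")
    return abs(v1 - v2) / mean * 100


# Theoretical (1.73) and experimental (1.67) expected values from the text.
print(round(percent_difference(1.73, 1.67), 2))  # 3.53
```

Because the formula divides by the mean rather than by one reference value, swapping the two arguments gives the same result.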
Overall, I believe that our group was very successful. We had a good (but not ideal) sample size, and even
ignoring the fact that we hired Gabriel Meissner to play our game for us, we succeeded in attracting many
people to play our game. Our minimalist design and very clear, no-nonsense marketing resulted in many
people coming to try our game. We also created a very straightforward set of rules that were still complex
enough to be mathematically interesting. The simplicity of the game was another factor in our success, as it
was easy to play again and did an excellent job of creating tension.
One of the best decisions we made was to offer candy as a prize for winning the first round. It didn’t
cut into our profits at all, and we could select a prize for the second round that would ensure we would still
make a profit if someone lost there. People don’t care about data dollars; they care about candy, so they
kept playing to win more and more candy, totally missing the fact that they were losing money every time.
Also, the odds of making it to the second round were high enough that most people who played won candy.
However, our record-keeping system could probably have been improved, and we could have paid more
attention to the money we distributed. I’m not sure whether or not those were major factors in our profit
discrepancy, but it would help ensure our results are accurate. We could also have adjusted the payout values.
We needed to ask for extra money several times as a consequence of people winning relatively easy prizes
that were valued too high. Specifically, the relatively large number of people who made it to rounds 3 and
4 resulted in us running out of money. It would have run more smoothly if the prize value for round 4 had
been slightly lower, and the values for a loss and win on round 5 had been made higher to compensate.
Another problem with our game was the design of the Beyblade Metal Fusion Bolt Blast Stadium™.
The stadium was designed for Beyblade™ battles, not rolling dice, and as a consequence had holes on both
sides. In a Beyblade™ battle, being knocked into these holes counts as a loss. However, the holes acted as
traps for the dice, which would fall in on occasion, sometimes being prevented from landing on a specific face
by the walls of the hole. A potential design improvement would be to use an alternate Beystadium™ such
7 Connections
The most important thing I learned from this project is the importance of good marketing. We surveyed
the student body, determined the thing they wanted most out of a games fair game, and centered our
marketing on that. And it worked. Our advertising made it clear that we had candy, the candy was easy to
win, and that’s all you need to know about our game. A high-contrast, black-on-white sign was eye-catching,
easy to read, and got the message across. The simplicity of our game also worked in our favour. Since our
game was easy to pick up and play and gave the player the illusion of control, it was highly addictive. Most
These lessons can be easily extended to apply to real life. In the future, if I need to market something,
I’ll make sure to keep my marketing as simple and to the point as possible. When I’m designing a game, or
any other activity, I’ll make sure to keep the rules simple. If people can pick up and play a game easily, then
8 Conclusion
Overall, our predictions were incredibly accurate, with the percent difference in our expected value being only
3.53%. Our high accuracy can be attributed to our relatively large sample size (thanks again to Gabriel
Meissner). There was also a discrepancy between our predicted profit and our actual profit as well as
between the experimental and theoretical probabilities of each outcome (Figure 5). However, both of these can be
expected with this kind of statistical experiment. These deviations can be expected to shrink with a larger
sample size. That may not be fully necessary however, as if there’s one thing I’ve learned over the course of
this project, it’s not about how much money you make or how accurate your predictions are. What matters
Figure 7: A photo of our final game setup. Note the instructional “Roll Here” sign. The dice are not pictured.
Figure 8: A photo of the game during the actual games fair. Note the highly visually appealing “We Have Candy” sign.
Though neither photo makes it obvious, the player rolls both dice inside the pictured Beystadium™.
References