
Mathematical Foundations of Computer Science

Lecture Outline
November 1, 2018

Example. The following pseudo-code computes the minimum of n distinct numbers that
are stored in an array A. What is the expected number of times that the variable min is
assigned a value if the array A is a random permutation of the n elements?

FindMin(A, n):
    min ← A[1]
    for i ← 2 to n do
        if (A[i] < min) then
            min ← A[i]
    return min

Solution. Let X be the random variable denoting the number of times that min is as-
signed a value. We want to calculate E[X]. Let Xi be the random variable that is 1 if min
is assigned A[i] and 0 otherwise. Clearly,

X = X1 + X2 + X3 + · · · + Xn

Using the linearity of expectation we get


E[X] = Σ_{i=1}^{n} E[Xi] = Σ_{i=1}^{n} Pr[Xi = 1]        (1)

Note that Pr[Xi = 1] is the probability that A[i] contains the smallest element among the
elements A[1], A[2], . . . , A[i]. Since the smallest of these elements is equally likely to be in
any of the first i locations, we have Pr[Xi = 1] = 1/i. Thus equation (1) becomes

E[X] = Σ_{i=1}^{n} 1/i = H(n) ≈ ln n + c

where c is a constant less than 1.
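The expectation can be checked empirically against H(n) (a Python sketch; the function names are ours, not part of the notes):

```python
import random
from fractions import Fraction

def count_min_assignments(A):
    # Mirror FindMin: count every assignment to `min`,
    # including the initial min <- A[1].
    count, current = 1, A[0]
    for x in A[1:]:
        if x < current:
            current = x
            count += 1
    return count

def harmonic(n):
    # H(n) = 1 + 1/2 + ... + 1/n, computed exactly.
    return sum(Fraction(1, i) for i in range(1, n + 1))

random.seed(1)
n, trials = 10, 200_000
avg = sum(count_min_assignments(random.sample(range(n), n))
          for _ in range(trials)) / trials
print(avg, float(harmonic(n)))  # both close to H(10) ≈ 2.929
```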

Example. Suppose there are k people in a room and n days in a year. On average how
many pairs of people share the same birthday?

Solution. Let X be the random variable denoting the number of pairs of people sharing
the same birthday. For any two people i and j, let Xij be an indicator random variable
that is 1 if i and j have the same birthday and is 0 otherwise. Clearly, X = Σ_{i<j} Xij.
Using the linearity of expectation we get

E[X] = Σ_{i<j} E[Xij]
     = Σ_{i<j} Pr[Xij = 1]
     = Σ_{i<j} 1/n
     = C(k, 2)/n
     = k(k − 1)/(2n)
Assuming n = 365, the smallest value of k for which the RHS is at least 1 is 28.
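A small computation confirms that threshold (a Python sketch; `expected_shared_pairs` is our name for the formula above):

```python
from math import comb

def expected_shared_pairs(k, n):
    # E[X] = C(k, 2) / n: each of the C(k, 2) pairs shares a
    # birthday with probability 1/n.
    return comb(k, 2) / n

n = 365
k = next(k for k in range(2, 1000) if expected_shared_pairs(k, n) >= 1)
print(k)  # 28
```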

Example (Markov’s Inequality). Let X be a non-negative random variable. Prove that
for all a > 0,

Pr[X ≥ a] ≤ E[X]/a

Solution. Intuitively, the claim holds because if too much probability mass were placed
on values at or above a, then those values alone would contribute more than E[X] to the
expectation. Formally, the proof is as follows.

E[X] = Σ_x x · Pr[X = x]
     ≥ Σ_{x≥a} x · Pr[X = x]
     ≥ a · Σ_{x≥a} Pr[X = x]
     = a · Pr[X ≥ a]

∴ Pr[X ≥ a] ≤ E[X]/a

Example. Suppose we flip a fair coin n times. Using Markov’s inequality, bound the
probability of obtaining at least 3n/4 heads.

Solution. Let X be the random variable denoting the total number of heads in n flips of
a fair coin. We know that E[X] = n/2. Applying the above inequality we get
Pr[X ≥ 3n/4] ≤ E[X]/(3n/4) = (n/2)/(3n/4) = 2/3
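The looseness of the bound can be seen by comparing it with the exact binomial tail probability (a Python sketch; `binom_tail` is a helper of ours):

```python
from math import comb

def binom_tail(n, p, k):
    # Pr[X >= k] for X ~ Binomial(n, p), summed exactly.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 100
exact = binom_tail(n, 0.5, 3 * n // 4)  # Pr[X >= 75]: tiny for n = 100
markov = (n / 2) / (3 * n / 4)          # Markov bound: 2/3
print(exact, markov)
```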

Example. Suppose we roll a die. Using Markov’s inequality bound the probability of
obtaining a number greater than or equal to 7.

Solution. Let X be the random variable denoting the result of the roll of a die. We know
that E[X] = 3.5. Using Markov’s inequality we get

Pr[X ≥ 7] ≤ E[X]/7 = 1/2
As this result shows, Markov’s inequality gives a loose bound in some cases.

Variance
We are interested in calculating how much a random variable deviates from its mean.
This measure is called variance. Formally, for a random variable X we are interested in
E[X − E[X]]. By the linearity of expectation we have

E[X − E[X]] = E[X] − E[E[X]] = E[X] − E[X] = 0

Note that we have used the fact that E[X] is a constant and hence E[E[X]] = E[X]. This
is not very informative. When measuring deviations from the mean we do not want the
positive and the negative deviations to cancel each other out. This suggests taking the
absolute value of X − E[X], but working with absolute values is messy. It turns out that
squaring X − E[X] is more useful. This leads to the following definition.

Definition. The variance of a random variable X is defined as

Var[X] = E[(X − E[X])²] = E[X²] − (E[X])²

The standard deviation of a random variable X is

σ[X] = √(Var[X])

The standard deviation undoes the squaring in the variance. In calculations it does not
matter whether we use the variance or the standard deviation, since each is easily computed
from the other.

We show as follows that the two forms of variance in the definition are equivalent.

E[(X − E[X])²] = E[X² − 2X·E[X] + (E[X])²]
              = E[X²] − 2E[X·E[X]] + (E[X])²
              = E[X²] − 2(E[X])² + (E[X])²
              = E[X²] − (E[X])²

In the second step we used the linearity of expectation; in the third we used the fact that
E[X] is a constant, so E[X·E[X]] = E[X] · E[X].
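The identity can also be checked numerically on a small distribution (a Python sketch; the pmf is an arbitrary example of ours):

```python
# pmf of a hypothetical random variable: value -> probability
pmf = {1: 0.2, 3: 0.5, 7: 0.3}

mean = sum(v * p for v, p in pmf.items())
# Form 1: E[(X - E[X])^2]
form1 = sum((v - mean) ** 2 * p for v, p in pmf.items())
# Form 2: E[X^2] - (E[X])^2
form2 = sum(v * v * p for v, p in pmf.items()) - mean ** 2

print(form1, form2)  # the two forms agree
```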

Example. Consider three random variables X, Y, Z. Their probability mass functions are
as follows.

Pr[X = x] = 1/2 if x = −2;   1/2 if x = 2

Pr[Y = y] = 0.001 if y = −10;   0.998 if y = 0;   0.001 if y = 10

Pr[Z = z] = 1/3 if z = −5;   1/3 if z = 0;   1/3 if z = 5

Which of the above random variables is the most “spread out”?

Solution. It is easy to see that E[X] = E[Y ] = E[Z] = 0.

Var[X] = E[X²] = 0.5 · (−2)² + 0.5 · (2)² = 4
Var[Y] = E[Y²] = 0.001 · (−10)² + 0.998 · 0² + 0.001 · (10)² = 0.2
Var[Z] = E[Z²] = (1/3) · (−5)² + (1/3) · 0² + (1/3) · (5)² = 50/3 ≈ 16.67

Thus Z is the most spread out and Y is the most concentrated.
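These variances can be computed mechanically from the probability mass functions (a Python sketch; `variance` is a helper of ours):

```python
def variance(pmf):
    # Var[V] = E[V^2] - (E[V])^2 for a finite pmf given as
    # a dict of {value: probability}.
    mean = sum(v * p for v, p in pmf.items())
    second = sum(v * v * p for v, p in pmf.items())
    return second - mean ** 2

X = {-2: 0.5, 2: 0.5}
Y = {-10: 0.001, 0: 0.998, 10: 0.001}
Z = {-5: 1/3, 0: 1/3, 5: 1/3}
print(variance(X), variance(Y), variance(Z))  # 4.0, ~0.2, ~16.67
```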

Example. In the experiment where we roll one die let X be the random variable denoting
the number that appears on the top face. What is Var[X]?

Solution. From the definition of variance, we have

Var[X] = E[X²] − (E[X])²
       = (1/6)(1 + 4 + 9 + 16 + 25 + 36) − ((1/6)(1 + 2 + 3 + 4 + 5 + 6))²
       = 91/6 − 49/4
       = 35/12
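The arithmetic can be verified exactly with rational numbers (a Python sketch using the standard `fractions` module):

```python
from fractions import Fraction

faces = range(1, 7)
mean = sum(Fraction(v, 6) for v in faces)        # E[X] = 7/2
second = sum(Fraction(v * v, 6) for v in faces)  # E[X^2] = 91/6
var = second - mean ** 2
print(var)  # 35/12
```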

Example. In the hat-check problem that we did in one of the earlier lectures, what is the
variance of the random variable X that denotes the number of people who get their own
hat back?

Solution. We can express X as

X = X1 + X2 + · · · + Xn

where Xi is the random variable that is 1 if the ith person receives his/her own hat back
and 0 otherwise. We already know from an earlier lecture that E[X] = 1.
E[X 2 ] can be calculated as follows.
E[X²] = Σ_{i=1}^{n} E[Xi²] + 2 Σ_{i<j} E[Xi · Xj]
      = Σ_{i=1}^{n} E[Xi²] + 2 Σ_{i<j} 1 · Pr[Xi = 1 ∩ Xj = 1]
      = Σ_{i=1}^{n} 1/n + 2 · C(n, 2) · 1/(n(n − 1))
      = n · (1/n) + 1
      = 2

Var[X] is given by

Var[X] = E[X²] − (E[X])² = 2 − 1 = 1
Note that like the expectation, the variance is independent of n. This means that it is not
likely for many people to get their own hat back even if n is large.
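The result E[X] = Var[X] = 1, independent of n, can be checked by simulation (a Python sketch; the function name is ours):

```python
import random

def fixed_points(n, rng):
    # Number of people who get their own hat back: the number of
    # fixed points of a uniformly random permutation of n elements.
    perm = list(range(n))
    rng.shuffle(perm)
    return sum(1 for i, p in enumerate(perm) if i == p)

rng = random.Random(0)
n, trials = 50, 100_000
samples = [fixed_points(n, rng) for _ in range(trials)]
m = sum(samples) / trials
var = sum((x - m) ** 2 for x in samples) / trials
print(m, var)  # both close to 1, independent of n
```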
