Expected Value of The Greatest of A Set of Reals

Using the Finite to Explain the Infinite

I recently considered a seemingly simple problem (one I was surprised I hadn't thought of before): given $k$ real numbers $\{x_i\}_{i=1}^{k}$, each chosen independently and uniformly at random with $x_i \in [0, 1]$, what is the expected value of the largest one among them?
The exact expected value is given by the expression
$$\int_0^1 \int_0^1 \cdots \int_0^1 \max[x_1, x_2, \ldots, x_k] \, dx_1 \, dx_2 \cdots dx_k,$$
where there are $k$ integrals. Unfortunately, I was unable to determine a way to evaluate this integral in general, though not for lack of trying. I still think there may be some recursive method, or more likely some extremely tedious method that evaluates the integral by splitting it into cases. It can be evaluated easily for small $k$: that is, the fact that the integrand is a Max function isn't as weird as it sounds. Take $k = 2$ as an example:
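$$\int_0^1 \int_0^1 \max[x, y] \, dx \, dy = \int_0^1 \left( \int_0^y y \, dx + \int_y^1 x \, dx \right) dy = \int_0^1 \left( y^2 + \frac{1}{2} - \frac{1}{2} y^2 \right) dy = \frac{1}{2} \left[ \frac{1}{3} y^3 + y \right]_0^1 = \frac{2}{3}.$$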
Generally, however, we can't do the same thing (splitting the Max function into cases) as easily, since the other variables are dependent on the integrand, not independent of it.
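As a quick sanity check on the integral, here is a minimal numerical sketch. It is not part of the original notebook: the function name `expected_max_midpoint` and the grid size `m` are my own assumptions, and the midpoint rule is swapped in as a crude stand-in for exact integration.

```python
from itertools import product


def expected_max_midpoint(k: int, m: int = 50) -> float:
    """Approximate the k-fold integral of max(x_1, ..., x_k) over [0, 1]^k.

    Uses a midpoint rule with m cells per axis (m is an illustrative choice),
    so the cost grows like m**k and only small k are practical.
    """
    mids = [(a + 0.5) / m for a in range(m)]  # cell midpoints on one axis
    total = sum(max(point) for point in product(mids, repeat=k))
    return total / m**k  # each cell has volume (1/m)**k


if __name__ == "__main__":
    # For k = 2 the value should land near 2/3, matching the hand computation.
    print(expected_max_midpoint(2))
```

The approximation is only as good as the grid, so treat the printed number as a rough confirmation rather than an exact value.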
But enough of the wrong way to go about it. My solution is seemingly strange, as I began by adding another variable, which seems counter-intuitive. Instead of allowing the $k$ numbers to take on any real value in that range, I allowed them only to take one of $n+1$ values: in particular, any of the values $\{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, \frac{n}{n} = 1\}$, each with equal probability. This way, if we can calculate the expected value given a finite set of possible choices for the $k$ reals, then we can let $n \to \infty$ and solve our original problem.
So, what is the probability that 0 is the largest number in the set? Each of the $k$ numbers would have to be equal to 0, so this has probability
$$\frac{1}{n+1} \cdot \frac{1}{n+1} \cdots \frac{1}{n+1} = \left( \frac{1}{n+1} \right)^k.$$
What is the probability that $\frac{1}{n}$ is the largest number? Each of the $k$ numbers would have to be either 0 or $\frac{1}{n}$, and at least one of them would have to be exactly $\frac{1}{n}$. We may try this: first choose the one that has value exactly $\frac{1}{n}$, which we can do in $k$ ways, each with probability $\frac{1}{n+1}$; then the remaining $k-1$ numbers have 2 different options each, giving a total probability of $\frac{k}{n+1} \left( \frac{2}{n+1} \right)^{k-1} = \frac{k \, 2^{k-1}}{(n+1)^k}$. But this overlooks the fact that if one of the remaining $k-1$ numbers happens to be exactly $\frac{1}{n}$, then the number we originally chose to have that value doesn't have to have it, so such outcomes are counted more than once.
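The overcounting in that first attempt is easy to see by brute force. The sketch below is my own illustration, not the author's code: it works with the integer numerators $0, 1, \ldots, n$ instead of the fractions $\frac{i}{n}$, and the helper names `prob_max_is` and `naive_guess` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def prob_max_is(i: int, k: int, n: int) -> Fraction:
    """Exact probability that the largest of k draws equals i/n.

    Each draw is a numerator in {0, ..., n}, all equally likely; the
    probability is found by enumerating all (n + 1)**k outcomes.
    """
    hits = sum(1 for draw in product(range(n + 1), repeat=k) if max(draw) == i)
    return Fraction(hits, (n + 1) ** k)


def naive_guess(k: int, n: int) -> Fraction:
    """The first-attempt value k * 2**(k-1) / (n+1)**k for the maximum being 1/n."""
    return Fraction(k * 2 ** (k - 1), (n + 1) ** k)


if __name__ == "__main__":
    k, n = 3, 4
    print("brute force:", prob_max_is(1, k, n))
    print("naive guess:", naive_guess(k, n))
```

For any $k \geq 2$ the naive value comes out larger than the enumerated one, which is the double counting described above.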
Rather, we'll go with another counter-intuitive move: add yet another variable! To avoid confusion, let's call the probability that the largest number is exactly $\frac{i}{n}$ $P(i)$. Then we will say that $P(i) = \sum_{j=1}^{k} Q(i, j)$, where $Q(i, j)$ (we'll refer to it as $Q(j)$ for short, but it is important to note that it is a function of $i$) is defined as the probability that the largest value is exactly $\frac{i}{n}$ and exactly $j$ of the $k$ reals are equal to $\frac{i}{n}$. We will now see that calculating $Q(j)$ is doable without much trouble.
What is $Q(1)$? We choose the 1 number that is equal to $\frac{i}{n}$, which we can do in $k$ ways, then choose the rest (they have $i$ options each, namely the numbers $\{0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{i-1}{n}\}$), which gives a total probability of $\frac{k \, i^{k-1}}{(1+n)^k}$.
And what of $Q(2)$? We choose the 2 numbers equal to $\frac{i}{n}$, which we can do in $\binom{k}{2}$ ways, then the remaining ones (each has $i$ options again), giving a total probability of $\frac{\binom{k}{2} i^{k-2}}{(1+n)^k}$.
Seeing a pattern? Generally, we first choose which $j$ numbers are exactly equal to $\frac{i}{n}$, which we can do in $\binom{k}{j}$ ways, then pick the remaining $k-j$ numbers. This gives the value $Q(j) = \frac{\binom{k}{j} i^{k-j}}{(1+n)^k}$.
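The counting argument for $Q(i, j)$ can be checked against direct enumeration. This is a sketch under my own naming (`q_formula`, `q_brute`), again using integer numerators in place of the fractions $\frac{i}{n}$; note that Python evaluates `0 ** 0` as 1, which is the convention the formula needs when $i = 0$ and $j = k$.

```python
from fractions import Fraction
from itertools import product
from math import comb


def q_formula(i: int, j: int, k: int, n: int) -> Fraction:
    """Q(i, j) = C(k, j) * i**(k - j) / (n + 1)**k, as derived in the text."""
    return Fraction(comb(k, j) * i ** (k - j), (n + 1) ** k)


def q_brute(i: int, j: int, k: int, n: int) -> Fraction:
    """Same probability, by enumerating every outcome of k draws from {0, ..., n}."""
    hits = sum(
        1
        for draw in product(range(n + 1), repeat=k)
        if max(draw) == i and draw.count(i) == j
    )
    return Fraction(hits, (n + 1) ** k)


if __name__ == "__main__":
    k, n = 3, 3
    for i in range(n + 1):
        for j in range(1, k + 1):
            print(i, j, q_formula(i, j, k, n), q_brute(i, j, k, n))
```

Exact rational arithmetic via `Fraction` avoids any floating-point doubt when comparing the two columns.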
Now we have an explicit value for the expected value: it is $E(k) = \sum_{i=0}^{n} P(i) \frac{i}{n}$, i.e. the probability of getting that value times the value itself, or
$$E(k) = \frac{1}{n(1+n)^k} \sum_{i=0}^{n} \sum_{j=1}^{k} \binom{k}{j} i^{k-j+1}.$$
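We can quickly verify this against our $k = 2$ case, letting $n \to \infty$:
$$E(2) = \frac{1}{n(1+n)^2} \sum_{i=0}^{n} \sum_{j=1}^{2} \binom{2}{j} i^{2-j+1} = \frac{1}{n(1+n)^2} \sum_{i=0}^{n} (2 i^2 + i) = \frac{1}{n(1+n)^2} \cdot \frac{1}{6} n(1+n)(5+4n) = \frac{4n+5}{6n+6},$$
which quickly becomes $\frac{2}{3}$ in the limit.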
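The finite-$n$ formula for $E(k)$ and the $k = 2$ closed form can both be checked with exact arithmetic. The following is a minimal sketch, assuming my own function names (`expected_max_discrete`, `expected_max_brute`); the brute-force version simply averages the maximum over every possible outcome.

```python
from fractions import Fraction
from itertools import product
from math import comb


def expected_max_discrete(k: int, n: int) -> Fraction:
    """E(k) for the discretized problem, from the double-sum formula (n >= 1)."""
    total = sum(
        comb(k, j) * i ** (k - j + 1)
        for i in range(n + 1)
        for j in range(1, k + 1)
    )
    return Fraction(total, n * (n + 1) ** k)


def expected_max_brute(k: int, n: int) -> Fraction:
    """Average of max(draw) / n over all (n + 1)**k equally likely outcomes."""
    total = sum(max(draw) for draw in product(range(n + 1), repeat=k))
    return Fraction(total, n * (n + 1) ** k)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        value = expected_max_discrete(2, n)
        print(n, value, float(value))
```

For $k = 2$ the printed fractions should agree with $\frac{4n+5}{6n+6}$ and drift toward $\frac{2}{3}$ as $n$ grows.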
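Generally, this sum is a little tricky to evaluate, but letting $n \to \infty$ makes it a tad easier.
$$\begin{aligned}
\lim_{n \to \infty} E(k) &= \lim_{n \to \infty} \frac{1}{n(1+n)^k} \sum_{i=0}^{n} \sum_{j=1}^{k} \binom{k}{j} i^{k-j+1} \\
&= \lim_{n \to \infty} \frac{1}{n(1+n)^k} \sum_{i=0}^{n} \left[ \binom{k}{1} i^{k} + \binom{k}{2} i^{k-1} + \cdots + \binom{k}{k-1} i^{2} + \binom{k}{k} i \right] \\
&= \lim_{n \to \infty} \frac{1}{n(1+n)^k} \sum_{i=0}^{n} k \, i^{k} \\
&= \lim_{n \to \infty} \frac{k}{n(1+n)^k} \sum_{i=0}^{n} i^{k}
\end{aligned}$$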
First some clarification: between the second and third lines, we removed all the terms but one in the summigrand (yeah I made it up). How can we do that? Each term would become a polynomial in $n$ of degree one more than the degree of the $i$: that's how such sums evaluate (since the binomial coefficients attached to them are just constants). Thus, as $n \to \infty$, only the leading term of the polynomial will become important (notice that the largest power of $i$ in the summigrand is $k$, which will become a polynomial in $n$ of degree $k+1$, which matches the degree of the denominator of the thing multiplying the sum).
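To see numerically that dropping the lower-order terms does not change the limit, one can compare the full double sum with the leading term alone. This is an illustrative sketch with hypothetical names (`full_expectation`, `leading_term_only`), not something taken from the notebook.

```python
from fractions import Fraction
from math import comb


def full_expectation(k: int, n: int) -> Fraction:
    """The complete finite-n expression: all binomial terms kept."""
    total = sum(
        comb(k, j) * i ** (k - j + 1)
        for i in range(n + 1)
        for j in range(1, k + 1)
    )
    return Fraction(total, n * (n + 1) ** k)


def leading_term_only(k: int, n: int) -> Fraction:
    """Only the j = 1 term, k * i**k, which carries the highest power of i."""
    return Fraction(k * sum(i**k for i in range(n + 1)), n * (n + 1) ** k)


if __name__ == "__main__":
    k = 3
    for n in (10, 100, 1000):
        print(
            n,
            float(full_expectation(k, n)),
            float(leading_term_only(k, n)),
            k / (k + 1),
        )
```

The gap between the first two columns should shrink as $n$ increases, while both approach the value in the last column.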
But we are still left with the trouble of evaluating that sum: I have done it in general, but there isn't even an explicit version. I used two different ways of calculating related sums to produce the value of this sum recursively (the sum I'm talking about is $\sum_{i=0}^{n} i^k$, i.e. some polynomial in $n$ that's a function of $k$). Here's what I wrote:
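Let $S_{n,x} = \sum_{i=1}^{n} i^x$.
In general, we write (here $k$ is only a summation index)
$$\sum_{i=1}^{n} \left( (i+1)^p - i^p \right) = (n+1)^p - 1 = \sum_{i=1}^{n} \left( -i^p + \sum_{k=0}^{p} \binom{p}{k} i^{p-k} \, 1^{k} \right),$$
$$(n+1)^p - 1 = \sum_{k=0}^{p-1} \binom{p}{k} S_{n,k},$$
$$(n+1)^p - 1 = \sum_{k=0}^{p-2} \binom{p}{k} S_{n,k} + p \, S_{n,p-1},$$
and finally we have
$$S_{n,p-1} = \frac{1}{p} \left( (n+1)^p - 1 - \sum_{k=0}^{p-2} \binom{p}{k} S_{n,k} \right),$$
or in general,
$$S_{n,x} = \frac{1}{x+1} \left( (n+1)^{x+1} - 1 - \sum_{k=0}^{x-1} \binom{x+1}{k} S_{n,k} \right).$$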
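The recurrence for $S_{n,x}$ translates almost directly into code. Below is a minimal sketch, assuming a function name of my own (`power_sum`) and memoization via `functools.lru_cache`; neither is from the original notebook.

```python
from functools import lru_cache
from math import comb


def power_sum(n: int, x: int) -> int:
    """S_{n,x} = 1**x + 2**x + ... + n**x, computed from the recurrence.

    S_{n,x} = ((n+1)**(x+1) - 1 - sum_{k<x} C(x+1, k) * S_{n,k}) / (x+1).
    For x = 0 the inner sum is empty and the formula gives S_{n,0} = n.
    """

    @lru_cache(maxsize=None)
    def s(power: int) -> int:
        lower = sum(comb(power + 1, k) * s(k) for k in range(power))
        # The numerator is always divisible by (power + 1), so // is exact.
        return ((n + 1) ** (power + 1) - 1 - lower) // (power + 1)

    return s(x)


if __name__ == "__main__":
    n = 10
    for x in range(5):
        print(x, power_sum(n, x), sum(i**x for i in range(1, n + 1)))
```

Comparing each value against the directly computed sum is a cheap way to confirm the signs and index ranges in the recurrence.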
How in the world I came up with this, I will never know. But what I do know is that this actually helps us determine the leading coefficient in the expansion of the sum (that's all we want, after all, for as $n \to \infty$ all the lesser terms drop out, and we have $k$ times that coefficient as the value for $E(k)$). The expression for $S_{n,x}$ has a bunch of crap on the right side (that weird sum): ignore all that stuff, since each $S_{n,k}$ in it has degree at most $x$ in $n$, and notice the $(n+1)^{x+1}$: that guy has the largest power of $n$, contributing the leading term $\frac{1}{x+1} n^{x+1}$. Setting $x = k$, the leading coefficient is $\frac{1}{k+1}$. Therefore $E(k) = \frac{k}{k+1}$. Hooray!
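A (perhaps) interesting unintended consequence of this is the fact that $\sum_{i=0}^{n} P(i) = 1$, i.e. that the probabilities have to sum to 1. This gives us a different series summing to $n^k$ for any positive integers $n$ and $k$ (with the convention $0^0 = 1$ for the $i = 0$, $j = k$ term):
$$1 = \frac{1}{(1+n)^k} \sum_{i=0}^{n} \sum_{j=1}^{k} \binom{k}{j} i^{k-j}$$
$$(1+n)^k = \sum_{i=0}^{n} \sum_{j=1}^{k} \binom{k}{j} i^{k-j}$$
$$n^k = \sum_{i=0}^{n-1} \sum_{j=1}^{k} \binom{k}{j} i^{k-j}$$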
Oddly enough, this gives one of the two series I mentioned above used to evaluate $S_{n,x}$ (as we take polynomials in $n$ and let $k$ vary). Interestingly, it generalizes a commonly known series: if we let $n = 2$, we get
$$2^k = \sum_{i=0}^{1} \sum_{j=1}^{k} \binom{k}{j} i^{k-j} = 1 + \binom{k}{1} + \binom{k}{2} + \cdots + \binom{k}{k}.$$
But then, recursively,
$$3^k = 2^k + 2^{k-1} \binom{k}{1} + 2^{k-2} \binom{k}{2} + \cdots + 2 \binom{k}{k-1} + 1,$$
or generally,
$$(n+1)^k = n^k + n^{k-1} \binom{k}{1} + \cdots + n \binom{k}{k-1} + 1,$$
which is the binomial theorem applied to $(n+1)^k$; as is so common, our problem-solving analysis above is the equivalent analysis to that algebraic manipulation.
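Both identities are easy to confirm for small values. This sketch uses names I made up (`telescoped_power`, `binomial_step`) and relies on Python treating `0 ** 0` as 1, which matches the convention needed for the $i = 0$ term.

```python
from math import comb


def telescoped_power(n: int, k: int) -> int:
    """n**k written as sum_{i=0}^{n-1} sum_{j=1}^{k} C(k, j) * i**(k - j)."""
    return sum(
        comb(k, j) * i ** (k - j)
        for i in range(n)
        for j in range(1, k + 1)
    )


def binomial_step(n: int, k: int) -> int:
    """(n+1)**k built from n**k plus the remaining binomial terms."""
    return n**k + sum(comb(k, j) * n ** (k - j) for j in range(1, k + 1))


if __name__ == "__main__":
    for n in range(1, 5):
        for k in range(1, 5):
            print(n, k, telescoped_power(n, k), n**k, binomial_step(n, k), (n + 1) ** k)
```

Each row should show two matching pairs, one for the double sum and one for the recursive step.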