Continuous and Random Variables
DC-1
Semester-II
Paper-III: Statistical Methods in Economics-I
Lesson: Continuous random variables
And probability distributions
Lesson Developer: Chandra Goswami
College/Department: Department of Economics,
Dyal Singh College, University of Delhi
Reference
Jay L. Devore: Probability and Statistics for Engineering and the Sciences,
Cengage Learning, 8th edition [Chapter 4]
Learning objectives:
In this chapter you will learn what is meant by a continuous random variable. You
will learn how to arrive at the probability distribution of such random variables
and how to represent it both graphically and by summary expressions. You will
then learn how to derive the cumulative distribution function from the probability
density function and, conversely, how to derive the probability density from the
cumulative distribution function. If either the probability density function or the
cumulative distribution function is known, you will be able to evaluate the
probability that the random variable takes on a specific value or a value in a
given range. You will also learn how to identify characteristics of the population
distribution, such as its shape.
Chapter Outline
1. Continuous random variables
2. Probability distributions for continuous random variables
3. Cumulative distribution functions for continuous random variables
4. Deriving probability densities from cumulative distribution functions
5. Percentiles of a continuous distribution
6. Shape of the probability distribution
Example 1.1
Students of a college are given an objective type test. The proportion of correct
answers that a student scores in the test is a continuous variable which can range
from 0 to 1. Measured as a percentage, the outcome varies from 0 to 100
percent.
Example 1.2
A student travels to college by metro. The frequency of trains in the morning is 4
minutes. If the student reaches the platform as one train is departing she will
have to wait for 4 minutes till the next train enters the station. If she reaches just
as one train enters the station then she will have to wait 0 minutes to board the
train. If she reaches after the earlier train has left and the next train is yet to
arrive, she will have to wait for a time period between 0 and 4 minutes. Waiting
time is a continuous variable with a minimum of 0 minutes and a maximum of 4
minutes.
Example 1.3
The daily consumption of water (in liters) by an individual at home varies from
day to day through any given year. It depends on various factors like amount of
time spent at home, weather conditions, time of year, how much of the time
spent at home is during waking hours, etc. Daily water consumption is a
continuous variable with a minimum value of 0 liters.
Definition 1
A random variable is continuous if both the following conditions apply
1. Its set of possible values consists either of all numbers in an interval on
the number line or all numbers in a disjoint union of such intervals.
2. No possible value of the random variable has a positive probability.
Condition 1 implies that there is no way to create a listing of all the infinite
number of possible values of the variable. Condition 2 implies that intervals of
values have positive probability. As the width of the interval diminishes,
probability of the interval decreases. In the limit, probability of the interval is zero
as the width of the interval reduces to zero.
Example 1.4
The university team is scheduled to visit any minute during a three hour long
examination starting at 9am. We may want to find the probability that the team
visits at a given time or we may be interested in the probability that the visit
takes place during a given time interval. The sample space is from 0 to 180
minutes. The probability that the team visits during an interval of length c
minutes is $c/180$.
This assignment of probabilities applies only to intervals on the measurement axis
from 0 to 180. The probability decreases as the interval becomes shorter. For an
interval of 5 seconds, the probability is computed as $5/10800 \approx 0.000463$,
since 180 minutes equal 10800 seconds. As the
length of the interval approaches zero, the probability that the team will visit also
approaches zero. That is why we always assign zero probability for a single point
on the number line. This does not mean that the team will not visit. The team will
visit at some point in the interval from 0 to 180 minutes even though each point
has zero probability.
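The interval rule in Example 1.4 can be sketched in a few lines of Python. This is only an illustration: the names `visit_probability` and `TOTAL_MINUTES` are assumptions introduced here and are not part of the lesson.

```python
# Illustrative sketch of Example 1.4 (all names are hypothetical).
TOTAL_MINUTES = 180.0  # length of the examination in minutes


def visit_probability(interval_minutes: float) -> float:
    """Probability that the visit falls inside an interval of the given length."""
    if not 0.0 <= interval_minutes <= TOTAL_MINUTES:
        raise ValueError("interval must lie within the 180-minute examination")
    return interval_minutes / TOTAL_MINUTES


# The probability shrinks towards zero as the interval shrinks to a point.
for seconds in (300, 60, 5, 1, 0):
    print(f"{seconds:>3} s -> {visit_probability(seconds / 60.0):.7f}")
```

An interval of zero length gets probability zero, while the whole 180-minute window gets probability one, which matches the discussion above.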
Variables such as time, height, distance, temperature, area, volume, weight, etc
that require measurement are continuous. In practice, however, limitations of
To derive the probability distribution for a continuous rv let us first begin with a
discrete rv. Let X be a discrete rv which can take integer values such that
$x_1 \le X \le x_n$, where $x_1$ and $x_n$ are the minimum and maximum values
respectively of the rv X.
If $x = x_1, x_2, \ldots, x_n$ then we can draw a probability histogram with n
rectangles. The area of the rectangle centered at $x_j$ is the proportion of the
population that has the value $x_j$, ie, $f_j/N$, where $f_j$ is the frequency of
$x_j$ and N is the population size. Summing over the n values of X we obtain
$$\sum_{i=1}^{n} \frac{f_i}{N} = 1$$
Now we allow X to take one additional value in each interval so that $x_1'$ is
midway between $x_1$ and $x_2$; $x_2'$ is midway between $x_2$ and $x_3$; and so
on. Then total
number of x values will be 2n – 1 (instead of 2n, as there are n - 1 intervals).
With measurements of x taken at smaller intervals, the rectangles become
narrower, though the sum of the areas of all rectangles remains one.
Definition 2
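Let X be a continuous random variable. Then a probability distribution or
probability density function (pdf) of X is a function f(x) such that for any two
numbers a and b with $a \le b$,
$$P(a \le X \le b) = \int_a^b f(x)\,dx$$
The probability that X takes on a value in the interval [a, b] is the area under the
graph of the density function above the interval [a, b] on the number line.
For f(x) to be a legitimate pdf, it must satisfy the following two conditions:
1. $f(x) \ge 0$ for all x
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$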
The first condition requires non-negative values of pdf for any x value. The
second condition requires that area under the entire curve of f(x) should equal
one, ie X values are collectively exhaustive. If all possible values of X are
considered then the second condition will be satisfied. Examples of pdf are the
continuous Uniform Distribution, the Normal Distribution, the Exponential
Distribution, etc.
Unlike the pmf, where we can obtain P(X = c) as the probability that the discrete
rv X takes the value c, the probabilities for a continuous rv are always associated
with intervals. The pdf yields P(X = c) = 0 for any particular value of the rv X.
This follows from the definition of a continuous rv as specified in condition 2 of
definition 1.
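For a continuous rv, since the end points a and b each have zero probability, it
makes no difference whether they are included in the interval. If a < b, then
$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b)$$
This is not the case with discrete random variables. If both a and b are possible
values of the discrete rv X then these probabilities will in general all be
different. If a < b, then for the discrete rv X,
$$P(a \le X \le b) \ne P(a < X \le b) \ne P(a \le X < b) \ne P(a < X < b)$$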
Example 2.1
A milk vendor has a refrigerated storage tank of 1000 liters capacity, which is
filled each morning for sale during the day. It is not possible to predict the
amount of milk sold on any particular day. The sale of milk on any day can vary
from 0 lt. to 1000 lt. Past experience shows that any demand in the interval of 0
and 1000 is equally likely. The rv X indicates the sale of milk on a particular day.
The pdf of X is given by the continuous Uniform Distribution
$$f(x) = \begin{cases} 0.001 & 0 \le x \le 1000 \\ 0 & \text{otherwise} \end{cases}$$
In general, if α and β are the lower and upper limits of the value that the
continuous rv X can take, then pdf of X is
$$f(x; \alpha, \beta) = \begin{cases} \dfrac{1}{\beta - \alpha} & \alpha \le x \le \beta \\ 0 & \text{otherwise} \end{cases}$$
The probability of an interval depends only on the width of the interval in case of
the uniform distribution.
In our example, β – α = 1000 so that $1/(\beta - \alpha) = 0.001$. We can use this to obtain
the probability that sale of milk on a particular day is between 200 and 500 liters
as follows:
P(200 < X < 500) = (500 – 200)(0.001) = 0.3
Note that α and β are the parameters of a population of the continuous rv X that
is described by a uniform distribution. We have a family of uniform distributions
for different values of the two parameters. Each distribution is specified by a
particular pair of values of α and β.
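The uniform calculation in Example 2.1 can be checked with a short Python sketch. The function names `uniform_pdf` and `uniform_interval_probability` are hypothetical helpers written for this illustration; the clipping of the interval to [α, β] is an added convenience, not something stated in the lesson.

```python
def uniform_pdf(x: float, alpha: float, beta: float) -> float:
    """Density of the continuous uniform distribution on [alpha, beta]."""
    return 1.0 / (beta - alpha) if alpha <= x <= beta else 0.0


def uniform_interval_probability(a: float, b: float,
                                 alpha: float, beta: float) -> float:
    """P(a < X < b): width of the part of (a, b) inside [alpha, beta],
    multiplied by the constant density 1 / (beta - alpha)."""
    lo = max(a, alpha)
    hi = min(b, beta)
    return max(hi - lo, 0.0) / (beta - alpha)


# Milk vendor example: alpha = 0, beta = 1000, sale between 200 and 500 liters.
print(uniform_pdf(300, 0, 1000))
print(uniform_interval_probability(200, 500, 0, 1000))
```

Changing `alpha` and `beta` selects a different member of the family of uniform distributions.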
Exercise 1
Show that $f(x) = 3x^2$ for 0 < x < 1 represents a pdf and calculate
P(0.1 < X < 0.5).
Solution
f(x) can represent a pdf if both conditions for a pdf are satisfied, ie,
$f(x) \ge 0$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Since $f(x) = 3x^2$ and $x^2 \ge 0$ always, $f(x) \ge 0$ for all x values. Therefore,
for 0 < x < 1, f(x) > 0 and the first condition is satisfied.
$$\int_0^1 3x^2\,dx = \left[\frac{3x^3}{3}\right]_0^1 = 1 - 0 = 1,$$
which satisfies the second condition for pdf.
Since both conditions are satisfied, $f(x) = 3x^2$ represents a pdf for 0 < x < 1.
Now,
$$P(0.1 < X < 0.5) = \int_{0.1}^{0.5} 3x^2\,dx = (0.5)^3 - (0.1)^3 = 0.125 - 0.001 = 0.124$$
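The two integrals in Exercise 1 can also be verified numerically. The sketch below uses a simple midpoint rule; `midpoint_integral` is a hypothetical helper written for this illustration, and the number of sub-intervals is an arbitrary choice.

```python
def midpoint_integral(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def pdf(x: float) -> float:
    """Candidate density from Exercise 1: 3x^2 on (0, 1), zero elsewhere."""
    return 3.0 * x * x if 0.0 < x < 1.0 else 0.0


total_area = midpoint_integral(pdf, 0.0, 1.0)    # should be close to 1
probability = midpoint_integral(pdf, 0.1, 0.5)   # should be close to 0.124
print(round(total_area, 6), round(probability, 6))
```

The numerical values agree with the analytical results up to the approximation error of the rule.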
Example 2.2
The pdf for a continuous rv is given as
$$f(x) = \begin{cases} e^{-x} & x > 0 \\ 0 & x \le 0 \end{cases}$$
So that as x value increases from x = 0, f(x) decreases rapidly or exponentially,
as illustrated in Fig 3.
Now, $P(a < X < b) = \int_a^b e^{-x}\,dx$. This is the shaded area in figure 3.
If a = 2 and b = 5, then
$$P(2 < X < 5) = \int_2^5 e^{-x}\,dx = \left[-e^{-x}\right]_2^5 = -(0.006738 - 0.135335) = 0.128597 \approx 0.13$$
Therefore, 13 percent of the area under the curve of $f(x) = e^{-x}$ lies above the
measurement axis in the interval [2, 5].
Exercise 2
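Show that $f(x) = e^{-x}$ for 0 < x < ∞ represents a pdf, and compute the
probability that X > 1.
Solution
$f(x) = e^{-x}$ would represent a pdf if $f(x) \ge 0$ and
$\int_0^{\infty} f(x)\,dx = 1$ for 0 < x < ∞.
Since $e^{-x} > 0$ for every x, the first condition is satisfied. For the second
condition,
$$\int_0^{\infty} f(x)\,dx = \int_0^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_0^{\infty} = -[0 - 1] = 1$$
$$P(X > 1) = \int_1^{\infty} e^{-x}\,dx = \left[-e^{-x}\right]_1^{\infty} = -[0 - e^{-1}] = e^{-1} = 0.368$$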
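For the density $e^{-x}$ on x > 0, the antiderivative is known, so interval probabilities can be computed in closed form. The following Python sketch does this; the function name `exp_interval_probability` is an assumption made for illustration.

```python
import math


def exp_interval_probability(a: float, b: float = math.inf) -> float:
    """P(a < X < b) for the pdf f(x) = e^{-x}, x > 0, using
    the antiderivative -e^{-x}: the result is e^{-a} - e^{-b}."""
    a = max(a, 0.0)          # the density is zero for x <= 0
    if b <= a:
        return 0.0
    return math.exp(-a) - math.exp(-b)


print(round(exp_interval_probability(2, 5), 6))   # Example 2.2
print(round(exp_interval_probability(1), 3))      # Exercise 2, P(X > 1)
print(exp_interval_probability(0))                # total area under the pdf
```

Leaving `b` at its default of infinity gives the upper-tail probability P(X > a).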
Exercise 3
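The pdf of the rv X is given by
$$f(x) = \begin{cases} \dfrac{k}{\sqrt{x}} & 0 < x < 4 \\ 0 & \text{otherwise} \end{cases}$$
Find (a) the value of k, and (b) P(X > 1)
Solution
(a) Given that f(x) is a pdf we have $\int_0^4 \frac{k}{\sqrt{x}}\,dx = 1$.
Now
$$\int_0^4 \frac{k}{\sqrt{x}}\,dx = \left[\frac{k\,x^{1/2}}{1/2}\right]_0^4 = 2k[2 - 0] = 4k$$
Equating 4k and 1 we get $k = \frac{1}{4}$, so that $f(x) = \frac{1}{4\sqrt{x}}$.
(b)
$$P(X > 1) = \int_1^4 \frac{1}{4\sqrt{x}}\,dx = \left[\frac{2\sqrt{x}}{4}\right]_1^4 = 1 - \frac{1}{2} = \frac{1}{2} = 0.5$$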
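The steps of Exercise 3 (normalise first, then integrate) can be sketched in Python. The helper names `normalising_constant` and `cdf` are hypothetical; the code relies on the antiderivative $2k\sqrt{x}$ of $k/\sqrt{x}$.

```python
import math


def normalising_constant(upper: float = 4.0) -> float:
    """Value of k that makes k / sqrt(x) integrate to 1 over (0, upper).
    The integral of x^(-1/2) from 0 to upper equals 2 * sqrt(upper)."""
    return 1.0 / (2.0 * math.sqrt(upper))


def cdf(x: float, upper: float = 4.0) -> float:
    """P(X <= x) for the pdf k / sqrt(x) on (0, upper)."""
    if x <= 0.0:
        return 0.0
    if x >= upper:
        return 1.0
    return 2.0 * normalising_constant(upper) * math.sqrt(x)


k = normalising_constant()
prob_greater_than_one = 1.0 - cdf(1.0)
print(k, prob_greater_than_one)
```

Computing P(X > 1) as 1 − P(X ≤ 1) gives the same answer as integrating the density from 1 to 4.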
Exercise 4
If the continuous random variable X can take only non-negative values and has
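the density function $f(x) = e^{2x}$ for x > 0, and 0 otherwise, what is the
maximum value of X?
Solution
If f(x) is a density function then the total area under it must equal one. Since
$e^{2x}$ grows without bound, this is possible only if X has a finite maximum
value x such that $\int_0^x e^{2y}\,dy = 1$.
$$\int_0^x e^{2y}\,dy = \left[\frac{e^{2y}}{2}\right]_0^x = \frac{e^{2x}}{2} - \frac{1}{2} = 1 \;\Rightarrow\; e^{2x} = 3$$
Therefore, 2x = ln 3 = 1.0986, so that x = 0.549.
Hence, f(x) will be a density function for 0 < x < 0.549. Maximum value of X is
0.549.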
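The upper limit in Exercise 4 can be found numerically as the point where the accumulated area reaches one. The sketch below uses bisection; `area_up_to` and `bisect_root` are hypothetical helper names, and the search bracket [0, 1] is an assumption chosen because the area is below 1 at x = 0 and above 1 at x = 1.

```python
import math


def area_up_to(x: float) -> float:
    """Area under e^{2y} between 0 and x, ie (e^{2x} - 1) / 2."""
    return (math.exp(2.0 * x) - 1.0) / 2.0


def bisect_root(g, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Root of an increasing function g on [lo, hi], with g(lo) < 0 < g(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


x_max = bisect_root(lambda x: area_up_to(x) - 1.0, 0.0, 1.0)
print(round(x_max, 3), round(math.log(3.0) / 2.0, 3))
```

Both the numerical root and the closed form ln 3 / 2 round to the value obtained in the solution.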
3 CUMULATIVE DISTRIBUTION FUNCTIONS FOR CONTINUOUS
RANDOM VARIABLES
Similar to the case of discrete random variables, there are many problems where
we need to know the probability that a continuous rv X takes a value that does
not exceed a specified value x. For this we need the cumulative distribution
function (cdf) of X.
Definition 3
If X is a continuous random variable then the cumulative distribution function
F(x) for X is defined for every number x by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy$$
For each x, F(x) is the area under the density curve to the left of x. As x value
increases, F(x) also increases smoothly until F(x) =1 and then it continues as a
flat line parallel to the measurement axis.
The cdf gives the probability $P(X \le x)$ obtained by integrating the pdf f(y)
between -∞ and x. As in the case of the discrete rv, here too F(-∞) = 0,
F(∞) = 1, and $F(a) \le F(b)$ when a < b.
Also $P(a < X \le b) = F(b) - F(a)$ where a < b.
Since X is a continuous rv,
$P(a \le X \le b) = P(a \le X < b) = P(a < X < b) = F(b) - F(a)$ where a < b.
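Example 3.1
Given the uniform distribution
$$f(x; A, B) = \begin{cases} \dfrac{1}{B - A} & A \le x \le B \\ 0 & \text{otherwise} \end{cases}$$
the cdf is
$$F(x) = \int_{-\infty}^{x} f(y; A, B)\,dy = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases}$$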
The pdf and cdf of the uniform distribution of a continuous rv are illustrated in Fig
4.
If the graph of the pdf is bell-shaped as in case of the Normal Distribution [fig 5
(a)], then the cdf will be as in Figure 5 (b)
Exercise 5
The density function of the rv X is given by
$$f(x) = \begin{cases} 6x(1 - x) & 0 < x < 1 \\ 0 & \text{otherwise} \end{cases}$$
Obtain the cdf and compute P(X < ½).
Solution
$$F(x) = \int_{-\infty}^{x} f(y)\,dy = \int_0^x 6y(1 - y)\,dy = 6\int_0^x (y - y^2)\,dy = 6\left[\frac{y^2}{2} - \frac{y^3}{3}\right]_{y=0}^{y=x}$$
If x ≤ 0, F(x) = 0
If 0 < x < 1, F(x) = $3x^2 - 2x^3$
If x = 1, F(x) = 3 – 2 = 1
If x > 1, F(x) = 1 since f(x) = 0 for x > 1
Therefore the cdf can be represented as follows
$$F(x) = \begin{cases} 0 & x < 0 \\ 3x^2 - 2x^3 & 0 \le x < 1 \\ 1 & x \ge 1 \end{cases}$$
To compute P(X < ½), we substitute x = ½ in F(x) since P(X < ½) = P(X ≤ ½)
for a continuous rv.
$$F(1/2) = 3(1/4) - 2(1/8) = \frac{3}{4} - \frac{1}{4} = \frac{1}{2} = 0.5$$
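The cdf derived in Exercise 5 can be compared with a direct numerical accumulation of the density. In this Python sketch the names `pdf`, `cdf` and `numeric_cdf` are illustrative assumptions, and the trapezoidal rule is just one convenient way to approximate the integral.

```python
def pdf(x: float) -> float:
    """Density from Exercise 5: 6x(1 - x) on [0, 1], zero elsewhere."""
    return 6.0 * x * (1.0 - x) if 0.0 <= x <= 1.0 else 0.0


def cdf(x: float) -> float:
    """Closed-form cdf obtained by integration: 3x^2 - 2x^3 on [0, 1)."""
    if x < 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 3.0 * x ** 2 - 2.0 * x ** 3


def numeric_cdf(x: float, n: int = 20_000) -> float:
    """Trapezoidal approximation of the area under pdf from 0 to x."""
    if x <= 0.0:
        return 0.0
    upper = min(x, 1.0)
    h = upper / n
    inner = sum(pdf(i * h) for i in range(1, n))
    return h * (0.5 * pdf(0.0) + inner + 0.5 * pdf(upper))


for point in (0.25, 0.5, 0.75, 1.0):
    print(point, round(cdf(point), 6), round(numeric_cdf(point), 6))
```

The two columns printed for each point should agree closely, which supports the algebra in the solution.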
Exercise 6
Show that the expression $g(x) = \frac{x + 1}{2}$ can serve as a cdf for -1 < x < 1.
Solution
If g(x) is to represent a cdf we must show that g(x) = 0 at x = -1, g(x) = 1 at
x = 1, and that g(x) does not decrease, with 0 < g(x) < 1, on the interval
-1 < x < 1.
Now, $g(-1) = \frac{-1 + 1}{2} = 0$, and $g(1) = \frac{1 + 1}{2} = 1$.
Since $g'(x) = \frac{1}{2} > 0$, g(x) increases throughout the interval. For
instance, let us select a value x = 0 in the given interval.
Then $g(0) = \frac{1}{2}$, where $0 < \frac{1}{2} < 1$.
Since all three requirements are satisfied, g(x) can serve as a cdf for -1 < x < 1.
For a given cdf we can obtain the pdf by taking the derivative of F(x). By
definition 3, if X is a continuous rv and the value of its probability density at
y is f(y) then the cdf is
$$F(x) = \int_{-\infty}^{x} f(y)\,dy$$
Hence, $f(x) = \frac{dF(x)}{dx} = F'(x)$ at every x at which the derivative F'(x) exists.
Example 4.1
In example 3.1, for the uniform distribution the cdf is
$$F(x) = \begin{cases} 0 & x < A \\ \dfrac{x - A}{B - A} & A \le x < B \\ 1 & x \ge B \end{cases}$$
The graph of F(x) is given in Fig 4(b).
It can be seen that F(x) is differentiable for A < x < B.
At x = A and x = B, F(x) cannot be differentiated.
For x < A, F(x) = 0 and for x > B, F(x) = 1
Hence, F'(x) = f(x) = 0 if x < A, or, if x > B. Therefore,
$$f(x) = \begin{cases} 0 & x < A \\ \dfrac{1}{B - A} & A < x < B \\ 0 & x > B \end{cases}$$
Exercise 7
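A continuous rv Y has a cdf given by
$$F(y) = \begin{cases} 0 & y < 0 \\ y^2 & 0 \le y < 1 \\ 1 & y \ge 1 \end{cases}$$
Compute P(½ < Y < ¾) in the two ways by using (a) the cdf, and (b) the pdf
Solution
(a) $P(\tfrac{1}{2} < Y < \tfrac{3}{4}) = F(\tfrac{3}{4}) - F(\tfrac{1}{2}) = \tfrac{9}{16} - \tfrac{1}{4} = \tfrac{5}{16} = 0.3125$
(b) Differentiating the cdf, $f(y) = F'(y) = 2y$ for 0 < y < 1, and 0 otherwise.
Then
$$P(\tfrac{1}{2} < Y < \tfrac{3}{4}) = \int_{1/2}^{3/4} 2y\,dy = \left[y^2\right]_{1/2}^{3/4} = \frac{9}{16} - \frac{1}{4} = \frac{5}{16} = 0.3125$$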
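The two routes used in Exercise 7 can be mirrored in Python: differences of the cdf on the one hand, and integration of a numerically differentiated cdf on the other. The helper names below and the step size `h` are assumptions chosen for illustration only.

```python
def cdf(y: float) -> float:
    """cdf from Exercise 7: y^2 on [0, 1)."""
    if y < 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return y * y


def pdf_from_cdf(y: float, h: float = 1e-6) -> float:
    """Central-difference approximation of F'(y)."""
    return (cdf(y + h) - cdf(y - h)) / (2.0 * h)


def midpoint_integral(f, a: float, b: float, n: int = 1_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    step = (b - a) / n
    return step * sum(f(a + (i + 0.5) * step) for i in range(n))


via_cdf = cdf(0.75) - cdf(0.5)
via_pdf = midpoint_integral(pdf_from_cdf, 0.5, 0.75)
print(via_cdf, round(via_pdf, 6), round(pdf_from_cdf(0.6), 6))
```

The numerical derivative at y = 0.6 is close to 2y = 1.2, and both routes give approximately 0.3125.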
Definition 4
Let p be a number between 0 and 1. The (100p)th percentile of the distribution
of a continuous random variable X, denoted by η(p), is defined by
$$p = F[\eta(p)] = \int_{-\infty}^{\eta(p)} f(y)\,dy$$
Then η(p) is that value on the measurement axis such that 100p percent of the
area under the graph of f(x) lies to the left of η(p) and 100(1-p) percent lies to
the right. This is illustrated in Figure 7
Figure 7 Percentiles
If p = 0.3 then 30% of the area under the graph of f(x) lies to the left of η(0.3)
and 70% to the right of η(0.3). The 30th percentile is denoted by η(0.3) since p =
0.3
Example 5.1
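Consider the rv X with the following pdf
$$f(x) = \begin{cases} \dfrac{1}{8}(x + 1) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
To find the 75th percentile, η(0.75), we need to first obtain the cdf from the given
pdf.
$$F(x) = \int_2^x \frac{1}{8}(y + 1)\,dy, \quad 2 \le x \le 4$$
Therefore,
$$F[\eta(p)] = p = \int_2^{\eta(p)} \frac{x + 1}{8}\,dx = \left[\frac{x^2}{16} + \frac{x}{8}\right]_2^{\eta(p)}$$
Substituting p = 0.75 = 3/4 we obtain
$$\frac{3}{4} = \frac{[\eta(p)]^2}{16} + \frac{\eta(p)}{8} - \left(\frac{4}{16} + \frac{2}{8}\right) = \frac{[\eta(p)]^2}{16} + \frac{\eta(p)}{8} - \frac{1}{2}$$
Rearranging the terms we get
$[\eta(p)]^2 + 2\eta(p) - 20 = 0$
Solving with the quadratic formula,
$$\eta(p) = \frac{-2 \pm \sqrt{4 + 80}}{2} = -1 \pm 4.58 = 3.58 \text{ or } -5.58$$
Since minimum value of X is 2 and the maximum is 4, the 75th percentile is 3.58
because that is the only root within the range of values that X can take. The
alternative value -5.58 does not fall in the range of possible values.
Hence, η(0.75) = 3.58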
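A percentile is the solution of F(η) = p, so it can be found either from the quadratic formula or by a numerical search on the cdf. The Python sketch below shows both for the pdf of Example 5.1; the function names are hypothetical, and the closed form $-1 + \sqrt{9 + 16p}$ is obtained by applying the quadratic formula to $\eta^2 + 2\eta - (8 + 16p) = 0$ and keeping the positive root.

```python
import math


def cdf(x: float) -> float:
    """cdf of Example 5.1: x^2/16 + x/8 - 1/2 on [2, 4]."""
    if x < 2.0:
        return 0.0
    if x > 4.0:
        return 1.0
    return x * x / 16.0 + x / 8.0 - 0.5


def percentile_bisection(p: float, lo: float = 2.0, hi: float = 4.0,
                         tol: float = 1e-10) -> float:
    """Solve cdf(eta) = p by bisection on the support [lo, hi]."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def percentile_closed_form(p: float) -> float:
    """Positive root of eta^2 + 2*eta - (8 + 16p) = 0."""
    return -1.0 + math.sqrt(9.0 + 16.0 * p)


print(round(percentile_bisection(0.75), 2), round(percentile_closed_form(0.75), 2))
```

The bisection route does not need the algebra and would work for any increasing cdf, whereas the closed form is specific to this example.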
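Definition 5
The median of a continuous distribution, denoted by $\tilde{\mu}$, is the 50th
percentile, so that $\tilde{\mu}$ satisfies $F(\tilde{\mu}) = 0.5$.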
Example 6.1
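The median of the pdf given in example 5.1 is computed by letting p = ½ so that
$$F(\tilde{\mu}) = \frac{1}{2} = \int_2^{\tilde{\mu}} \frac{x + 1}{8}\,dx = \left[\frac{x^2}{16} + \frac{x}{8}\right]_2^{\tilde{\mu}} = \frac{\tilde{\mu}^2}{16} + \frac{\tilde{\mu}}{8} - \frac{1}{2}$$
Therefore,
$$\frac{\tilde{\mu}^2}{16} + \frac{\tilde{\mu}}{8} - 1 = 0 \;\Rightarrow\; \tilde{\mu}^2 + 2\tilde{\mu} - 16 = 0$$
So that
$$\tilde{\mu} = \frac{-2 \pm \sqrt{4 + 64}}{2} = -1 \pm 4.123$$
Since 2 < x < 4, $\tilde{\mu}$ = 3.123.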
Half the area of the density curve is to the left of 3.123 and the other half is to
the right.
If a random variable has a symmetric pdf then the median will coincide with the
point of symmetry since half the area under the density curve lies on either side
of the point.
A positively skewed distribution has a long right-hand tail. Similarly, a negatively
skewed distribution has a long left-hand tail. Figure 8 illustrates the three kinds of
distributions.
Example 6.2
The incomes of employees of a company will usually be positively skewed as
there are a large number of low income workers and fewer employees with high
income.
Example 6.3
A well known manufacturing company assures that its product will last a
minimum period of three years. However, due to a defective component sourced
from one of the suppliers, the lifetime of a batch of the product is likely to be
drastically reduced. The distribution will then be negatively skewed.
It can be shown that for a symmetric pdf the median coincides with the mean of
the distribution. If the mean and median have different values then the
distribution is asymmetric, ie, skewed. If mean is less than median the
distribution is skewed to the left or negatively skewed. On the other hand a
distribution is positively skewed or skewed to the right when the mean is greater
than the median.
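The comparison of mean and median can be tried on the pdf of Example 5.1, f(x) = (x + 1)/8 for 2 ≤ x ≤ 4. The Python sketch below is only an illustration: it assumes the standard definition of the mean as the integral of x f(x), which this lesson does not develop, and the helper names are hypothetical. The median is taken as $-1 + \sqrt{17}$, the admissible root of $\tilde{\mu}^2 + 2\tilde{\mu} - 16 = 0$.

```python
import math


def pdf(x: float) -> float:
    """pdf of Example 5.1: (x + 1) / 8 on [2, 4], zero elsewhere."""
    return (x + 1.0) / 8.0 if 2.0 <= x <= 4.0 else 0.0


def midpoint_integral(f, a: float, b: float, n: int = 100_000) -> float:
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def skew_direction(mean: float, median: float, tol: float = 1e-9) -> str:
    """Apply the rule of thumb from the text to a mean and a median."""
    if abs(mean - median) <= tol:
        return "no skew indicated"
    return "negatively skewed" if mean < median else "positively skewed"


mean = midpoint_integral(lambda x: x * pdf(x), 2.0, 4.0)  # assumed definition of the mean
median = -1.0 + math.sqrt(17.0)
print(round(mean, 4), round(median, 4), skew_direction(mean, median))
```

Here the mean is slightly smaller than the median, which the rule reads as negative skew; this is consistent with a density that rises towards the upper end of its range.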
The mode of the distribution is that value of the random variable at which the
graph of the probability distribution reaches its highest point. If there is only one
peak or “high point” it is a unimodal distribution. If there are two modes it is
called a bimodal distribution. A distribution having more than two modes is said
to be multimodal.
Example 6.4
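Suppose that the rv X has pdf
$$f(x) = \begin{cases} \dfrac{1}{9}(4 - x^2) & -1 \le x \le 2 \\ 0 & \text{otherwise} \end{cases}$$
Differentiating f(x) with respect to x, we get
$$f'(x) = -\frac{2x}{9}$$
Setting $f'(x) = 0$ we get x = 0.
Taking the second derivative, $f''(x) = -\frac{2}{9} < 0$, so f(x) attains its
maximum at x = 0. Hence the mode of the distribution is x = 0.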
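The location of the mode in Example 6.4 can be confirmed with a simple grid search over the support of the density. This Python sketch is an illustration only; `grid_mode` is a hypothetical helper and the grid size is an arbitrary choice.

```python
def pdf(x: float) -> float:
    """pdf of Example 6.4: (4 - x^2) / 9 on [-1, 2], zero elsewhere."""
    return (4.0 - x * x) / 9.0 if -1.0 <= x <= 2.0 else 0.0


def grid_mode(f, lo: float, hi: float, n: int = 3_000) -> float:
    """Return the grid point in [lo, hi] at which f is largest."""
    step = (hi - lo) / n
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=f)


mode = grid_mode(pdf, -1.0, 2.0)
print(round(mode, 3), round(pdf(mode), 4))
```

The search returns a point at or very near x = 0, where the density takes its largest value of 4/9.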
Comparison of the mode and median can also be used to indicate the shape of
the distribution. For a symmetric distribution mode = median. In case of a
positively skewed distribution, median > mode, whereas median < mode for a
negatively skewed distribution.
The other characteristics of the distribution like mean and variance can be
computed with the help of mathematical expectations.
PRACTICE QUESTIONS
1. Suppose the rv Y has the pdf $f(y) = 4y^3$ for 0 < y < 1 and 0 otherwise.
Find P(0 < Y < ½).
$$f(x) = \begin{cases} 0 & x \le 227.5 \\ \dfrac{1}{5} & 227.5 < x < 232.5 \\ 0 & x \ge 232.5 \end{cases}$$
Find the probabilities that a 230-gram jar filled by this machine will
contain
(a) at most 228.65 gm of coffee
(b) anywhere from 229.34 to 231.66 gm of coffee
(c) at least 229.85 gm of coffee
$$F(x) = \begin{cases} 0 & x < 0 \\ \dfrac{x^2}{4} & 0 \le x < 2 \\ 1 & 2 \le x \end{cases}$$
value α = - 0.015 and maximum value β = 0.015. Find the probabilities that
such an error will
(i) be between – 0.002 and 0.003
(ii) exceed 0.005 in absolute value