Properties of Random Variables
Chapter topics:
1. Random variables
2. Probability distributions (especially the normal distribution)
3. Measures of location and dispersion

Random variables are encountered in every discipline in science. In this chapter we discuss how we may describe the properties of random variables, in particular by using probability distributions, as well as defining the mean and the standard deviation of random variables. Since random variables are encountered throughout chemistry and the other natural sciences, this chapter is rather broad in scope. We do, however, introduce one particular type of random variable, the normally-distributed random variable. One of the most important skills you will need to obtain in this chapter is the ability to use tables of the standard normal cumulative probabilities to solve problems involving normally-distributed variables. A number of numerical examples in the chapter will illustrate how to do so.
Figure 1.1: Difference between discrete and continuous random variables. A discrete variable can only assume certain values (e.g., the hash-marks on the number line), whereas a continuous variable can assume any value within the range of all possible values.
displays the mass to the nearest 0.1 mg. There are 10⁶ possible values in this range — a large value, to be sure, but not infinitely large. For most purposes, however, we may treat this measurement as a continuous variable.
P(x = 0) = 0.25
P(x = 1) = 0.5
P(x = 2) = 0.25
2. Each trial has only two possible results, “success” and “failure.” The probability of success is p and the probability of failure is q. Obviously, p + q = 1.
3. The probabilities p and q remain constant for all the trials in the
experiment.
trials, n, increases (due to the factorial terms involving n). There are some other distribution functions that can give reasonable approximations to the binomial function in such cases.
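These conditions define the binomial distribution. As a minimal sketch (not part of the original text), the binomial probabilities can be computed directly in Python; the function name here is our own:

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n trials with success probability p."""
    q = 1.0 - p                                # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Two trials with p = 0.5 reproduce the probabilities quoted above.
for x in range(3):
    print(x, binomial_pmf(x, n=2, p=0.5))      # 0.25, 0.5, 0.25
```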
Figure 1.3: The Poisson probability distribution, shown here as both a table and
a plot, describes the probability of observing alpha particle counts, as calculated
from eqn. 1.2 with λ = 2.5 counts/second and t = 1 second.
P(x = 5) = e^{−2.5} (2.5)^5 / 5! = 0.0668
Figure 1.3 shows the probabilities of measuring zero through 10 counts
during one measurement period for this experiment.
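The worked value above is easy to verify numerically. The following Python sketch evaluates the Poisson probability; the function name is our own:

```python
from math import exp, factorial

def poisson_pmf(x: int, mean: float) -> float:
    """Probability of observing x events when the expected count is mean = lambda*t."""
    return exp(-mean) * mean**x / factorial(x)

# lambda = 2.5 counts/s and t = 1 s, as in figure 1.3
print(poisson_pmf(5, 2.5))                                  # ~0.0668, the value above
print([round(poisson_pmf(x, 2.5), 4) for x in range(11)])   # the plotted probabilities
```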
Just like the binomial distribution, the Poisson distribution of discrete
variables has two important properties:
Figure 1.4: Boltzmann distribution for the vibrational energy of an I2 molecule at 298 K and 400 K. The probability distribution assumes evenly spaced vibrational levels (i.e., the harmonic oscillator assumption). Notice that at higher temperatures, there is a greater probability that a molecule will be in a higher energy level.
where ∆E is the separation between energy levels. Figure 1.4 shows the
probability distribution for the vibrational energy of an I2 molecule at two
different temperatures.
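For evenly spaced, nondegenerate levels E_n = n·∆E, the Boltzmann populations take the form p(n) = e^{−n∆E/kT}(1 − e^{−∆E/kT}). The sketch below evaluates these populations in Python; the I2 vibrational spacing of about 214.5 cm⁻¹ is an assumed literature value, not a number taken from this text:

```python
from math import exp

# Physical constants (SI)
H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e10     # speed of light, cm/s
KB = 1.380649e-23     # Boltzmann constant, J/K

NU = 214.5            # assumed I2 vibrational wavenumber, cm^-1
DELTA_E = H * C * NU  # energy-level separation, J

def boltzmann_p(n: int, T: float) -> float:
    """Probability of occupying vibrational level n at temperature T,
    for evenly spaced levels E_n = n * DELTA_E."""
    x = DELTA_E / (KB * T)
    return exp(-n * x) * (1.0 - exp(-x))   # geometric-series normalization

for T in (298.0, 400.0):
    print(T, [round(boltzmann_p(n, T), 3) for n in range(5)])
```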
We can interpret the Boltzmann distribution in two ways, both of which
are useful:
While most people are familiar with means, the other two properties are actually easier to understand.
Mode
The mode is the most probable value of a discrete variable. More generally,
it is the maximum of the probability distribution function: the value of
xmode such that

p(xmode) = Pmax
Multi-modal probability distributions have more than one mode — distri-
butions with two modes are bimodal, and so on. Although multi-modal dis-
tributions may have several local maxima, there is usually a single global
maximum that is the single most probable value of the random variable.
In the example with the alpha particle measurements (see fig. 1.3), the mode is xmode = 2 counts, the single most probable number of counts.
Median
The median is only a little more complicated than the mode: it is the value
Q2 such that
P(x < Q2 ) = P(x > Q2 )
In other words, there is an equal probability of observing a value greater
than or less than the median.
The median is also the second quartile — hence the origin of the symbol Q2. Any distribution can be divided into four equal “pieces,” such that:

P(x < Q1) = P(Q1 < x < Q2) = P(Q2 < x < Q3) = P(x > Q3)

A distribution is sometimes split up ten ways, into deciles. The median is the fifth decile, D5.
The boundaries Q1 , Q2 (i.e., the median), and Q3 are the quartiles of the
probability distribution.
Mean
The atomic masses listed in the periodic table are weighted averages of isotope masses; the weights are determined by the relative abundances of the isotopes.
In general, a weighted sum is represented by the expression
weighted sum = ∑_i w_i x_i    (1.4)
For a continuous variable, the mean is obtained by integration rather than summation:

µx = ∫_{−∞}^{+∞} x p(x) dx

For example, the mean of the Poisson distribution (eqn. 1.2) is µx = λ · t.
• If you choose the mode, you are essentially betting on the single most
likely outcome of the experiment.
• If you choose the median, you are equally likely to be larger or smaller
than the outcome.
• If you choose the mean, you have the best chance of being closest to
the outcome.
Figure 1.5: Comparison of values of mean, median and mode for (a) positively and (b) negatively skewed probability distributions. For symmetrical distributions (so-called ‘bell-shaped’ curves) the three values are identical.
be near. In most applications, the mean gives the best single description of the location of the variable. The mean is the most common descriptor of the location of a random variable.

Just how different are the values of the mean, median and mode? It turns out that the three values are different only for asymmetric distributions, such as the two shown in figure 1.5. If a distribution is skewed to the right (or positively skewed; fig. 1.5(a)) then

mode < median < mean

while for distributions skewed to the left (negatively skewed; fig. 1.5(b))

mean < median < mode
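A positively skewed distribution with simple closed forms makes this ordering concrete. The lognormal distribution below is our own illustration, not an example from the text; its mode, median and mean have the standard closed-form expressions used here:

```python
from math import exp

# A lognormal distribution with parameters mu and sigma is positively skewed.
mu, sigma = 0.0, 0.75
mode   = exp(mu - sigma**2)        # peak of the density
median = exp(mu)                   # Q2
mean   = exp(mu + sigma**2 / 2)    # expected value

print(mode, median, mean)          # mode < median < mean, as in fig. 1.5(a)
```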
Figure 1.6: Comparing the variability of two random variables. The variable described by the broader probability distribution (dotted line) is more likely to be farther from the mean than the other variable.
(Semi-)Interquartile Range
The interquartile range, IQR, is the difference between the first and third
quartiles (see figure 1.7):
IQR = Q3 − Q1
The semi-interquartile range, QR, is half of the interquartile range:

QR = (Q3 − Q1) / 2    (1.6)
The mean absolute deviation is the expected value of |x − µx|.

Since the dispersion describes the spread of a random variable about its mean, it makes sense to have a quantitative descriptor of this quantity. The mean absolute deviation, MD, is exactly what it sounds like: the expected value (i.e., the mean) of the absolute deviation of a variable from its mean value, µx.
MD ≡ E (| x − µx |)
The concept behind the mean absolute deviation is quite simple: it indi-
cates the mean (‘typical’) distance of a variable from its own mean, µx . For
Figure 1.7: The interquartile range of a probability distribution spans the central 50% of values, between the quartiles Q1 and Q3.
a discrete variable,

MD = ∑_i |x_i − µx| p(x_i)    (1.7a)
The variance is the expected value of (x − µx)², and the standard deviation is the positive root of the variance.

Like the mean absolute deviation, the variance and standard deviation measure the dispersion of a random variable about its mean µx. The variance of a random variable x, σx², is the expected value of (x − µx)², which is the squared deviation of x from its mean value:

σx² ≡ E[(x − µx)²]
As you can see, the concept of the variance is very similar to that of the
mean absolute deviation. In fact, the variance is sometimes called the mean
squared deviation. The variance for discrete and continuous variables is
given by
σx² = ∑_i (x_i − µx)² p(x_i)    (1.8a)

σx² = ∫_{−∞}^{+∞} (x − µx)² p(x) dx    (1.8b)
Look at the discrete variable (eqn. 1.8a): we have another weighted sum!
The values being summed, (xi − µx )2 , are the squared deviations of the
variable from the mean. The squared deviations indicate how far the value
xi is from the mean value µx , and the weights in the sum, as in eqn. 1.5, are
the probabilities of xi . Thus, “broader” probability distributions will tend
to have larger weights for values of x that have larger squared deviations
(x − µx )2 (and hence are more distant from the mean). Such distributions
will give larger values for the variance, σx2 . Higher variance signifies greater
variability of a random variable.
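As an illustration (ours, not from the original text), the weighted sums of eqns. 1.5, 1.7a and 1.8a can be evaluated directly in Python for the Poisson counts of figure 1.3:

```python
from math import exp, factorial, sqrt

lam_t = 2.5                                     # expected count, lambda*t
xs = range(50)                                  # enough terms for the sums to converge
p  = [exp(-lam_t) * lam_t**x / factorial(x) for x in xs]

mean     = sum(x * px for x, px in zip(xs, p))              # eqn 1.5
mad      = sum(abs(x - mean) * px for x, px in zip(xs, p))  # eqn 1.7a
variance = sum((x - mean)**2 * px for x, px in zip(xs, p))  # eqn 1.8a
std      = sqrt(variance)

print(mean, variance, std, mad)   # a Poisson variable has mean = variance = lambda*t
```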
One problem with using the variance to describe the dispersion of a
random variable is that the units of variance are the squared units of the
original variable. For example, if x is a length measured in m, then σx² has units of m². The standard deviation, σx, has the same units as x, and so is a little more convenient at times. The standard deviation is simply the positive square root of the variance:

σx = +√(σx²)
Sometimes the variability of a random variable is specified by the relative standard deviation, RSD, an alternate way to present the standard deviation:

RSD = σx / µx   or   RSD = σx / x̄
Both of these expressions are commonly used to calculate RSD; which one
is used is usually obvious from the context. The RSD can be expressed as a
fraction or as a percentage. The RSD is sometimes called the coefficient of
variation (CV).
The standard deviation, σx, is the most common measure of dispersion.

We have described three common ways to measure a random variable’s dispersion: the semi-interquartile range, QR, the mean absolute deviation, MD, and the standard deviation, σ. These measures are all related to each other, so,
in a sense, it makes no difference which we use. In fact, for distributions
that are only moderately skewed, MD ≈ 0.8σ and QR ≈ 0.67σ . For a
variety of reasons (which are beyond the scope of this text), the variance
and standard deviation are the best measures of dispersion of a random
variable.
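For a normally distributed variable these constants can be computed exactly; MD = σ√(2/π) ≈ 0.798σ. The sketch below (ours, assuming scipy is available) checks both relationships for a standard normal variable:

```python
from math import sqrt, pi
from scipy.stats import norm

sigma = 1.0                                    # standard normal distribution
mad = sigma * sqrt(2.0 / pi)                   # MD = E(|x - mu|) for a normal variable
qr  = (norm.ppf(0.75) - norm.ppf(0.25)) / 2.0  # semi-interquartile range, eqn 1.6

print(mad / sigma, qr / sigma)                 # ~0.798 and ~0.674: MD ~ 0.8*sigma, QR ~ 0.67*sigma
```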
The function p(x) is called the probability density function of the continuous random variable x. Figure 1.8 demonstrates the general idea. The probability density function is used to determine probabilities of continuous random variables.

Just as the probabilities of a discrete variable must sum to one over the entire domain, the area under the probability density function within the range of possible values for x must be one. For example, if the domain ranges from −∞ to ∞, then
∫_{−∞}^{+∞} p(x) dx = 1
As in the discrete case, the value of the function p(x) must be posi-
tive over its entire range. The probability density function allows us to
construct a probability distribution for continuous variables; indeed, some-
times it is called simply a “distribution function,” as with discrete variables.
However, evaluation of the probability density function for a particular
value x0 does not yield the probability that x = x0 — that probability is
zero, after all — as it would for a discrete distribution function.
Probability distributions are thus a little more complicated for continu-
ous variables than for discrete variables. The main difference is that prob-
abilities of continuous variables are defined in terms of ranges of values,
rather than single values. The probability density function, p(x) (if one
exists) can be used to determine these probabilities.
Figure 1.8: The probability that a continuous random variable lies between two values x1 and x2 is the area under the probability density function between x1 and x2.
Example 1.1
Johnny Driver is a conscientious driver; on a freeway with a posted speed
limit of 65 mph, he tries to maintain a constant speed of 60 mph. How-
ever, the car speed fluctuates during moments of inattention. Assuming
that car speed follows a normal distribution with a mean µx = 60 mph
and standard deviation σx = 3 mph, what is the probability that Johnny
is exceeding the speed limit at any time?
Figure 1.10 shows a sketch of the situation. The car speed is a random
variable that is normally distributed with µx = 60 mph and σx = 3 mph.
Figure 1.9: Plot of the Gaussian (“normal”) probability distribution with µ = 50 and
σ = 10. Note that most of the area under the curve is within 3σ of the mean.
The probability that the speed exceeds the limit is the area under the normal probability density function beyond 65 mph:

P(x > 65) = ∫_{65}^{+∞} (1/(σx√(2π))) e^{−(x−µx)²/(2σx²)} dx

where µx and σx are given the appropriate values. When this integral is evaluated, a value of 0.0478 is obtained. Thus, there is a 4.78% probability that Johnny is speeding.
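The integration can be reproduced numerically; the following sketch (ours, using scipy for the quadrature) integrates the normal probability density function of example 1.1:

```python
from math import exp, sqrt, pi
from scipy.integrate import quad

mu, sigma = 60.0, 3.0                            # mph, from example 1.1

def normal_pdf(x: float) -> float:
    """Normal probability density with mean mu and standard deviation sigma."""
    return exp(-(x - mu) ** 2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

prob, _ = quad(normal_pdf, 65.0, float("inf"))   # area beyond the speed limit
print(prob)                                      # ~0.0478
```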
Figure 1.10: Sketch of distribution of the random variable in example 1.1. The
area under the curve is the value we want: P(x > 65) = 0.0478
Example 1.2
Let’s say we set up an experiment such that the outcome is described by
a normal distribution with µx = 25.0 and σx = 2.0. A single measurement
yields x0 = 26.4; what is the z-score of this measurement?

The z-score is the deviation from the mean, expressed in units of the standard deviation:

z0 = (x0 − µx)/σx = (26.4 − 25.0)/2.0 = 0.70

The next example will show how we can use the z-tables to calculate probabilities of normally-distributed variables.
Example 1.3
In example 1.1 we determined by integration the probability that a car
of variable speed was exceeding the speed limit (65 mph); the mean and
standard deviation of the car speed were 60 mph and 3 mph, respec-
tively. Now solve this problem using z-tables.
We must find P(x > x0), where x0 = 65 mph. The z-score of x0 is

z0 = (x0 − µx)/σx = (65 − 60)/3 ≈ 1.67
Now we can use the z-table to find the area in the ‘right tail’ of the z-distribution. From the z-table, we see that

P(x > 65) = P(z > 1.67) = 0.0475

This answer agrees (more or less) with our previous value, 0.0478 (see example 1.1). The slight difference is due to the fact that 5/3 does not exactly equal 1.67.
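Software can play the role of the z-table. In scipy (an assumption on our part; any statistics library offers the equivalent), the right-tail area is the survival function:

```python
from scipy.stats import norm

z0 = (65.0 - 60.0) / 3.0    # the exact z-score, 5/3
print(norm.sf(z0))          # right-tail area P(z > z0), ~0.0478
print(norm.sf(1.67))        # with the rounded z-score, ~0.0475 (the z-table value)
```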
You should become very familiar with the concepts presented in this section.

The Appendix presents a number of useful statistical tables, including one for the standard normal distribution (i.e., a z-table). Since the normal dis-
tribution is symmetric, there is no need to list the areas corresponding to
both negative and positive z-score values, so most tables only present half
of the information. The z-table given in this book lists right-tail areas asso-
ciated with positive z-scores. In order to calculate the areas corresponding
to various ranges of normally-distributed variables, using only right-tail ar-
eas, a few important relationships should be learned.
P(z < −z0) = P(z > z0)    (1.16)

This expression allows one to calculate left-tail areas from right-tail areas, and vice versa; it follows from the symmetric nature of the normal probability distribution.
Example 1.4
A soft-drink machine is regulated so that it discharges an average vol-
ume of 200. mL per cup. If the volume of drink discharged is normally
distributed with a standard deviation of 15 mL,
(a) what fraction of the cups will contain more than 224 mL of soft
drink?
(b) what is the probability that a cup contains more than 175 mL?
(c) what is the probability that a cup contains between 191 and 209 mL?
(d) below what volume do we get the smallest 25% of the drinks?
(a) This problem is similar to previous ones: we must find a right-tail area P(x > x0), where x0 = 224 mL. To do so, we can use the z-tables if we first calculate z0, the z-score of x0.

P(x > x0) = P((x − µx)/σx > (x0 − µx)/σx)
          = P(z > (224 − 200)/15) = P(z > 1.6)
          = 0.0548

Looking in the z-tables yields the answer. There is a 5.48% probability that a 224 mL cup will overflow.
(b) In this case, the z-score is negative, so we must use eqn. 1.16 to find the probability using the z-tables in the Appendix.

P(x > 175 mL) = P(z > (175 − 200)/15)
              = P(z > −1.67) = 1 − P(z > 1.67)
              = 1 − 0.0475 = 0.9525
(c) We must find P(x1 < x < x2), where x1 = 191 mL and x2 = 209 mL. To do so using the z-tables, we must find the z-scores for both x1 and x2, and then use eqn. 1.17 to calculate the probability.

P(191 mL < x < 209 mL) = P((191 − 200)/15 < z < (209 − 200)/15)
                       = P(−0.6 < z < +0.6)
                       = 1 − P(z < −0.6) − P(z > +0.6)
                       = 1 − 2 · P(z > +0.6) = 1 − 2 · 0.2743
                       = 0.4514
So there is a 45.14% probability that a cup contains 191–209 mL.
(d) This question is a little different than the others. We must find a value x0 such that P(x < x0) = 0.25. In all of the previous examples, we began with a value (or a range of values) and then calculated a probability; now we are doing the reverse — we must calculate the value associated with a stated probability. In both cases we use the z-tables, but in slightly different ways.

To begin, from the z-tables we must find a value z0 such that P(z < z0) = 0.25. Looking in the z-tables, we see that P(z > 0.67) = 0.2514 and P(z > 0.68) = 0.2483; thus, it appears that a value of 0.675 will give a right-tailed area of approximately 0.25. Since we are looking for a left-tailed area, we can state that

P(z < −0.675) ≈ 0.25
Our next task is to translate this z-score into a volume; in other words, we
want to “de-standardize” the value z0 = −0.675 to obtain x0 , the volume
that corresponds to this z-score. From eqn. 1.12 on page 18, we may write
x0 = µx + z0 · σx
= 200 + (−0.675)(15) mL
= 189.9 mL
Thus, we have determined that the drink volume will be less than 189.9 mL
with probability 25%.
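All four parts of example 1.4 can be checked with a few library calls; the sketch below assumes scipy, with sf giving right-tail areas, cdf left-tail areas, and ppf the inverse (“de-standardizing”) lookup needed in part (d):

```python
from scipy.stats import norm

mu, sigma = 200.0, 15.0                                         # mL

print(norm.sf(224.0, mu, sigma))                                # (a) ~0.0548
print(norm.sf(175.0, mu, sigma))                                # (b) ~0.9525
print(norm.cdf(209.0, mu, sigma) - norm.cdf(191.0, mu, sigma))  # (c) ~0.4514
print(norm.ppf(0.25, mu, sigma))                                # (d) ~189.9 mL
```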
Example 1.5
The mean inside diameter of washers produced by a machine is 0.502 in,
and the standard deviation is 0.005 in. The purpose for which these
washers are intended allows a maximum tolerance in the diameter of
0.496–0.508 in; otherwise, the washers are considered defective. Deter-
mine the percentage of defective washers produced by the machine, as-
suming that the diameters are normally distributed.
We are looking for the probability that the washer diameter is either less than 0.496 in or greater than 0.508 in. In other words, we want to calculate the sum P(x < 0.496 in) + P(x > 0.508 in).

First we must calculate the z-scores of the two values x1 and x2, where x1 = 0.496 in and x2 = 0.508 in. Then we can use the z-table to determine the desired probability.

z1 = (x1 − µx)/σx = (0.496 − 0.502)/0.005 = −1.2

z2 = (x2 − µx)/σx = (0.508 − 0.502)/0.005 = 1.2
We can see that z1 = −z2; in other words, the two tails have the same area. Thus,

P(x < 0.496 in) + P(x > 0.508 in) = 2 · P(z > 1.2) = 2 · 0.1151 = 0.2302

Approximately 23% of the washers produced by the machine are defective.
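A two-line numerical check of this result (again assuming scipy is available):

```python
from scipy.stats import norm

mu, sigma = 0.502, 0.005                                    # inches
defective = norm.cdf(0.496, mu, sigma) + norm.sf(0.508, mu, sigma)
print(defective)                                            # ~0.230, about 23% defective
```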
Before leaving this section, consider the following characteristics of all ran-
dom variables that follow a normal distribution.
• Approximately two-thirds of the time the random variable will be within one standard deviation of the mean value; to be exact, P(µx − σx < x < µx + σx) = 0.6827.

• Approximately 95% of the time the variable will be within two standard deviations of the mean: P(µx − 2σx < x < µx + 2σx) = 0.9545.

Figure: The standard normal distribution: (a) about 68% of the area lies within one standard deviation of the mean; (b) about 95% lies within two standard deviations.
• Counting experiments are also common, and these often result in vari-
ables described by a Poisson distribution, which is also a discrete dis-
tribution.
Still, there are some situations that result in continuous variables that
cannot be described by a normal distribution. We will describe two other
continuous probability functions, but there are many more.
Let’s go back to counting experiments (see page 5). In this type of exper-
iment, we are interested in counting the number of events that occur in a
unit of time or space. However, let’s say we change things around, as in the
following examples.
• We may count the number of photons detected per unit time (a dis-
crete variable) or we may measure the time between detected photons
(a continuous variable).
• We may count the numbers of molecules that react per unit time (i.e.,
the reaction rate, a discrete variable) or we may be interested in the
time between reactions (a continuous variable).
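These paired views describe the same underlying random process: exponentially distributed waiting times between events go hand in hand with Poisson counts per interval. A short simulation (ours, assuming numpy) illustrates the connection for the count rate of figure 1.3:

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 2.5                                   # events per second (cf. fig. 1.3)

# Continuous view: times between events are exponentially distributed.
gaps = rng.exponential(scale=1.0 / rate, size=100_000)
print(gaps.mean())                           # ~0.4 s = 1/rate

# Discrete view: counts per one-second interval are Poisson distributed.
counts = rng.poisson(lam=rate, size=100_000)
print(counts.mean())                         # ~2.5 counts
```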
Figure: Probability density function for the time between events, plotted against time in seconds.
chances are that you can think of the process in terms of a counting exper-
iment. Examples of exponential decays are:
Atomic and molecular orbitals are probability density functions for the position of an electron in an atom or molecule. Such orbitals are sometimes called electron density functions. They allow us to determine the probability that the electron will be found in a given position relative to the nucleus. The different orbitals (e.g., 2s or 3px atomic orbitals) correspond to different probability density functions.

Atomic and molecular orbitals are simply probability distributions describing the position of electrons in atoms and molecules, respectively.
The electron density functions actually contain three random variables,
since they give the probability that an electron is at any given point in
space. As such they are really examples of joint probability distributions of
the three random variables corresponding to the coordinate axes (e.g., x, y
and z in a Cartesian coordinate system). For spherically-symmetric orbitals,
it is convenient to rewrite the joint probability distribution in terms of a
single variable r , which is the distance of the electron from the nucleus. For
the 1s orbital, this probability density function (called a radial distribution function) has the following form:

p(r) = k r² e^{−3r/µr}

This is the radial probability distribution function for atomic 1s orbitals.
Figure 1.13: Radial distribution function of the hydrogen 1s orbital. The dotted
line at the mode indicates the Bohr radius, a0 , of the orbital, where a0 = 52.9 pm.
The mean radial distance µr for this orbital is 79.4 pm.
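Both quoted distances can be recovered numerically from the radial distribution function. The sketch below (ours, assuming numpy and scipy are available) uses the equivalent form p(r) = k r² e^{−2r/a0}, which matches the expression above since µr = 3a0/2 for the 1s orbital:

```python
import numpy as np
from scipy.integrate import quad

A0 = 52.9                                    # Bohr radius, pm

def unnorm(r):
    """Unnormalized 1s radial distribution, r^2 * exp(-2r/a0)."""
    return r**2 * np.exp(-2.0 * r / A0)

k = 1.0 / quad(unnorm, 0.0, np.inf)[0]       # normalization constant

mean_r, _ = quad(lambda r: r * k * unnorm(r), 0.0, np.inf)
print(mean_r)                                # mean radial distance, ~79.4 pm

rs = np.linspace(0.0, 300.0, 30001)          # locate the mode on a fine grid
print(rs[np.argmax(unnorm(rs))])             # ~52.9 pm, the Bohr radius
```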