Lecture 8
Python
Probability
Distributions
1
What is a distribution?
• Describes the ‘shape’ of a batch of numbers
2
Why distributions?
• Can serve as a basis for standardized comparison of empirical
distributions
• Can help us estimate confidence intervals for inferential
statistics
• Form a basis for more advanced statistical methods
– ‘fit’ between observed distributions and certain theoretical
distributions is an assumption of many statistical procedures
3
Random variable
• A variable which contains the outcomes of a chance experiment
• “Quantifying the outcomes”
• Example: X = 1 for Heads, X = 0 for Tails
• A variable that can take on different values in the population
according to some “random” mechanism
• Discrete
– Distinct, countable values
– Example: year
• Continuous
– Example: mass
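A minimal Python sketch of the coin example above, assuming a fair coin. The function name `coin_toss_rv` and the fixed seed are illustrative choices, not part of the lecture. It shows a random variable as a rule that turns a chance outcome into a number.

```python
import random

def coin_toss_rv(rng):
    """Map the outcome of one coin toss to a number: 1 = Head, 0 = Tail."""
    outcome = rng.choice(["Head", "Tail"])  # assumption: a fair coin
    return 1 if outcome == "Head" else 0

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [coin_toss_rv(rng) for _ in range(10_000)]
share_heads = sum(samples) / len(samples)
print(share_heads)  # close to 0.5 for a fair coin
```

X is discrete here because it can only take the two distinct values 0 and 1.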
4
Probability Distributions
• The probability distribution function, or probability density function (PDF),
of a random variable X gives the values taken by that random variable
and their associated probabilities. For a discrete random variable it is
also called the probability mass function (pmf).
5
PDF of a Discrete r.v.
Number of Heads (X):   0    1    2    sum
PDF (P(X)):            ¼    ½    ¼    1
[Figure: The PDF of the Number of Heads in Two Tosses of a Coin. Bar chart of probability density (0.25, 0.5, 0.25) against number of heads (0, 1, 2).]
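The probabilities ¼, ½, ¼ for two tosses can be recovered by listing every outcome. The sketch below assumes a fair coin, so that the four outcomes HH, HT, TH, TT are equally likely; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses (assumption: a fair coin).
outcomes = list(product("HT", repeat=2))

# X = number of heads in each outcome.
counts = Counter(outcome.count("H") for outcome in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(x, p)
print("sum", sum(pmf.values()))
```

Using `Fraction` keeps the probabilities exact, so the total is exactly 1.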
6
Probability Distribution for the Random Variable X
A probability distribution for a discrete random variable X:
x          -8    -3    -1    0     1     4     6
P(X = x)   0.13  0.15  0.17  0.20  0.15  0.11  0.09
Find
a. $P(X \le 0) = 0.65$
b. $P(-3 \le X \le 1) = 0.67$
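Both answers come from adding the probabilities of the values that fall in the requested range. A short sketch, where `prob_between` is a hypothetical helper name:

```python
# Distribution from the table above: value x -> P(X = x).
pmf = {-8: 0.13, -3: 0.15, -1: 0.17, 0: 0.20, 1: 0.15, 4: 0.11, 6: 0.09}

def prob_between(pmf, low=float("-inf"), high=float("inf")):
    """Return P(low <= X <= high) by summing the matching probabilities."""
    return sum(p for x, p in pmf.items() if low <= x <= high)

p_a = prob_between(pmf, high=0)          # P(X <= 0)
p_b = prob_between(pmf, low=-3, high=1)  # P(-3 <= X <= 1)
print(round(p_a, 2), round(p_b, 2))      # 0.65 0.67
```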
7
Discrete Distribution -- Example
Distribution of Daily Crises
Number of Crises   Probability
0                  0.37
1                  0.31
2                  0.18
3                  0.09
4                  0.04
5                  0.01
[Figure: bar chart of Probability against Number of Crises]
8
Requirements for a Discrete Probability Function
• Probabilities are between 0 and 1, inclusively:
$0 \le P(X) \le 1$ for all X
• The probabilities sum to 1:
$\sum_{\text{over all } x} P(X) = 1$
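The two requirements can be checked mechanically. The sketch below uses a hypothetical helper `is_valid_pmf` and a small tolerance, since decimal probabilities rarely add to exactly 1 in floating point.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Check both requirements: each p in [0, 1] and the total equals 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Daily-crises distribution from the lecture.
crises = {0: 0.37, 1: 0.31, 2: 0.18, 3: 0.09, 4: 0.04, 5: 0.01}
print(is_valid_pmf(crises))            # True
print(is_valid_pmf({0: 0.5, 1: 0.6}))  # False: the total is 1.1
```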
9
Cumulative Distribution Function
• The CDF of a random variable X, denoted F(x), is a function, often drawn
as a graph, that associates each possible value x (or range of possible
values) with the probability $P(X \le x)$.
• CDFs always lie between 0 and 1, i.e., $0 \le F(x_i) \le 1$, where
$F(x_i)$ is the CDF evaluated at $x_i$.
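For a discrete random variable the CDF is a running total of the probabilities. A sketch using the daily-crises distribution from earlier in the lecture:

```python
from itertools import accumulate

crises = {0: 0.37, 1: 0.31, 2: 0.18, 3: 0.09, 4: 0.04, 5: 0.01}

xs = sorted(crises)
# F(x) = P(X <= x): running total of the probabilities in increasing x.
cdf = dict(zip(xs, accumulate(crises[x] for x in xs)))

for x in xs:
    print(x, round(cdf[x], 2))
```

The values never decrease and the last one is 1 (up to floating-point rounding), which matches the bounds stated above.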
10
The Expected Value of X
Let X be a discrete rv with set of possible values D and pmf p(x). The
expected value or mean value of X, denoted $E(X)$ or $\mu_X$, is
$E(X) = \mu_X = \sum_{x \in D} x \cdot p(x)$
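The sum translates directly into code. This sketch applies it to the number of heads in two tosses of a fair coin; `expected_value` is an illustrative name.

```python
def expected_value(pmf):
    """E(X) = sum of x * p(x) over all possible values x in D."""
    return sum(x * p for x, p in pmf.items())

two_tosses = {0: 0.25, 1: 0.50, 2: 0.25}  # number of heads in two tosses
print(expected_value(two_tosses))         # 1.0
```

On average, two tosses produce one head, even though any single experiment gives 0, 1, or 2.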
11
Mean and Variance of a Discrete Random
Variable
14
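The Variance and Standard Deviation
Let X have pmf p(x) and expected value $\mu$. The variance of X, denoted
$V(X)$ (or $\sigma_X^2$ or $\sigma^2$), is
$V(X) = \sum_{x \in D} (x - \mu)^2 \, p(x) = E[(X - \mu)^2]$
The standard deviation of X is $\sigma_X = \sqrt{\sigma_X^2}$.
Data: 22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18
Find the variance and standard deviation.
Value        12   18   20   22   24   25
Frequency    1    2    4    1    2    3
Probability  .08  .15  .31  .08  .15  .23
$\mu = 21$
$V(X) = p_1 (x_1 - \mu)^2 + p_2 (x_2 - \mu)^2 + \dots + p_n (x_n - \mu)^2$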
16
$V(X) = .08(12 - 21)^2 + .15(18 - 21)^2 + .31(20 - 21)^2 + .08(22 - 21)^2 + .15(24 - 21)^2 + .23(25 - 21)^2$
$V(X) = 13.25$
$\sigma = \sqrt{V(X)} = \sqrt{13.25} \approx 3.64$
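The worked example can be reproduced in a few lines. The sketch below computes the variance twice: once with the two-decimal probabilities from the table, which gives the slide's 13.25, and once with the exact relative frequencies f/13, which gives a slightly smaller value because no rounding is involved. Variable names are illustrative.

```python
from collections import Counter
from math import sqrt

data = [22, 25, 20, 18, 12, 20, 24, 20, 20, 25, 24, 25, 18]
freq = Counter(data)
n = len(data)

mu = sum(data) / n  # 21.0

# Variance with exact relative frequencies f/n.
var_exact = sum((f / n) * (x - mu) ** 2 for x, f in freq.items())

# Variance with the two-decimal probabilities shown in the table.
rounded = {12: .08, 18: .15, 20: .31, 22: .08, 24: .15, 25: .23}
var_rounded = sum(p * (x - 21) ** 2 for x, p in rounded.items())

print(round(var_rounded, 2), round(sqrt(var_rounded), 2))  # 13.25 3.64
print(round(var_exact, 2), round(sqrt(var_exact), 2))      # 13.08 3.62
```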
17
Shortcut Formula for Variance
$V(X) = \sigma^2 = \left[ \sum_{x \in D} x^2 \cdot p(x) \right] - \mu^2 = E(X^2) - [E(X)]^2$
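A quick numerical check that the shortcut agrees with the definition of variance, using the daily-crises distribution from earlier in the lecture as sample input:

```python
import math

crises = {0: 0.37, 1: 0.31, 2: 0.18, 3: 0.09, 4: 0.04, 5: 0.01}

mu = sum(x * p for x, p in crises.items())  # E(X)

# Definition: sum of (x - mu)^2 * p(x).
var_definition = sum((x - mu) ** 2 * p for x, p in crises.items())

# Shortcut: E(X^2) - [E(X)]^2.
var_shortcut = sum(x ** 2 * p for x, p in crises.items()) - mu ** 2

print(round(var_definition, 4), round(var_shortcut, 4))
print(math.isclose(var_definition, var_shortcut))
```

The shortcut needs only the two sums E(X) and E(X²), so no deviations from the mean have to be tabulated.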
18
Mean of a Discrete Distribution
$\mu = E(X) = \sum X \cdot P(X)$
X     P(X)   X·P(X)
-1    .1     -.1
0     .2     .0
1     .4     .4
2     .2     .4
3     .1     .3
             1.0
$\mu = 1.0$
19
Variance and Standard Deviation of a Discrete Distribution
$\sigma^2 = \sum (X - \mu)^2 \cdot P(X) = 1.2$
$\sigma = \sqrt{\sigma^2} = \sqrt{1.2} \approx 1.10$
X     P(X)   X - μ   (X - μ)²   (X - μ)²·P(X)
-1    .1     -2      4          .4
0     .2     -1      1          .2
1     .4     0       0          .0
2     .2     1       1          .2
3     .1     2       4          .4
                                1.2
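The working table for this distribution can be generated column by column. This is a sketch with illustrative variable names; the mean is rounded only to hide floating-point noise in the printed deviations.

```python
from math import sqrt

pmf = {-1: 0.1, 0: 0.2, 1: 0.4, 2: 0.2, 3: 0.1}

# Mean of the distribution (rounded to suppress float noise).
mu = round(sum(x * p for x, p in pmf.items()), 10)

print(f"{'X':>3} {'P(X)':>5} {'X-mu':>5} {'(X-mu)^2':>9} {'(X-mu)^2*P(X)':>14}")
variance = 0.0
for x, p in pmf.items():
    dev = x - mu           # deviation from the mean
    term = dev ** 2 * p    # this row's contribution to the variance
    variance += term
    print(f"{x:>3} {p:>5.1f} {dev:>5.1f} {dev ** 2:>9.1f} {term:>14.1f}")

sigma = sqrt(variance)
print(f"variance = {variance:.1f}, standard deviation = {sigma:.2f}")
```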
20
Mean of the Data Example
$\mu = E(X) = \sum X \cdot P(X) = 1.15$
X     P(X)   X·P(X)
0     .37    .00
1     .31    .31
2     .18    .36
3     .09    .27
4     .04    .16
5     .01    .05
             1.15
[Figure: bar chart of Probability against Number of Crises]
21
Properties of Expected Value
1. $E(b) = b$, b is a constant.
2. $E(X + Y) = E(X) + E(Y)$.
3. $E(X / Y) \ne E(X) / E(Y)$.
4. $E(XY) \ne E(X)E(Y)$ unless they are independent.
5. $E(aX) = aE(X)$, a is a constant.
6. $E(aX + b) = aE(X) + b$, a and b are constants.
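These properties can be checked numerically on a small example. The sketch below assumes two independent discrete random variables with made-up distributions `px` and `py`; because they are independent, property 4 holds with equality here.

```python
from itertools import product
import math

# Two independent discrete random variables (illustrative values).
px = {0: 0.5, 1: 0.5}          # X: one fair coin toss
py = {1: 0.2, 2: 0.5, 3: 0.3}  # Y: hypothetical distribution

# Independence: the joint probability is the product of the marginals.
joint = {(x, y): px[x] * py[y] for x, y in product(px, py)}

def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

ex, ey = expect(lambda x, y: x), expect(lambda x, y: y)
a, b = 3, 4

assert math.isclose(expect(lambda x, y: b), b)                   # property 1
assert math.isclose(expect(lambda x, y: x + y), ex + ey)         # property 2
assert not math.isclose(expect(lambda x, y: x / y), ex / ey)     # property 3
assert math.isclose(expect(lambda x, y: x * y), ex * ey)         # property 4
assert math.isclose(expect(lambda x, y: a * x + b), a * ex + b)  # properties 5-6
print("all checks passed")
```

Property 2 does not need independence; it would still hold if `joint` were replaced by a dependent distribution.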
22
Properties of Variance
1. Var(constant) = 0
2. If X and Y are two independent random variables, then
Var(X + Y) = Var(X) + Var(Y) and
Var(X - Y) = Var(X) + Var(Y)
3. If b is a constant then Var(b + X) = Var(X)
4. If a is a constant then Var(aX) = a²Var(X)
5. If a and b are constants then Var(aX + b) = a²Var(X)
6. If X and Y are two independent random variables and a and b are
constants then Var(aX + bY) = a²Var(X) + b²Var(Y)
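A numerical check of the variance properties, again under the assumption of two independent random variables with made-up distributions. The helper `var` is illustrative and uses the shortcut formula.

```python
from itertools import product
import math

px = {0: 0.5, 1: 0.5}          # X: one fair coin toss
py = {1: 0.2, 2: 0.5, 3: 0.3}  # Y: hypothetical distribution
joint = {(x, y): px[x] * py[y] for x, y in product(px, py)}  # independent

def var(g):
    """Var[g(X, Y)] = E[g^2] - (E[g])^2 under the joint distribution."""
    mean = sum(g(x, y) * p for (x, y), p in joint.items())
    return sum(g(x, y) ** 2 * p for (x, y), p in joint.items()) - mean ** 2

def close(u, v):
    return math.isclose(u, v, abs_tol=1e-9)

vx, vy = var(lambda x, y: x), var(lambda x, y: y)
a, b = 3, 4

assert close(var(lambda x, y: b), 0)                               # property 1
assert close(var(lambda x, y: x + y), vx + vy)                     # property 2
assert close(var(lambda x, y: x - y), vx + vy)                     # property 2
assert close(var(lambda x, y: b + x), vx)                          # property 3
assert close(var(lambda x, y: a * x), a ** 2 * vx)                 # property 4
assert close(var(lambda x, y: a * x + b), a ** 2 * vx)             # property 5
assert close(var(lambda x, y: a * x + b * y), a**2 * vx + b**2 * vy)  # property 6
print("all checks passed")
```

Note that the variance of a difference is still a sum: subtracting Y adds its spread rather than removing it.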
23
Covariance
Covariance: For two discrete random variables X and Y with $E(X) = \mu_x$
and $E(Y) = \mu_y$, the covariance between X and Y is defined as
$\mathrm{Cov}(X, Y) = \sigma_{xy} = E[(X - \mu_x)(Y - \mu_y)] = E(XY) - \mu_x \mu_y$.
24
Covariance
• In general, the covariance between two random variables can be
positive or negative.
• If two random variables move in the same direction, the covariance
will be positive; if they move in opposite directions, the covariance
will be negative.
Properties:
1. If X and Y are independent random variables, their covariance
is zero, since E(XY) = E(X)E(Y).
2. Cov(X, X) = Var(X)
3. Cov(Y, Y) = Var(Y)
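A sketch of the covariance calculation on a hypothetical joint distribution in which X and Y tend to take the same value, so the covariance should come out positive. It also checks that the two forms of the definition agree and that Cov(X, X) equals Var(X).

```python
# Hypothetical joint distribution of (X, Y); probabilities sum to 1.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """E[g(X, Y)] under the joint distribution."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)

cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y
var_x = expect(lambda x, y: (x - mu_x) ** 2)
cov_xx = expect(lambda x, y: (x - mu_x) * (x - mu_x))  # Cov(X, X)

print(round(cov_definition, 2), round(cov_shortcut, 2))  # 0.15 0.15
print(round(cov_xx, 2), round(var_x, 2))                 # 0.25 0.25
```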
25
Correlation Coefficient
• The covariance gives the sign of the relationship, but its magnitude
does not show how strongly the variables are positively or negatively
related. The correlation coefficient provides such a measure of how
strongly the variables are related to each other.
• For two random variables X and Y with $E(X) = \mu_x$ and $E(Y) = \mu_y$,
the correlation coefficient is defined as
$\rho_{xy} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_x \sigma_y} = \dfrac{\sigma_{xy}}{\sigma_x \sigma_y}$
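The sketch below illustrates why the correlation coefficient is the better measure of strength: rescaling X by 10 changes the covariance but leaves the correlation unchanged. The joint distribution and the helper name `cov_and_rho` are hypothetical.

```python
from math import sqrt

# Hypothetical joint distribution of (X, Y); probabilities sum to 1.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def cov_and_rho(joint):
    """Return (covariance, correlation coefficient) of a joint pmf."""
    def expect(g):
        return sum(g(x, y) * p for (x, y), p in joint.items())
    mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
    cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))
    sigma_x = sqrt(expect(lambda x, y: (x - mu_x) ** 2))
    sigma_y = sqrt(expect(lambda x, y: (y - mu_y) ** 2))
    return cov, cov / (sigma_x * sigma_y)

cov, rho = cov_and_rho(joint)

# Same relationship, but X measured in units ten times smaller.
scaled = {(10 * x, y): p for (x, y), p in joint.items()}
cov_scaled, rho_scaled = cov_and_rho(scaled)

print(round(cov, 2), round(rho, 2))                # 0.15 0.6
print(round(cov_scaled, 2), round(rho_scaled, 2))  # 1.5 0.6
```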
26