
W7_Random Variables and Probability Distribution

This document covers the concepts of random variables and probability distributions in statistics, focusing on calculating expected values and variances for discrete random variables. It explains the differences between discrete and continuous random variables, introduces various probability distributions such as uniform, geometric, and binomial distributions, and illustrates how to compute expected values and variances when adding or subtracting random variables. Additionally, it provides examples related to insurance payouts and investor interest probabilities.

Uploaded by

J.C.
Copyright: © All Rights Reserved

STAT 4001

STATISTICS I FOR ANALYTICS


WEEK 7 - RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS

Maryam Zangiabadi
Ch. 9: Random Variables and Probability Distributions
Learning Objectives
1) Calculate the expected value and variance of a discrete random variable
2) Analyze the effect of adding and subtracting random variables
3) Model discrete random variables

Expected Value of a Random Variable
A variable whose value is based on the outcome of a random event is called a random
variable.
If we can list all possible outcomes, the random variable is called a discrete random
variable.
Example: An inspector examines the light bulbs in a box of 6. The outcome of
this experiment is the number of defective light bulbs in the box. The possible
outcomes are x = {0, 1, 2, 3, 4, 5, 6}. The variable x is a discrete random variable.
If a random variable can take on any value in an interval between two values, it is
called a continuous random variable.
Example: The exact time it takes a bus to complete its route between two towns
may be any value from 20 minutes to 30 minutes. The variable x is a continuous random
variable.
Expected Value of a Random Variable
For both discrete and continuous random variables, the set of all the possible values
and their associated probabilities is called the probability model.
When the probability model is known, the expected value can be calculated:

\[ E(X) = \sum x \cdot P(x) \quad \text{(discrete random variable)} \]

E(X) is sometimes written as \(\mu\) or as EV.

Expected Value of a Random Variable
Example: The probability model for a particular life insurance policy is shown in the
table below. Find the expected annual payout on a policy.

Policyholder Outcome    Payout x (cost)    Probability P(X = x)
Death                   100,000            1/1000
Disability              50,000             2/1000
Neither                 0                  997/1000
Table - Probability model for an insurance policy.

\[ E(X) = \$100{,}000\left(\frac{1}{1000}\right) + \$50{,}000\left(\frac{2}{1000}\right) + \$0\left(\frac{997}{1000}\right) = \$200 \]

We expect that the insurance company will pay out $200 per policy per year.
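
The expected-value sum for the insurance table can be checked with a few lines of Python. This is a minimal sketch; the names `model` and `expected_value` are illustrative and do not come from the slides.

```python
# Probability model for the insurance policy: (payout in dollars, probability).
model = [
    (100_000, 1 / 1000),  # death
    (50_000, 2 / 1000),   # disability
    (0, 997 / 1000),      # neither
]

def expected_value(pmf):
    """Return E(X), the sum of x * P(x) over all listed outcomes."""
    return sum(x * p for x, p in pmf)

ev = expected_value(model)
print(f"E(X) = ${ev:,.2f}")
```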
Standard Deviation and Variance of a Random Variable
The standard deviation also measures the dispersion or spread in the values of a
random variable.

Variance of a discrete probability distribution:

\[ \mathrm{Var}(X) = \sum (x - E(X))^2 P(x) \]

Standard deviation of a discrete probability distribution:

\[ \mathrm{SD}(X) = \sqrt{\sum (x - E(X))^2 P(x)} \]

Where,
E(X) = Expected value of x
x = Values of the discrete random variable
P(x) = Probability of the random variable taking on the value x

Standard Deviation and Variance of a Random Variable
Example: The probability model for a particular life insurance policy is shown. Find the
variance and standard deviation of the annual payout.

Policyholder Outcome    Payout x (cost)    Probability P(X = x)    Deviation (x − E(X))
Death                   100,000            1/1000                  (100,000 − 200) = 99,800
Disability              50,000             2/1000                  (50,000 − 200) = 49,800
Neither                 0                  997/1000                (0 − 200) = −200

\[ \mathrm{Var}(X) = \sum (x - E(X))^2 P(x) \]

\[ \mathrm{Var}(X) = 99{,}800^2\left(\frac{1}{1000}\right) + 49{,}800^2\left(\frac{2}{1000}\right) + (-200)^2\left(\frac{997}{1000}\right) = 14{,}960{,}000 \]

\[ \mathrm{SD}(X) = \sqrt{14{,}960{,}000} \approx \$3867.82 \]
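
The variance and standard deviation of the payout can be reproduced numerically. The sketch below uses the same three-row model; the helper name `variance` is an assumption for illustration.

```python
import math

# (payout in dollars, probability) for death, disability, neither.
model = [(100_000, 1 / 1000), (50_000, 2 / 1000), (0, 997 / 1000)]

def variance(pmf):
    """Return Var(X), the sum of (x - E(X))^2 * P(x)."""
    mean = sum(x * p for x, p in pmf)
    return sum((x - mean) ** 2 * p for x, p in pmf)

var = variance(model)
sd = math.sqrt(var)
print(f"Var(X) = {var:,.0f}, SD(X) = ${sd:,.2f}")
```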

Standard Deviation and Variance of a Random Variable
Example: Consider the distribution of the number of children per household in a city, shown
below. Find the standard deviation of the number of children per household.

x        P(x)    x·P(x)    x − E(X)    (x − E(X))^2    (x − E(X))^2·P(x)
0        0.20    0.00      −1.6        2.56            0.512
1        0.25    0.25      −0.6        0.36            0.090
2        0.35    0.70      0.4         0.16            0.056
3        0.15    0.45      1.4         1.96            0.294
4        0.05    0.20      2.4         5.76            0.288
Total    1       1.6                                   1.24

The x·P(x) column is used for finding EV, and its total is E(X) = 1.6.
The x − E(X) column holds the deviations: the difference between each outcome x and EV.
The last column helps measure the "spread" of values around the mean, and its total is Var(X) = 1.24.

\[ \mathrm{Var}(X) = \sum (x - E(X))^2 P(x) = 1.24, \qquad \mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} = \sqrt{1.24} \approx 1.11 \]
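
The same table can be built programmatically, one row at a time. This is a sketch under the assumption that the model is stored as a dictionary named `pmf` (a name chosen here, not in the slides).

```python
import math

# Number of children per household -> probability, as given in the table.
pmf = {0: 0.20, 1: 0.25, 2: 0.35, 3: 0.15, 4: 0.05}

mean = sum(x * p for x, p in pmf.items())  # total of the x*P(x) column
rows = [(x, p, x * p, x - mean, (x - mean) ** 2, (x - mean) ** 2 * p)
        for x, p in pmf.items()]
var = sum(row[-1] for row in rows)         # total of the last column
sd = math.sqrt(var)

for row in rows:
    print("  ".join(f"{value:8.3f}" for value in row))
print(f"E(X) = {mean:.2f}, Var(X) = {var:.2f}, SD(X) = {sd:.2f}")
```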


Adding and Subtracting Random Variables
Adding a constant c to X:

\[ E(X \pm c) = E(X) \pm c, \qquad \mathrm{Var}(X \pm c) = \mathrm{Var}(X), \qquad \mathrm{SD}(X \pm c) = \mathrm{SD}(X) \]

Multiplying X by a constant a:

\[ E(aX) = aE(X), \qquad \mathrm{Var}(aX) = a^2\,\mathrm{Var}(X), \qquad \mathrm{SD}(aX) = |a|\,\mathrm{SD}(X) \]

Variance depends on the square of the deviations from the mean. Since variance is based on
squared differences, any scaling effect is also squared.
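
These shift and scale rules can be verified on any small model. The sketch below reuses the children-per-household probabilities from the earlier example. The helper `transform` and the constants 10 and −3 are illustrative assumptions.

```python
import math

pmf = {0: 0.20, 1: 0.25, 2: 0.35, 3: 0.15, 4: 0.05}

def mean(pmf):
    return sum(x * p for x, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((x - m) ** 2 * p for x, p in pmf.items())

def transform(pmf, a=1.0, c=0.0):
    """Return the model of aX + c: outcomes change, probabilities do not."""
    return {a * x + c: p for x, p in pmf.items()}

shifted = transform(pmf, c=10)  # model of X + 10
scaled = transform(pmf, a=-3)   # model of -3X

print(mean(shifted), var(shifted))            # mean moves by 10, variance unchanged
print(mean(scaled), var(scaled))              # mean times -3, variance times 9
print(math.sqrt(var(scaled)) / math.sqrt(var(pmf)))  # ratio of SDs is |a| = 3
```

A negative multiplier is used on purpose: it shows why the standard deviation rule needs the absolute value of a.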

Adding and Subtracting Random Variables
This insurance company sells policies to more than just one person. We’ve just seen how to
compute means and variances for one person at a time. What happens to the mean and variance
when we have a collection of customers? The profit on a group of customers is the sum of the
individual profits, so we need to find expected values and variances for sums.

Expected Value when Adding or Subtracting Random Variables

\[ E(X \pm Y) = E(X) \pm E(Y) \]

Variances when Adding or Subtracting (independent) Random Variables

\[ \mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \quad \text{if } X \text{ and } Y \text{ are independent} \]

Note: we always add the variances (even when subtracting the random variables).
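
One way to see why variances add even for a difference is to build the exact distribution of X − Y for two independent variables and compute its variance directly. The two small models below are hypothetical, and `combine` is an illustrative helper that multiplies probabilities, which is only valid under independence.

```python
from itertools import product

pmf_x = {0: 0.20, 1: 0.25, 2: 0.35, 3: 0.15, 4: 0.05}  # hypothetical X
pmf_y = {0: 0.50, 1: 0.30, 2: 0.20}                    # hypothetical Y

def mean(pmf):
    return sum(v * p for v, p in pmf.items())

def var(pmf):
    m = mean(pmf)
    return sum((v - m) ** 2 * p for v, p in pmf.items())

def combine(pmf_a, pmf_b, op):
    """Exact model of op(A, B), assuming A and B are independent."""
    out = {}
    for (a, pa), (b, pb) in product(pmf_a.items(), pmf_b.items()):
        z = op(a, b)
        out[z] = out.get(z, 0.0) + pa * pb
    return out

total = combine(pmf_x, pmf_y, lambda a, b: a + b)
diff = combine(pmf_x, pmf_y, lambda a, b: a - b)

print(mean(total), mean(diff))  # E(X) + E(Y) and E(X) - E(Y)
print(var(total), var(diff))    # both equal Var(X) + Var(Y)
```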

Adding and Subtracting Random Variables
Illustration: The expected annual payout per insurance policy is $200 and the variance is
14,960,000 (in squared dollars). If the payout amounts are doubled, what are the new expected
value and variance?

\[ E(2X) = 2E(X) = 2 \times 200 = \$400 \]

\[ \mathrm{Var}(2X) = 2^2\,\mathrm{Var}(X) = 4 \times 14{,}960{,}000 = 59{,}840{,}000 \]

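Adding and Subtracting Random Variables
Illustration: The expected annual payouts and standard deviations for insurance policies X and Y
are as follows: E(X) = 10, E(Y) = 15, SD(X) = 2, SD(Y) = 3.
Find:

\[ E(X + 5) = E(X) + 5 = 10 + 5 = 15 \]

\[ E(Y + 10) = E(Y) + 10 = 15 + 10 = 25 \]

\[ \mathrm{SD}(5X) = |5|\,\mathrm{SD}(X) = 5 \times 2 = 10 \]

Assuming X and Y are independent:

\[ \mathrm{SD}(X + Y) = \sqrt{\mathrm{Var}(X) + \mathrm{Var}(Y)} = \sqrt{\mathrm{SD}(X)^2 + \mathrm{SD}(Y)^2} = \sqrt{2^2 + 3^2} = \sqrt{13} \approx 3.61 \]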
Introduction to Discrete Probability Distributions
The Uniform Distribution
If X is a random variable with possible outcomes 1, 2, …, n and

\[ P(X = i) = \frac{1}{n} \quad \text{for each } i, \]

then we say X has a discrete Uniform distribution U[1, …, n].
Example: Tossing a fair die is described by the Uniform model U[1, 2, 3, 4, 5, 6], with

\[ P(X = i) = \frac{1}{6}. \]


Introduction to Discrete Probability Distributions
Definition: A Bernoulli Trial is a trial with the following characteristics:
1) There are only two possible outcomes (success and failure) for each trial.
2) The probability of success, denoted p, is the same for each trial. The probability of
failure is q = 1 − p.
3) The trials are independent.

The next two probability models apply to experiments with Bernoulli trials.


The Geometric Distribution

The Geometric Distribution: Predicting the number of Bernoulli trials required to
achieve the first success.

Geometric Probability Model for Bernoulli Trials

p = Probability of success (and q = 1 − p = probability of failure)
X = Number of trials until the first success occurs

\[ P(X = x) = q^{x-1} p \]

Expected value: \( \mu = \dfrac{1}{p} \)
Standard deviation: \( \sigma = \sqrt{\dfrac{q}{p^2}} \)
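
The geometric formulas can be sanity-checked by summing the probability model over a long range of x values. The sketch below does this for a hypothetical p = 0.25; the function names are illustrative.

```python
import math

def geometric_pmf(x, p):
    """P(X = x): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

def geometric_mean(p):
    return 1 / p

def geometric_sd(p):
    return math.sqrt((1 - p) / p**2)

p = 0.25  # hypothetical success probability
xs = range(1, 1000)  # the tail beyond this range is negligible for p = 0.25
approx_mean = sum(x * geometric_pmf(x, p) for x in xs)
approx_var = sum((x - approx_mean) ** 2 * geometric_pmf(x, p) for x in xs)

print(approx_mean, geometric_mean(p))          # both close to 1/p
print(math.sqrt(approx_var), geometric_sd(p))  # both close to sqrt(q)/p
```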

The Geometric Distribution


Example: A venture capital firm has a list of potential investors who have previously invested
in new technologies. On average, these investors invest in about 5% of the opportunities
presented to them. A new client of the firm is interested in finding investors for a 3-D
printing technology. An analyst at the firm starts calling potential investors.
1. How many investors will she have to call, on average, to find someone interested?
2. What is the probability that the number of calls she needs to make before finding
someone interested is 7?
Solution: The probability of finding an interested investor is 𝑝 = 0.05.
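1. Let X = number of people she calls until she finds someone interested. Then

\[ E(X) = \frac{1}{p} = \frac{1}{0.05} = 20 \text{ calls.} \]

2. Taking X = 7 to mean that the seventh call is the first success,

\[ P(X = 7) = q^{6} p = (0.95)^{6}(0.05) \approx 0.0368. \]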
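
Both parts of the investor example follow from the geometric model with p = 0.05. The sketch below assumes that "the number of calls is 7" means the seventh call is the first success, which matches the definition of X on the model slide. The variable names are illustrative.

```python
p = 0.05   # probability that a called investor is interested
q = 1 - p

expected_calls = 1 / p        # mean of the geometric model
prob_seventh = q ** 6 * p     # six uninterested investors, then one interested

print(f"Expected number of calls: {expected_calls:.0f}")
print(f"P(X = 7) = {prob_seventh:.4f}")
```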

The Binomial Distribution

The Binomial Distribution: Predicting the number of successes in a series of Bernoulli
trials.

Binomial Model for Bernoulli Trials

n = Number of trials
p = Probability of success (and q = 1 − p = probability of failure)
X = Number of successes in n trials
P(X = x) = probability of getting x successes in n trials

\[ P(X = x) = \binom{n}{x} p^{x} (1 - p)^{n - x}, \quad \text{where } \binom{n}{x} = \frac{n!}{x!\,(n - x)!} \]

Mean: \( \mu = np \)
Standard deviation: \( \sigma = \sqrt{npq} \)
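
The binomial model translates directly into code using `math.comb` from the Python standard library. This is a minimal sketch; `binom_pmf` is an illustrative name, and n = 10 with p = 0.05 are chosen only as sample inputs.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n Bernoulli trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.05
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(probs)                                  # should be 1
mean = sum(x * px for x, px in enumerate(probs))    # should equal n*p
var = sum((x - mean) ** 2 * px for x, px in enumerate(probs))

print(total, mean, n * p)
print(math.sqrt(var), math.sqrt(n * p * (1 - p)))   # both equal sqrt(npq)
```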

The Binomial Distribution


Example: A venture capital firm has a list of potential investors who have previously invested
in new technologies. On average, these investors invest in about 5% of the opportunities
presented to them. A new client of the firm is interested in finding investors for a 3-D printing
technology. An analyst at the firm starts calling potential investors.
1. If she calls 10 investors, what is the probability that exactly 2 of them will be interested?
2. If she calls 10 investors, what is the probability that at least 2 of them will be interested?
Solution: The probability of finding an interested investor is 𝑝 = 0.05.
1. n = 10, x = 2, p = 0.05

\[ P(X = 2) = \binom{10}{2} (0.05)^{2} (1 - 0.05)^{10 - 2}, \quad \text{where } \binom{10}{2} = \frac{10!}{2!\,(10 - 2)!} = 45 \]

\[ P(X = 2) = 45\,(0.05)^{2} (0.95)^{8} = 0.0746 \]

The Binomial Distribution


2. If she calls 10 investors, what is the probability that at least 2 of them will be interested?
Solution:
n = 10, x ≥ 2, p = 0.05

\[ P(\text{at least } 2) = 1 - P(X = 0) - P(X = 1) = 1 - (0.95)^{10} - (10)(0.05)^{1}(0.95)^{9} = 0.086 \]
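
Both parts of the investor example (exactly 2 and at least 2 out of 10 calls) can be reproduced with the binomial formula. The helper name `binom_pmf` is an assumption made for this sketch.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n Bernoulli trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.05
p_exactly_2 = binom_pmf(2, n, p)
# Complement rule: "at least 2" is everything except 0 or 1 successes.
p_at_least_2 = 1 - binom_pmf(0, n, p) - binom_pmf(1, n, p)

print(f"P(X = 2)  = {p_exactly_2:.4f}")
print(f"P(X >= 2) = {p_at_least_2:.3f}")
```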

The Binomial Distribution - Excel


Example: 20 percent of gears produced by a factory are defective. A sample of 8 gears is
randomly selected.
1. What is the probability that at most 3 of them will be defective?
2. What is the probability that exactly 3 of them will be defective? Round your answer to 3 decimal
places.
Solution: n = 8, p = 0.20. In Excel's BINOM.DIST function, the last argument selects the
cumulative probability (TRUE) or the probability of an exact value (FALSE).
1. \( P(\text{at most } 3) = P(X \le 3) = 0.944 \) (cumulative = TRUE)

2. \( P(3) = P(X = 3) = 0.147 \) (cumulative = FALSE)
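
For readers without Excel, the cumulative and exact probabilities can be computed in Python. The function `binom_dist` below is a hypothetical helper written for this sketch. It only imitates the idea of a TRUE/FALSE cumulative switch and is not an Excel or library API.

```python
import math

def binom_dist(x, n, p, cumulative):
    """Return P(X <= x) if cumulative is True, otherwise P(X = x)."""
    def pmf(k):
        return math.comb(n, k) * p**k * (1 - p) ** (n - k)
    if cumulative:
        return sum(pmf(k) for k in range(x + 1))
    return pmf(x)

n, p = 8, 0.20
p_at_most_3 = binom_dist(3, n, p, cumulative=True)
p_exactly_3 = binom_dist(3, n, p, cumulative=False)

print(f"P(X <= 3) = {p_at_most_3:.3f}")
print(f"P(X = 3)  = {p_exactly_3:.3f}")
```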

The Binomial Distribution - Excel


Example: A survey of people in the 40-50 age range shows that 41% of them have a Tax-Free
Savings Account. In a particular condominium, there are 18 adults in this age range.

A. What’s the probability that 6 of them have a Tax-Free Savings Account?
B. What’s the probability that at most one of them has a Tax-Free Savings Account?
C. What’s the probability that more than one of them has a Tax-Free Savings Account?

The Binomial Distribution


Example (continued):
A. What’s the probability that 6 of them have a Tax-Free Savings Account?
x = 6, n = 18
p = 0.41, q = 1 − 0.41 = 0.59

\[ P(X = 6) = \frac{18!}{6!\,(18 - 6)!} (0.41)^{6} (0.59)^{18 - 6} = 0.156891 \]

B. What’s the probability that at most one of them has a Tax-Free Savings Account?

\[ P(X \le 1) = P(X = 0) + P(X = 1) \]

\[ = \frac{18!}{0!\,(18 - 0)!} (0.41)^{0} (0.59)^{18} + \frac{18!}{1!\,(18 - 1)!} (0.41)^{1} (0.59)^{18 - 1} \]

\[ = 0.000075 + 0.000939 = 0.001014 \]

The Binomial Distribution


Example (continued):
C. What’s the probability that more than one of them has a Tax-Free Savings Account?

\[ P(X > 1) = 1 - P(X \le 1) = 1 - 0.001014 = 0.998986 \]
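
Parts A, B and C of the Tax-Free Savings Account example can be checked together. This is a minimal sketch with illustrative variable names, using n = 18 and p = 0.41 as given.

```python
import math

def binom_pmf(x, n, p):
    """P(X = x) for n Bernoulli trials with success probability p."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 18, 0.41
part_a = binom_pmf(6, n, p)                       # exactly 6
part_b = binom_pmf(0, n, p) + binom_pmf(1, n, p)  # at most 1
part_c = 1 - part_b                               # more than 1

print(f"A. P(X = 6)  = {part_a:.6f}")
print(f"B. P(X <= 1) = {part_b:.6f}")
print(f"C. P(X > 1)  = {part_c:.6f}")
```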



The Binomial Distribution


➢ Figure out what n is.
➢ Figure out what p is.
➢ Figure out what x is.

➢ Do you want exactly this many successes? (i.e., X = x): \( P(X = x) \). Use Excel.
➢ Do you want this many or fewer successes? (i.e., X ≤ x): \( P(X \le x) \). Use Excel.
➢ Do you want this many or more successes? (i.e., X ≥ x): \( P(X \ge x) = 1 - P(X \le x - 1) \)
➢ Do you want more than x successes? (i.e., X > x): \( P(X > x) = 1 - P(X \le x) \)
➢ Do you want fewer than x successes? (i.e., X < x): \( P(X < x) = P(X \le x - 1) \)
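
The checklist above maps each question onto either an exact probability or a cumulative one. A sketch of that mapping follows; all function names are hypothetical and chosen for readability.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x); evaluates to 0 when x is below 0."""
    return sum(binom_pmf(k, n, p) for k in range(0, min(x, n) + 1))

def p_at_least(x, n, p):
    """P(X >= x) = 1 - P(X <= x - 1)."""
    return 1 - binom_cdf(x - 1, n, p)

def p_more_than(x, n, p):
    """P(X > x) = 1 - P(X <= x)."""
    return 1 - binom_cdf(x, n, p)

def p_less_than(x, n, p):
    """P(X < x) = P(X <= x - 1)."""
    return binom_cdf(x - 1, n, p)

# Usage with the investor example: at least 2 interested out of 10 calls.
print(round(p_at_least(2, 10, 0.05), 3))
```

Routing every case through one cumulative function keeps the off-by-one adjustments (x versus x − 1) in a single visible place.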

The Poisson Distribution


The Poisson Model: Predicting the number of events that occur over a given interval of
time or space.
Poisson Probability Model for Occurrences
λ = Mean number of occurrences per unit of time
X = Number of occurrences per unit of time

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \]

where e is the base of the natural logarithm system (approximately 2.71828).
Expected value: \( E(X) = \lambda \)
Standard deviation: \( \mathrm{SD}(X) = \sqrt{\lambda} \)

Independence Assumption: The events must be independent of each other. This is a
major assumption that needs to be satisfied when we use the Poisson distribution.
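
The Poisson formula and its mean and standard deviation can be checked numerically. The sketch below uses a hypothetical rate of 4 and sums over enough x values that the remaining tail is negligible; `poisson_pmf` is an illustrative name.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson model with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 4.0        # hypothetical mean number of occurrences per interval
xs = range(60)   # probabilities beyond x = 59 are negligible for lam = 4

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(total)                              # close to 1
print(mean, lam)                          # mean matches lam
print(math.sqrt(var), math.sqrt(lam))     # SD matches sqrt(lam)
```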

The Poisson Distribution


The experiment consists of observing outcomes of interest for the segment
intervals. A segment interval may be a period of time, length, area, or weight.
The average number of occurrences per segment interval is λ.
The number of occurrences is random, and the occurrence of one outcome
does not influence the chances of another outcome of interest.
The probability that an outcome of interest occurs in a given segment is the
same for all segments.

The Poisson Distribution


Examples:
Number of patients who arrive at a hospital emergency room per hour
Number of calls the GBC help desk receives in a 30-minute period
Number of cars served at the gas station in 24-hour period
Number of defects per box of light bulbs

The Poisson Distribution


Example: A website averages 4 hits per minute. Find the probability that there will be
no hits in the next minute.
Solution: λ = 4, x = 0

\[ P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!} \]

\[ P(X = 0) = \frac{e^{-4}\, 4^{0}}{0!} = e^{-4} = 0.0183 \quad (\text{recall that } e \approx 2.71828). \]
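
The website example is a single evaluation of the Poisson formula. A minimal sketch, with `poisson_pmf` as an illustrative helper name:

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson model with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

p_no_hits = poisson_pmf(0, 4)  # 4 hits per minute on average, 0 hits observed
print(f"P(X = 0) = {p_no_hits:.4f}")
```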
0!

The Poisson Distribution


Example: At a COVID-19 Vaccination Clinic, five customers arrive on average in a ten-minute
period. 1) What is the probability that in a ten-minute period 3 people will
arrive at the clinic? 2) What is the probability that in a ten-minute period at most 2
people will arrive at the clinic? Round your answer to 4 decimal places.
Solution: λ = 5

\[ 1)\quad P(X = 3) = \frac{\lambda^{x} e^{-\lambda}}{x!} = \frac{5^{3}\,(2.71828)^{-5}}{3!} = 0.1404 \]

\[ 2)\quad P(X \le 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.1247 \]
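
Both clinic probabilities can be reproduced by evaluating the Poisson formula at the relevant x values. The variable names in this sketch are assumptions.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson model with mean lam."""
    return math.exp(-lam) * lam**x / math.factorial(x)

lam = 5  # average arrivals per ten-minute period
p_exactly_3 = poisson_pmf(3, lam)
p_at_most_2 = sum(poisson_pmf(x, lam) for x in range(3))  # x = 0, 1, 2

print(f"1) P(X = 3)  = {p_exactly_3:.4f}")
print(f"2) P(X <= 2) = {p_at_most_2:.4f}")
```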
References
The slides are a combination of the material from your text and Pearson resources as
well as the original material.
