Engineering Data Analysis Chapter 3 - Discrete Probability Distribution


Chapter 3

DISCRETE PROBABILITY DISTRIBUTIONS

Introduction

Many physical systems can be modelled by the same or similar random variables and random experiments. The distribution of the random variables involved in each of these common systems can be analyzed once, and the results of that analysis can then be reused in different applications and examples. This chapter discusses several random experiments and discrete random variables that appear frequently in applications. The underlying sample space of the random experiment is often omitted from the discussion, and the distribution of a particular random variable is described directly.

Discrete Probability Distribution

A discrete distribution describes the probability of occurrence of each value of a

discrete random variable. A discrete random variable is a random variable that has

countable values, such as a list of non-negative integers.

With a discrete probability distribution, each possible value of the discrete random

variable can be associated with a non-zero probability. Thus, a discrete probability

distribution is often presented in tabular form.

Video: https://www.youtube.com/watch?v=mrCxwEZ_22o
3.1 Random Variables and Their Probability Distributions

Random Variables

In probability and statistics, a random variable is a variable whose value is subject

to variations due to chance (i.e. randomness, in a mathematical sense). As opposed to

other mathematical variables, a random variable conceptually does not have a single,

fixed value (even if unknown); rather, it can take on a set of possible different values,

each with an associated probability.

A random variable’s possible values might represent the possible outcomes of a

yet-to-be-performed experiment, or the possible outcomes of a past experiment whose

already-existing value is uncertain (for example, as a result of incomplete information or

imprecise measurements). They may also conceptually represent either the results of an

“objectively” random process (such as rolling a die), or the “subjective” randomness that

results from incomplete knowledge of a quantity. Random variables can be classified as

either discrete (that is, taking any of a specified list of exact values) or as continuous

(taking any numerical value in an interval or collection of intervals). The mathematical

function describing the possible values of a random variable and their associated

probabilities is known as a probability distribution.

Discrete Random Variables

Discrete random variables can take on either a finite or at most a countably infinite

set of discrete values (for example, the integers). Their probability distribution is given by

a probability mass function which directly maps each value of the random variable to a

probability. For example, the value x1 is taken with probability p1, the value x2 with probability p2, and so on. The probabilities pi must satisfy two requirements: every probability pi is a number between 0 and 1, and the sum of all the probabilities is 1 (p1 + p2 + ⋯ + pk = 1).

Discrete Probability Distribution

This shows the probability mass function of a discrete probability distribution. The

probabilities of the singletons {1}, {3}, and {7} are respectively 0.2, 0.5, 0.3. A set not

containing any of these points has probability zero.

Examples of discrete random variables include the values obtained from rolling a

die and the grades received on a test out of 100.

Probability Distributions for Discrete Random Variables

Probability distributions for discrete random variables can be displayed as a

formula, in a table, or in a graph. A discrete random variable x has a countable number

of possible values. The probability distribution of a discrete random variable x lists the

values and their probabilities, where value x1 has probability p1, value x2 has

probability p2, and so on. Every probability pi is a number between 0 and 1, and the sum

of all the probabilities is equal to 1.

Examples of discrete random variables include:

• The number of eggs that a hen lays in a given day (it can’t be 2.3)

• The number of people going to a given soccer match

• The number of students that come to class on a given day

• The number of people in line at McDonald’s on a given day and time


A discrete probability distribution can be described by a table, by a formula, or by a graph. For example, suppose that x is a random variable representing the number of people waiting in line at a fast-food restaurant, and that it happens to take only the values 2, 3, or 5, with probabilities 2/10, 3/10, and 5/10 respectively. This can be expressed through the function f(x) = x/10 for x = 2, 3, 5, or through the table below. Notice that these two representations are equivalent, and that the distribution can also be represented graphically, as in the probability histogram below.

Video: https://www.youtube.com/watch?v=cqK3uRoPtk0

Probability Histogram: This histogram displays the probability of each of the three possible values of the discrete random variable. The formula, table, and probability histogram satisfy the following necessary conditions of discrete probability distributions (verified in the short sketch after the list):

1. 0 ≤ f(x) ≤ 1, i.e., the values of f(x) are probabilities, hence between 0 and 1.

2. ∑f(x) = 1, i.e., adding the probabilities of all disjoint cases, we obtain the probability of the sample space, 1.
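
To make the two conditions concrete, the short Python sketch below (an illustration added here, not part of the original text; the names f, support, and probabilities are chosen for this example only) tabulates f(x) = x/10 over x = 2, 3, 5 and checks both requirements.

def f(x):
    # Probability mass function for the number of people waiting in line.
    return x / 10

support = [2, 3, 5]
probabilities = [f(x) for x in support]

# Condition 1: every f(x) is a probability, i.e., between 0 and 1.
assert all(0 <= p <= 1 for p in probabilities)

# Condition 2: the probabilities of all disjoint cases sum to 1.
assert abs(sum(probabilities) - 1.0) < 1e-12

print(dict(zip(support, probabilities)))   # {2: 0.2, 3: 0.3, 5: 0.5}

Running the sketch prints the tabular form of the distribution and raises no assertion error, confirming that f(x) = x/10 is a valid probability mass function on {2, 3, 5}.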

Sometimes, the discrete probability distribution is referred to as the probability mass


function (pmf). The probability mass function has the same purpose as the probability

histogram, and displays the specific probability for each value of the discrete random variable.

The only difference is how it looks graphically.

Probability Mass Function

This shows the graph of a probability mass function. All the values of this function

must be non-negative and sum up to 1.

x f(x)

2 0.2

3 0.3

5 0.5

Discrete Probability Distribution: This table shows the values that the discrete random variable can take on and their corresponding probabilities.

Example 1. A shipment of 20 similar laptop computers to a retail outlet contains 3 that

are defective. If a school makes a random purchase of 2 of these computers, find the

probability distribution for the number of defectives.

Solution:

Let X be a random variable whose values x are the possible numbers of defective

computers purchased by the school. Then x can only take the numbers 0, 1, and 2.
Now, 2 of the 20 computers can be selected in C(20, 2) = 190 equally likely ways, and x defective computers together with 2 − x non-defective ones can be selected in C(3, x) C(17, 2 − x) ways. Hence

f(x) = P(X = x) = C(3, x) C(17, 2 − x) / C(20, 2),  for x = 0, 1, 2.

Thus, the probability distribution of X is

x 0 1 2

f(x) 68/95 51/190 3/190
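
As a cross-check, the short Python sketch below (illustrative code, not part of the original solution) reproduces the distribution in Example 1 from the counting formula f(x) = C(3, x) C(17, 2 − x) / C(20, 2), using exact fractions so the results can be compared with the table directly.

from fractions import Fraction
from math import comb

N, K, n = 20, 3, 2          # shipment size, defective laptops, laptops purchased

def f(x):
    # P(X = x): probability of x defective laptops among the 2 purchased.
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

for x in range(n + 1):
    print(x, f(x))          # prints 0 68/95, 1 51/190, 2 3/190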

3.2 Cumulative Distribution Functions

You might recall that the cumulative distribution function is defined for discrete

random variables as:

F(x) = P(X ≤ x) = ∑ f(t),  where the sum is taken over all t ≤ x

Again, F(x) accumulates all of the probability less than or equal to x. The cumulative

distribution function for continuous random variables is just a straightforward extension of

that of the discrete case. All we need to do is replace the summation with an integral.

The cumulative distribution function ("c.d.f.") of a continuous random variable X is

defined as:
F(x) = ∫ f(t) dt, with the integral taken from −∞ to x, for −∞ < x < ∞.
Example 1. Suppose that a day’s production of 850 manufactured parts contains 50 parts

that do not conform to customer requirements. Two parts are selected at random,

without replacement, from the batch. Let the random variable X equal the number of

nonconforming parts in the sample. What is the cumulative distribution function of X?

Solution:

The question can be answered by first finding the probability mass function of X.

Therefore,

f(0) = P(X = 0) = C(800, 2) / C(850, 2) = 0.886
f(1) = P(X = 1) = C(50, 1) C(800, 1) / C(850, 2) = 0.111
f(2) = P(X = 2) = C(50, 2) / C(850, 2) = 0.003

so that F(0) = 0.886, F(1) = 0.886 + 0.111 = 0.997, and F(2) = 1.

The cumulative distribution function for this example is graphed in the figure below. Note

that F(x) is defined for all x, −∞ < x < ∞, and not only for 0, 1, and 2.

Graph of the cumulative distribution function for the above example
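
Since the graph itself is not reproduced here, the following Python sketch (an illustration under the hypergeometric model described in the solution; the names f and F are chosen for this example only) computes the probability mass function and evaluates the step function F(x) at a few points.

from math import comb

N, K, n = 850, 50, 2        # parts produced, nonconforming parts, sample size

def f(x):
    # P(X = x): probability of x nonconforming parts in the sample of 2.
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def F(x):
    # Cumulative distribution function F(x) = P(X <= x), defined for all real x.
    return sum(f(t) for t in range(n + 1) if t <= x)

print(round(F(-1), 3), round(F(0), 3), round(F(1.5), 3), round(F(2), 3))
# prints 0 0.886 0.997 1.0, matching the step heights of the graph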


3.3 Expected Values of Random Variables

The expected value of a random variable is the weighted average of all possible

values that this random variable can take on.

Discrete Random Variable

A discrete random variable X has a countable number of possible values. The

probability distribution of a discrete random variable X lists the values and their

probabilities, such that xi has a probability of pi. The probabilities pi must satisfy two

requirements:

1. Every probability pi is a number between 0 and 1.

2. The sum of the probabilities is 1: p1+p2+⋯+pi = 1.

Expected Value Definition

In probability theory, the expected value (or expectation, mathematical

expectation, EV, mean, or first moment) of a random variable is the weighted average of

all possible values that this random variable can take on. The weights used in computing

this average are probabilities in the case of a discrete random variable.

The expected value may be intuitively understood by the law of large numbers: the

expected value, when it exists, is almost surely the limit of the sample mean as sample

size grows to infinity. More informally, it can be interpreted as the long-run average of the

results of many independent repetitions of an experiment (e.g. a dice roll). The value may

not be expected in the ordinary sense—the “expected value” itself may be unlikely or even

impossible (such as having 2.5 children), as is also the case with the sample mean.
How To Calculate Expected Value

Suppose random variable X can take value x1 with probability p1, value x2 with

probability p2, and so on, up to value xi with probability pi. Then the expectation value of

a random variable X is defined as E[X] = x1 p1 + x2 p2 + ⋯ + xi pi, which can also be written as E[X] = ∑ xi pi.

If all outcomes xi are equally likely (that is, p1 = p2 = ⋯ = pi), then the weighted average

turns into the simple average. This is intuitive: the expected value of a random variable is

the average of all values it can take; thus, the expected value is what one expects to

happen on average. If the outcomes xi are not equally probable, then the simple average

must be replaced with the weighted average, which takes into account the fact that some

outcomes are more likely than the others. The intuition, however, remains the same: the

expected value of X is what one expects to happen on average.

For example, let X represent the outcome of a roll of a six-sided die. The possible values

for X are 1, 2, 3, 4, 5, and 6, all equally likely (each having the probability of 1/6). The

expectation of X is:

E[X] = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = 3.5.

In this case, since all outcomes are equally likely, we could have simply averaged the

numbers together:

(1 + 2 + 3 + 4 + 5 + 6) /6 = 3.5.
Average Dice Value Against Number of Rolls

An illustration of the convergence of sequence averages of rolls of a die to the

expected value of 3.5 as the number of rolls (trials) grows.
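
That convergence can be reproduced with a small simulation. The Python sketch below (an illustrative example, not from the text) computes E[X] = ∑ x·p directly and compares it with the average of simulated rolls.

import random

values = range(1, 7)                               # faces of a fair die
expected = sum(x * (1 / 6) for x in values)        # weighted average, 3.5

random.seed(0)                                     # fixed seed for reproducibility
rolls = [random.choice(values) for _ in range(100_000)]
sample_mean = sum(rolls) / len(rolls)

print(round(expected, 3), round(sample_mean, 3))   # 3.5 and a value close to 3.5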

3.4 The Binomial Distribution

Binomial Experiment

A binomial experiment is a statistical experiment that has the following properties:

• The experiment consists of n repeated trials.

• Each trial can result in just two possible outcomes. We call one of these outcomes

a success and the other, a failure.

• The probability of success, denoted by P, is the same on every trial.

• The trials are independent; that is, the outcome on one trial does not affect the

outcome on other trials.

Consider the following statistical experiment. You flip a coin 2 times and count the

number of times the coin lands on heads.


This is a binomial experiment because:

1. The experiment consists of repeated trials. We flip a coin 2 times.

2. Each trial can result in just two possible outcomes - heads or tails.

3. The probability of success is constant - 0.5 on every trial.

4. The trials are independent; that is, getting heads on one trial does not affect

whether we get heads on other trials.

The following notation is helpful when we talk about binomial probability.

• x: The number of successes that result from the binomial experiment.

• n: The number of trials in the binomial experiment.

• P: The probability of success on an individual trial.

• Q: The probability of failure on an individual trial. (This is equal to 1 - P.)

• n!: The factorial of n (also known as n factorial).

• b (x; n, P): Binomial probability - the probability that an n-trial binomial experiment

results in exactly x successes, when the probability of success on an individual

trial is P.

• nCr: The number of combinations of n things, taken r at a time.

Binomial Distribution

A binomial random variable is the number of successes x in n repeated trials of

a binomial experiment. The probability distribution of a binomial random variable is called

a binomial distribution.
Suppose we flip a coin two times and count the number of heads (successes). The

binomial random variable is the number of heads, which can take on values of 0, 1, or 2.

The binomial distribution is presented below.

Number of Heads Probability

0 0.25

1 0.50

2 0.25

The binomial distribution has the following properties (checked numerically in the short sketch after this list):

▪ The mean of the distribution (μx) is equal to n * P.

▪ The variance (σ²x) is n * P * (1 - P).

▪ The standard deviation (σx) is sqrt[n * P * (1 - P)].
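
For instance, with the two-flip coin example above (n = 2, P = 0.5), these formulas give a mean of 1 head, a variance of 0.5, and a standard deviation of about 0.707. The short Python sketch below (illustrative only) simply evaluates them.

from math import sqrt

n, P = 2, 0.5                   # two coin flips, probability of heads

mean = n * P                    # 1.0 head on average
variance = n * P * (1 - P)      # 0.5
std_dev = sqrt(variance)        # about 0.707

print(mean, variance, round(std_dev, 3))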

Binomial Formula and Binomial Probability

The binomial probability refers to the probability that a binomial experiment

results in exactly x successes. For example, in the above table, we see that the binomial

probability of getting exactly one head in two-coin flips is 0.50.

Given x, n, and P, we can compute the binomial probability based on the binomial

formula.

Binomial Formula. Suppose a binomial experiment consists of n trials and results

in x successes. If the probability of success on an individual trial is P, then the binomial

probability is:
b(x; n, P) = nCx * P^x * (1 - P)^(n - x)

or

b(x; n, P) = {n! / [x! (n - x)!]} * P^x * (1 - P)^(n - x)

Example 1. Suppose a die is tossed 5 times. What is the probability of getting exactly 2

fours?

Solution:

This is a binomial experiment in which the number of trials is equal to 5, the number

of successes is equal to 2, and the probability of success on a single trial is 1/6 or about

0.167. Therefore, the binomial probability is:

b(2; 5, 0.167) = 5C2 * (0.167)^2 * (0.833)^3

b(2; 5, 0.167) = 0.161
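
This result is easy to verify by evaluating the binomial formula directly. The Python sketch below (illustrative; binom_prob is a name chosen here, not from the text) uses the exact success probability 1/6 and arrives at the same value, about 0.161.

from math import comb

def binom_prob(x, n, P):
    # b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
    return comb(n, x) * P**x * (1 - P)**(n - x)

print(round(binom_prob(2, 5, 1/6), 3))   # 0.161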

Cumulative Binomial Probability

A cumulative binomial probability refers to the probability that the binomial

random variable falls within a specified range (e.g., is greater than or equal to a stated

lower limit and less than or equal to a stated upper limit).

For example, we might be interested in the cumulative binomial probability of obtaining

45 or fewer heads in 100 tosses of a coin. This would be the sum of all these individual

binomial probabilities.

b(x ≤ 45; 100, 0.5) =

b (x = 0; 100, 0.5) + b (x = 1; 100, 0.5) + ... + b (x = 44; 100, 0.5) + b (x = 45; 100, 0.5)
Example 1. What is the probability of obtaining 45 or fewer heads in 100 tosses of a

coin?

Solution:

To solve this problem, we compute 46 individual probabilities, using the binomial

formula. The sum of all these probabilities is the answer we seek. Thus,

b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + . . . + b(x = 45; 100, 0.5)

b(x ≤ 45; 100, 0.5) = 0.184

Example 2. The probability that a student is accepted to a prestigious college is 0.3. If 5

students from the same school apply, what is the probability that at most 2 are

accepted?

Solution:

To solve this problem, we compute 3 individual probabilities, using the binomial

formula. The sum of all these probabilities is the answer we seek. Thus,

b(x ≤ 2; 5, 0.3) = b(x = 0; 5, 0.3) + b(x = 1; 5, 0.3) + b(x = 2; 5, 0.3)

b(x ≤ 2; 5, 0.3) = 0.1681 + 0.3601 + 0.3087

b(x ≤ 2; 5, 0.3) = 0.8369
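
Both cumulative results can be checked by summing the individual binomial probabilities, exactly as the definition says. The Python sketch below (illustrative; binom_prob and binom_cdf are names chosen here) redefines the helper so the snippet stands alone.

from math import comb

def binom_prob(x, n, P):
    # b(x; n, P) = nCx * P^x * (1 - P)^(n - x)
    return comb(n, x) * P**x * (1 - P)**(n - x)

def binom_cdf(k, n, P):
    # P(X <= k): sum of b(0; n, P) through b(k; n, P).
    return sum(binom_prob(x, n, P) for x in range(k + 1))

print(round(binom_cdf(45, 100, 0.5), 3))   # 0.184  (45 or fewer heads)
print(round(binom_cdf(2, 5, 0.3), 4))      # 0.8369 (at most 2 acceptances)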


3.5 The Poisson Distribution

A Poisson distribution is the probability distribution that results from a Poisson

experiment.

Attributes of a Poisson Experiment

A Poisson experiment is a statistical experiment that has the following properties:

▪ The experiment results in outcomes that can be classified as successes or

failures.

▪ The average number of successes (μ) that occurs in a specified region is known.

▪ The probability that a success will occur is proportional to the size of the region.

▪ The probability that a success will occur in an extremely small region is virtually

zero.

Note that the specified region could take many forms. For instance, it could be a length,

an area, a volume, a period of time, etc.

Notation

The following notation is helpful when we talk about the Poisson distribution.

• e: A constant equal to approximately 2.71828. (Actually, e is the base of the natural

logarithm system.)

• μ: The mean number of successes that occur in a specified region.

• x: The actual number of successes that occur in a specified region.

• P (x; μ): The Poisson probability that exactly x successes occur in a Poisson

experiment, when the mean number of successes is μ.


Poisson Distribution

A Poisson random variable is the number of successes that result from a

Poisson experiment. The probability distribution of a Poisson random variable is called

a Poisson distribution.

Given the mean number of successes (μ) that occur in a specified region, we can

compute the Poisson probability based on the following Poisson formula.

Poisson Formula. Suppose we conduct a Poisson experiment, in which the average

number of successes within a given region is μ. Then, the Poisson probability is:

P(x; μ) = (e^−μ) (μ^x) / x!

where x is the actual number of successes that result from the experiment, and e is

approximately equal to 2.71828.

The Poisson distribution has the following properties:

▪ The mean of the distribution is equal to μ.

▪ The variance is also equal to μ.

Example 1. The average number of homes sold by the Acme Realty company is 2

homes per day. What is the probability that exactly 3 homes will be sold tomorrow?

Solution:

This is a Poisson experiment in which we know the following:

▪ μ = 2; since 2 homes are sold per day, on average.

▪ x = 3; since we want to find the likelihood that 3 homes will be sold tomorrow.
▪ e = 2.71828; since e is a constant equal to approximately 2.71828.

We plug these values into the Poisson formula as follows:

P(x; μ) = (e^−μ) (μ^x) / x!

P(3; 2) = (2.71828^−2) (2^3) / 3!

P(3; 2) = (0.13534) (8) / 6

P(3; 2) = 0.180

Thus, the probability of selling 3 homes tomorrow is 0.180.
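
The same arithmetic is easy to reproduce in code. The Python sketch below (illustrative; poisson_prob is a name chosen here, not from the text) evaluates the Poisson formula with μ = 2 and x = 3.

from math import exp, factorial

def poisson_prob(x, mu):
    # P(x; mu) = (e^-mu)(mu^x) / x!
    return exp(-mu) * mu**x / factorial(x)

print(f"{poisson_prob(3, 2):.3f}")   # 0.180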

Cumulative Poisson Probability

A cumulative Poisson probability refers to the probability that the Poisson

random variable falls within a specified range (e.g., is greater than or equal to a stated lower limit and less than or equal to a stated upper limit).

Example. Suppose the average number of lions seen on a 1-day safari is 5. What is the

probability that tourists will see fewer than four lions on the next 1-day safari?

Solution: This is a Poisson experiment in which we know the following:

▪ μ = 5; since 5 lions are seen per safari, on average.

▪ x = 0, 1, 2, or 3; since we want to find the likelihood that tourists will see fewer

than 4 lions; that is, we want the probability that they will see 0, 1, 2, or 3 lions.

▪ e = 2.71828; since e is a constant equal to approximately 2.71828.

To solve this problem, we need to find the probability that tourists will see 0, 1, 2, or 3

lions. Thus, we need to calculate the sum of four probabilities: P (0; 5) + P (1; 5) + P (2;

5) + P (3; 5).
To compute this sum, we use the Poisson formula:

P(x ≤ 3; 5) = P(0; 5) + P(1; 5) + P(2; 5) + P(3; 5)

P(x ≤ 3; 5) = [(e^−5) (5^0) / 0!] + [(e^−5) (5^1) / 1!] + [(e^−5) (5^2) / 2!] + [(e^−5) (5^3) / 3!]

P(x ≤ 3; 5) = [(0.006738) (1) / 1] + [(0.006738) (5) / 1] + [(0.006738) (25) / 2] + [(0.006738) (125) / 6]

P(x ≤ 3; 5) = [0.006738] + [0.03369] + [0.084224] + [0.140375]

P(x ≤ 3; 5) = 0.2650

Thus, the probability of seeing no more than 3 lions is 0.2650.
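
As with the cumulative binomial case, this value can be verified by summing the individual Poisson probabilities. The Python sketch below (illustrative; the helper names are chosen here) reproduces the 0.2650 figure.

from math import exp, factorial

def poisson_prob(x, mu):
    # P(x; mu) = (e^-mu)(mu^x) / x!
    return exp(-mu) * mu**x / factorial(x)

def poisson_cdf(k, mu):
    # P(X <= k) = P(0; mu) + P(1; mu) + ... + P(k; mu)
    return sum(poisson_prob(x, mu) for x in range(k + 1))

print(f"{poisson_cdf(3, 5):.4f}")   # 0.2650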

