Statistics Handout CH 1&2
CHAPTER ONE
If there are as many points as there are counting numbers, the sample space is called countably infinite. If the number of points is as many as those found in an interval of the set of real numbers, the sample space is referred to as non-countably infinite. A finite or countably infinite sample space is called discrete, while a non-countably infinite sample space is called non-discrete or continuous.
Sample Space: the entire collection of possible outcomes of an experiment. Example: choosing a card from a deck. There are 52 cards in a deck (not including Jokers), so the sample space is all 52 possible cards: {Ace of Hearts, 2 of Hearts, etc.}.
It is important to note that S corresponds to the universal set; the event S itself is called the sure event. Similarly, we can speak of an impossible event: the analogue of an impossible event in the set-theoretic context is the empty set.
It is important to note that all the set operations can be applied to events. Therefore, if A and B are two events in S, then A∪B stands for the event “A or B or both A and B occurred”, A∩B is the event “both A and B occurred”, A′ is the event “not A”, and A−B is the event “A but not B”.
If A∩B = φ , it means the events A and B cannot occur simultaneously. In this case we say A
and B are mutually exclusive events.
Experiment or Trial: an action where the result is uncertain. Tossing a coin, throwing dice,
seeing what pizza people choose are all examples of experiments.
Terms to note in the definition of classical probability are random, n, mutually exclusive, and
equally likely.
There is always a certain degree of uncertainty as to whether an event associated with a random
experiment will occur or not. The chances that an event will occur range between 0 and 100
percent. In statistical considerations, we use the term probability instead of the term chance. Also
it is convenient to assign numbers between 0 and 1, inclusive, instead of percentages. If we are
certain that an event will occur, then we assign a probability 1 to it. On the other hand, if the
event cannot occur, the probability 0 is assigned to it. Any other event is assigned a probability between 0 and 1 according to how likely it is to occur.
There are two approaches to the computation of the probability of an event that is associated with
a random experiment.
1/ Classical or A Priori Approach: If an event can occur in m different ways out of a total of n possible ways, all of which are equally likely, then the probability of the event is m/n.
2/ Frequency or A Posteriori Approach: If, after n repetitions of a random experiment, where n is very large, an event is observed to occur in m of them, then the probability of the event is m/n.
Note that both the classical and the frequency approaches have their own deficiencies. In the former case, the phrase “equally likely” is vague, and it may not always be easy to think of equally likely outcomes. The latter suffers from the shortcoming that it is not clear how large n must be before the relative frequency m/n becomes a reliable estimate.
Axiom: a basic assumption in the definition of classical probability is that n is a finite number; that is, there is only a finite number of possible outcomes. If there are an infinite number of possible outcomes, the probability of an outcome is not defined in the classical sense.
Mutually exclusive: The random experiment results in the occurrence of only one of the n outcomes. E.g., if a coin is tossed, the result is a head or a tail, but not both. That is, the outcomes are defined so as to be mutually exclusive.
Equally likely: Each outcome of the random experiment has an equal chance of occurring.
Random: A random experiment is a process leading to at least two possible outcomes with uncertainty as to which will occur.
Probability: the chance that something will happen, i.e. how likely it is that some event will occur. Many events can't be predicted with total certainty; the best we can say is how likely they are to happen, using the idea of probability. Sometimes you can measure a probability with a number like "10% chance of rain", or you can use words such as impossible, unlikely, possible, even chance, likely, and certain.
Tossing a coin: when a coin is tossed, there are two possible outcomes: heads (H) or tails (T). We say that the probability of the coin landing H is ½, and the probability of the coin landing T is ½.
Throwing Dice:
When a single die is thrown, there are six possible outcomes: 1, 2, 3, 4, 5, 6. The probability of
any one of them is 1/6.
In general:
Probability of an event = (Number of ways it can happen) / (Total number of outcomes)
Example: what is the probability of rolling a "4"?
Number of ways it can happen: 1 (there is only 1 face with a "4" on it)
Total number of outcomes: 6 (there are 6 faces altogether)
So the probability = 1/6.
Example: there are 5 marbles in a bag: 4 are blue, and 1 is red. What is the probability that a blue marble gets picked?
Number of ways it can happen: 4 (there are 4 blue marbles)
Total number of outcomes: 5 (there are 5 marbles in total)
So the probability = 4/5 = 0.8.
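The m/n counting rule translates directly into code. As a minimal sketch (the function classical_probability and the list-based sample spaces below are our own illustrative choices, not part of the handout), the following Python snippet reproduces both the die and the marble answers:

    from fractions import Fraction

    def classical_probability(event_outcomes, sample_space):
        # Classical probability: m favourable outcomes out of n equally likely ones.
        favourable = [o for o in sample_space if o in event_outcomes]
        return Fraction(len(favourable), len(sample_space))

    # Rolling a "4" with a single die: 1 way out of 6.
    die = [1, 2, 3, 4, 5, 6]
    print(classical_probability([4], die))        # 1/6

    # Picking a blue marble: 4 ways out of 5.
    bag = ["blue", "blue", "blue", "blue", "red"]
    print(classical_probability(["blue"], bag))   # 4/5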
A real-valued set function P defined on the class C of events is called a probability function, and P(A) is the probability of the event A, if the following axioms are satisfied:
Axiom 1: For every event A in C, 0 ≤ P(A) ≤ 1.
Axiom 2: For the sure event S, P(S) = 1.
Axiom 3: If A1, A2, A3, ... are mutually exclusive events in C, then
P(A1 ∪ A2 ∪ A3 ∪ ...) = P(A1) + P(A2) + P(A3) + ...
Theorem 3: P(φ) = 0.
Theorem 4: If A′ is the complement of A, then P(A′) = 1 − P(A).
Theorem 5: If A1, A2, A3, ..., An are mutually exclusive events, then
P(A1 ∪ A2 ∪ ... ∪ An) = P(A1) + P(A2) + ... + P(An).
Theorem 6: If A and B are any two events, then P(A∪B) = P(A) + P(B) − P(A∩B).
If A1, A2, A3 are any three events, then
P(A1 ∪ A2 ∪ A3) = P(A1) + P(A2) + P(A3) − P(A1∩A2) − P(A2∩A3) − P(A1∩A3) + P(A1∩A2∩A3).
Extension of this result to the union of more than three events is also possible.
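As a quick numerical check of Theorem 6, the sketch below (our own illustration; the events "sum is even" and "sum is at least 10" on two dice are assumptions chosen for the example) enumerates the 36 equally likely outcomes of rolling two dice:

    from fractions import Fraction
    from itertools import product

    # All 36 equally likely outcomes of rolling two dice.
    space = list(product(range(1, 7), repeat=2))

    A = {o for o in space if sum(o) % 2 == 0}   # event: sum is even
    B = {o for o in space if sum(o) >= 10}      # event: sum is at least 10

    def P(event):
        return Fraction(len(event), len(space))

    # Theorem 6: P(A∪B) = P(A) + P(B) − P(A∩B)
    assert P(A | B) == P(A) + P(B) - P(A & B)
    print(P(A), P(B), P(A & B), P(A | B))       # 1/2 1/6 1/9 5/9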
B) Permutations
The number of ordered arrangements (permutations) of x objects selected from n distinct objects is nPx = n!/(n − x)!.
Modifying the above example, if you have six books, but there is room for only four books on the shelf, in how many ways can you arrange these books on the shelf?
Solution: the number of ordered arrangements of four books selected from six books is equal to 6P4 = 6!/(6 − 4)! = 720/2 = 360.
In many situations, you are not interested in the order of the outcomes but only in the number of ways that x items can be selected from n items, irrespective of order. Each possible selection is called a combination.
C) Combinations
The number of ways of selecting x objects from n distinct objects, irrespective of order, is nCx = n!/(x!(n − x)!).
If you compare this rule to the permutation rule above (counting rule 4), you see that it differs only in the inclusion of the term x! in the denominator. When permutations were used, all of the arrangements of the x objects were distinguishable. With combinations, the x! possible arrangements of the selected objects are irrelevant.
Modifying the above example, if the order of the books on the shelf is irrelevant, in how many ways can you select four books from six?
Solution: the number of combinations of four books selected from six books is equal to 6C4 = 6!/(4!(6 − 4)!) = 720/(24 × 2) = 15.
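Both book-shelf answers can be verified with Python's standard library (math.perm and math.comb are available from Python 3.8 onward):

    import math

    # Ordered arrangements of 4 books chosen from 6: 6!/(6 − 4)!
    print(math.perm(6, 4))   # 360

    # Unordered selections of 4 books from 6: 6!/(4! × 2!)
    print(math.comb(6, 4))   # 15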
Under statistical independence, there are three types of probabilities:
1. Marginal Probability
2. Joint Probability
3. Conditional Probability
1) Marginal Probabilities Under Statistical Independence.
A marginal or unconditional probability is the simple probability of the occurrence of an event.
Example: In a fair coin toss, P(H) = 0.5; that is, the probability of heads equals 0.5, and the probability of tails equals 0.5. This is true for every toss, no matter how many tosses have been
made or what their outcomes have been. Every toss stands alone and is in no way connected
with any other toss. Thus, the outcome of each toss of a fair coin is an event that is statistically
independent of the outcomes of every other toss of the coin.
2) Joint Probabilities Under Statistical Independence.
The probability of two or more independent events occurring together is the product of their marginal probabilities: P(A∩B) = P(A) × P(B).
Example: In terms of the fair coin example, the probability of heads on two successive tosses is the probability of heads on the first toss (which we shall call H1) times the probability of heads on the second toss (H2). The events are statistically independent, because the probability of heads on any toss is 0.5 regardless of what came before, so P(H1∩H2) = P(H1) × P(H2) = 0.5 × 0.5 = 0.25. Thus the probability of heads on two successive tosses is 0.25.
Exercises:
1. What is the probability of getting tails, heads, and tails in that order on three successive tosses of a fair coin?
Solution: P(T1∩H2∩T3) = 0.5 × 0.5 × 0.5 = 0.125.
2. What is the probability of getting at least one tail on three successive tosses of a fair coin?
Solution: P(at least one tail) = 1 − P(all heads) = 1 − (0.5 × 0.5 × 0.5) = 1 − 0.125 = 0.875.
3) Conditional Probabilities Under Statistical Independence.
Conditional probability is the probability that a second event (let's say B) will occur given that a first event (let's say A) has already happened.
Symbolically: P(B/A), read as "the probability of B given that event A has occurred".
- For statistically independent events, the conditional probability of event B given that
event A has occurred is simply the probability of event B:
P (B/A) = P (B)
- Thus, statistical independence can be defined symbolically as the condition in which
P (B/A) = P (B).
- Examples: What is the probability that the second toss of a fair coin will result in heads,
given that heads resulted on the first toss?
- Solution: In this case the two events are independent.
- Symbolically: the question is written as: P (H2/H1)
- Using the rule for conditional probability under statistical independence, P(H2/H1) = P(H2)
- P (H2/H1) = 0.5
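A short simulation makes the three probability types concrete. The sketch below is our own illustration (the seed and sample size are arbitrary), estimating the marginal, joint, and conditional probabilities for two tosses of a fair coin:

    import random

    random.seed(42)   # arbitrary seed, for a reproducible illustration
    N = 100_000
    tosses = [(random.choice("HT"), random.choice("HT")) for _ in range(N)]

    # Marginal: P(H1), heads on the first toss.
    p_h1 = sum(t1 == "H" for t1, _ in tosses) / N
    # Joint: P(H1∩H2), heads on both tosses.
    p_h1h2 = sum(t1 == "H" and t2 == "H" for t1, t2 in tosses) / N
    # Conditional: P(H2/H1) = P(H1∩H2) / P(H1).
    p_h2_given_h1 = p_h1h2 / p_h1

    print(p_h1)            # close to 0.5
    print(p_h1h2)          # close to 0.25 = 0.5 × 0.5
    print(p_h2_given_h1)   # close to 0.5, i.e. P(H2/H1) = P(H2)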
1.6 BAYES’ THEOREM
In our discussion of conditional probability, we indicated that revising probabilities when new
information is obtained is an important phase of probability analysis. Often, we begin our
analysis with initial or prior probability estimates for specific events of interest. Then, from
sources such as a sample, a special report, or some other means, we obtain some additional
information about the events. Given this new information, we update the prior probability values by calculating revised probabilities, referred to as posterior probabilities.
As an example, consider a manufacturing firm that receives parts from two suppliers. Let A1 denote the event that a part is from supplier 1 and A2 the event that a part is from supplier 2. Currently, 65% of the parts purchased are from supplier 1 and the remaining 35% are from supplier 2, so the prior probabilities are P(A1) = 0.65 and P(A2) = 0.35. The quality of the purchased parts varies with the source of supply. Historical data suggest that the quality ratings of the two suppliers are as shown in the table below.

                Percentage Good Parts    Percentage Bad Parts
Supplier 1               98                       2
Supplier 2               95                       5

Letting G denote the event that a part is good and B the event that a part is bad, the table gives the conditional probabilities P(G/A1) = 0.98, P(B/A1) = 0.02, P(G/A2) = 0.95, and P(B/A2) = 0.05. Now suppose a part is selected at random and found to be bad. What is the probability that it came from supplier 1? From supplier 2?
With the prior probabilities and the joint probabilities, Bayes' theorem can be used to answer these questions.
Letting B denote the event that the part is bad, we are looking for the posterior probabilities P(A1/B) and P(A2/B). From the law of conditional probability and marginal probability, we know that:
P(A1/B) = P(A1∩B) / P(B)
P(A1∩B) = P(A1) · P(B/A1) and P(A2∩B) = P(A2) · P(B/A2)
P(B) = P(A1∩B) + P(A2∩B)
P(B) = P(A1) P(B/A1) + P(A2) P(B/A2)
Substituting the above equations, we obtain Bayes' theorem for the case of two events:
P(A1/B) = P(A1∩B) / (P(A1∩B) + P(A2∩B)) = P(A1) P(B/A1) / (P(A1) P(B/A1) + P(A2) P(B/A2))
P(A2/B) = P(A2∩B) / (P(A1∩B) + P(A2∩B)) = P(A2) P(B/A2) / (P(A1) P(B/A1) + P(A2) P(B/A2))
P(A1/B) = (0.65 × 0.02) / ((0.65 × 0.02) + (0.35 × 0.05)) = 0.0130 / 0.0305 = 0.4262
P(A2/B) = (0.35 × 0.05) / ((0.65 × 0.02) + (0.35 × 0.05)) = 0.0175 / 0.0305 = 0.5738
Note that in this application we started with a probability of .65 that a part selected at random
was from supplier 1. However, given information that the part is bad, the probability that the
part is from supplier 1 drops to .4262. In fact, if the part is bad, there is a better than 50-50
chance that the part came from supplier 2; that is, P (A2/B) = .5738.
Bayes’ theorem is applicable when the events for which we want to compute posterior
probabilities are mutually exclusive and their union is the entire sample space. Bayes’ theorem
can be extended to the case where there are n mutually exclusive events A1, A2,…, An whose
union is the entire sample space. In such a case, Bayes' theorem for computing the posterior probability P(Ai/B) can be written symbolically as:
P(Ai/B) = P(Ai) P(B/Ai) / (P(A1) P(B/A1) + P(A2) P(B/A2) + ... + P(An) P(B/An))
Bayes' theorem calculations can also be carried out using a tabular approach or a tree diagram.
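The supplier calculation is easy to check in code. The sketch below is our own illustration (the function name posterior is not from the handout); it implements the n-event form of Bayes' theorem and reproduces the posterior probabilities computed above:

    def posterior(priors, likelihoods):
        # Bayes' theorem: P(Ai/B) from priors P(Ai) and likelihoods P(B/Ai).
        joints = [p * l for p, l in zip(priors, likelihoods)]
        total = sum(joints)   # P(B), by the law of total probability
        return [j / total for j in joints]

    # Supplier example: P(A1) = 0.65, P(A2) = 0.35; P(B/A1) = 0.02, P(B/A2) = 0.05.
    post = posterior([0.65, 0.35], [0.02, 0.05])
    print([round(p, 4) for p in post])   # [0.4262, 0.5738]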
Self-Test
1. Suppose two dice are rolled. What is the sample space? Identify the event, “dice sum to
seven.”
2. List the outcomes in the sample space for tossing a coin three times (use H for heads and
T for tails).
3. Using the sample space in #2, find the probabilities below as reduced fractions: a) Of
getting exactly one tail b) Of getting no heads c) Of getting all heads or all tails
4. During a sale at a men’s store, 16 white sweaters, 3 red sweaters, 9 blue sweaters, and 7
yellow sweaters were purchased. If a customer is selected at random, find the probability
that he bought: (as fractions) a) A blue sweater b) A yellow or white sweater
c) A sweater that was not white
5. When two dice are rolled, find the probability of getting: (as reduced fractions)
a) A sum of 5 or 6 b) A sum greater than 9 c) A sum less than 4 or greater than 9
d) A sum that is divisible by 4 e) A sum of 14 f) A sum less than 13
CHAPTER TWO
RANDOM VARIABLES AND PROBABILITY DISTRIBUTIONS
2.1 The Concept & Definition of a Random Variable
A random variable is a numerical description of the outcome of an experiment. Random
variables must have numerical values.
In effect, a random variable associates a numerical value with each possible experimental
outcome. The particular numerical value of the random variable depends on the outcome of the
experiment. A random variable can be classified as being either discrete or continuous depending
on the numerical values it assumes.
Discrete Random Variables
A random variable that may assume either a finite number of values or an infinite sequence of
values such as 0, 1, 2, . . . is referred to as a discrete random variable. For example, consider
the experiment of an accountant taking the certified public accountant (CPA) examination. The
examination has four parts. We can define a random variable as x = the number of parts of the
CPA examination passed. It is a discrete random variable because it may assume the finite
number of values 0, 1, 2, 3, or 4.
As another example of a discrete random variable, consider the experiment of cars arriving at a
tollbooth. The random variable of interest is x = the number of cars arriving during a one-day
period. The possible values for x come from the sequence of integers 0, 1, 2, and so on. Hence, x
is a discrete random variable assuming one of the values in this infinite sequence.
Although the outcomes of many experiments can naturally be described by numerical values,
others cannot. For example, a survey question might ask an individual to recall the message in a
recent television commercial. This experiment would have two possible outcomes:
The individual cannot recall the message and the individual can recall the message. We can still
describe these experimental outcomes numerically by defining the discrete random variable x as
follows: let x = 0 if the individual cannot recall the message and x = 1 if the individual can recall
the message. The numerical values for this random variable are arbitrary (we could use 5 and
10), but they are acceptable in terms of the definition of a random variable—namely, x is a
random variable because it provides a numerical description of the outcome of the experiment.
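In code, a random variable is literally a function from experimental outcomes to numbers; the sketch below (our own illustration, using a dictionary as the mapping) makes the message-recall example explicit:

    # A random variable maps each experimental outcome to a number:
    # x = 0 if the individual cannot recall the message, x = 1 if they can.
    x = {"cannot recall": 0, "can recall": 1}

    outcome = "can recall"   # one possible experimental outcome
    print(x[outcome])        # value of the random variable: 1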
Table 2.1 provides some additional examples of discrete random variables. Note that in each
example the discrete random variable assumes a finite number of values or an infinite sequence
of values such as 0, 1, 2, . . . . These types of discrete random variables are discussed in detail in
this chapter.
Table 2.1: Examples of discrete random variables
Note: One way to determine whether a random variable is discrete or continuous is to think of
the values of the random variable as points on a line segment. Choose two points representing
values of the random variable. If the entire line segment between the two points also represents
possible values for the random variable, then the random variable is continuous.
Properties of Variance
1. The variance of a constant is zero.
2. If X and Y are two independent random variables, then Var(X + Y) = Var(X) + Var(Y).
3. If b is a constant, then Var(X + b) = Var(X).
4. If a is a constant, then Var(aX) = a²·Var(X).
5. If X and Y are independent random variables and a and b are constants, then Var(aX + bY) = a²·Var(X) + b²·Var(Y).
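These properties are easy to sanity-check by simulation. The sketch below is our own illustration (sample variances only approximate the true values, and the constants a and b are arbitrary); it verifies properties 3 and 4 with NumPy:

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(loc=10, scale=3, size=100_000)   # Var(X) ≈ 9

    a, b = 2.0, 5.0
    print(np.var(x))       # ≈ 9
    print(np.var(x + b))   # ≈ 9: adding a constant leaves the variance unchanged
    print(np.var(a * x))   # ≈ 36 = a²·Var(X)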