
1 Interpretations of Probability

In addition to the many formal applications of probability theory, the concept of
probability enters our everyday life and conversations:

It probably will rain tomorrow.
It is very likely that the bus will arrive late.
The chances are good he will win the game.

Despite the fact that the concept of probability is such a common and natural
part of our experience, no single scientific interpretation of the term probability is
accepted by all statisticians.

Frequency interpretation: In many problems, the probability that some specific
outcome of a process will be obtained can be interpreted to mean the relative
frequency with which that outcome would be obtained if the process were
repeated a large number of times under similar conditions.

Example: the probability of obtaining a head when a coin is tossed is
approximately 1/2. Two difficulties with this interpretation are deciding what
counts as a "large number" of repetitions and what counts as "similar
conditions."
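
The frequency interpretation is easy to illustrate by simulation. The following
sketch (an added illustration, not part of the original notes) tosses a simulated
fair coin and prints the relative frequency of heads, which settles near 1/2 as the
number of tosses grows:

    import random

    def head_frequency(n_tosses):
        # Count simulated heads: each toss is heads with probability 1/2.
        heads = sum(random.random() < 0.5 for _ in range(n_tosses))
        return heads / n_tosses

    for n in (10, 100, 10_000, 1_000_000):
        print(n, "tosses:", head_frequency(n))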

The frequency interpretation applies only to a problem in which there can be, at
least in principle, a large number of similar repetitions of a certain process. Many
important problems are not of this type. For example, consider the probability
that a particular medical research project will lead to the development of a new
treatment for a certain disease within a specified period of time.

Remark: The mathematical theory of probability can be usefully applied,
regardless of which interpretation of probability is used in a particular problem.
The theories and techniques that will be presented in this book have served as
valuable guides and tools in almost all aspects of the design and analysis of
effective experimentation.
2 Experiments and Events

The term experiment is used in probability theory to describe virtually every
process whose outcome is not known in advance with certainty. Here are some
examples.

(1) In an experiment in which a coin is to be tossed 10 times, the experimenter
might want to determine the probability that at least four heads will be obtained.

(2) In an experiment in which the air temperature at a certain location is to be
observed every day at noon for 90 successive days, a person might want to
determine the probability that the average temperature during this period will be
less than some specified value.

(3) In evaluating an industrial research and development project at a certain
time, a person might want to determine the probability that the project will result
in the successful development of a new product within a specified number of
months.

The interesting feature of an experiment is that each of its possible outcomes can
be specified before the experiment is performed, and probabilities can be
assigned to various combinations of outcomes that are of interest.
3 Set Theory

3.1 The sample space

The collection of all possible outcomes of an experiment is called the sample
space of the experiment. Each possible outcome is called a point, or an element,
of the sample space. A sample space can be classified into two types:
countable and uncountable. A sample space is countable if it is finite or if its
elements can be put in one-to-one correspondence with the set of natural
numbers {1, 2, 3, . . .}. Otherwise, it is uncountable.

3.2 Relations of set theory

Let S denote the sample space of some experiment. Then each possible outcome
s of the experiment is said to be a member of the space S, or to belong to the
space S. The statement that s is a member of S is denoted symbolically by the
relation s ∈ S.
An event C is a subset of the possible outcomes in the space S. We say the
event C occurs if the outcome of the experiment is in the set C.

Example 3.1 A six-sided die is rolled. The sample space is S = {1, 2, 3, 4, 5, 6}. Let A =
{2, 4, 6} denote the event that an even number is obtained, and B = {4, 5, 6} denote
the event that a number greater than 3 is obtained. For instance, A occurs if the
number 2 is rolled.

It is said that an event A is contained in another event B if every outcome that
belongs to the subset defining the event A also belongs to the subset defining
the event B. The relation is denoted by A ⊂ B. Mathematically, we have

A ⊂ B ⟺ (x ∈ A ⇒ x ∈ B)   (containment)

If two events are so related that A ⊂ B and B ⊂ A, then A = B, which means
that A and B must contain exactly the same sample points. Mathematically, we
have

A = B ⟺ A ⊂ B and B ⊂ A   (equality)

A subset of S that contains no outcomes is called the empty set, or null set, and
it is denoted by the symbol ∅. For each event A, it is true that ∅ ⊂ A ⊂ S.
3.3 Operations of set theory
Given any two events (or sets) A and B, we have the following elementary set
operations:
Union: The union of A and B, written A ∪ B, is the set of elements that belong to
either A or B or both:
A ∪ B = {x : x ∈ A or x ∈ B}
Intersection: The intersection of A and B, written A ∩ B, is the set of elements that
belong to both A and B:
A ∩ B = {x : x ∈ A and x ∈ B}
Complement: The complement of A, written A^c, is the set of all elements that are not
in A:

A^c = {x : x ∉ A}
Disjoint events

It is said that two events A and B are disjoint, or mutually exclusive, if A and B
have no outcomes in common. It follows that A and B are disjoint if and only if
A ∩ B = ∅.
The events A1, . . . , An or the events A1, A2, . . . are disjoint if for every i ≠ j, we
have that Ai and Aj are disjoint, that is, Ai ∩ Aj = ∅ for all i ≠ j.
Example 3.2 (Event operations) Consider the experiment of selecting a card at
random from a standard deck and noting its suit: clubs (C), diamonds (D), hearts
(H), or spades (S). The sample space is

S = {C, D, H, S},

and some possible events are

A = {C, D} and B = {D, H, S}.

From these events we can form

A ∪ B = {C, D, H, S}, A ∩ B = {D}, and A^c = {H, S}.

Furthermore, notice that A ∪ B = S and (A ∪ B)^c = ∅.
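
Python's built-in sets mirror these operations directly; the following sketch
(an added illustration, using the events of Example 3.2) checks each result:

    S = {"C", "D", "H", "S"}   # sample space: the four suits
    A = {"C", "D"}
    B = {"D", "H", "S"}

    print(A | B)           # union: {'C', 'D', 'H', 'S'}
    print(A & B)           # intersection: {'D'}
    print(S - A)           # complement of A within S: {'H', 'S'}
    print((A | B) == S)    # True, so the complement of A ∪ B is empty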


4 The Definition of Probability

4.1 Axioms and Basic Theorems

In order to satisfy the mathematical definition of probability, P(A), the probability
assigned to event A, must satisfy three specific axioms.

Axiom 1 For every event A, P(A) ≥ 0. In words, the probability of every event must
be nonnegative.
Axiom 2 P(S) = 1. If an event is certain to occur, then the probability of that
event is 1.

Axiom 3 For every infinite sequence of disjoint events A1, A2, . . .,

P(A1 ∪ A2 ∪ · · ·) = P(A1) + P(A2) + · · ·

The mathematical definition of probability can now be given as follows: A
probability distribution, or simply a probability, on a sample space S is a
specification of numbers P(A) that satisfy Axioms 1, 2, and 3.
Some Theorems

Theorem 4.1 P(∅) = 0.

Theorem 4.2 For every finite sequence of n disjoint events A1, . . . , An,
P(A1 ∪ · · · ∪ An) = P(A1) + · · · + P(An).

Theorem 4.3 For every event A, P(A^c) = 1 − P(A).

Theorem 4.4 If A ⊂ B, then P(A) ≤ P(B).

Theorem 4.5 For every event A, 0 ≤ P(A) ≤ 1.

Theorem 4.6 For every two events A and B,

P(A ∪ B) = P(A) + P(B) − P(AB).

5 Finite sample spaces

5.1 Requirements of probabilities

In an experiment of which the sample space contains only a finite number of
points s1, . . . , sn, a probability distribution on S is specified by assigning a
probability pi to each point si ∈ S. In order to satisfy the axioms for a probability
distribution, the numbers p1, . . . , pn must satisfy the following two conditions:

pi ≥ 0, for i = 1, . . . , n

and

p1 + · · · + pn = 1.

The probability of each event A can then be found by adding the probabilities pi
of all outcomes si that belong to A.

Example 5.1 Consider an experiment in which five fibers having different lengths
are subject to a testing process to learn which fiber will break first. Suppose that
the lengths of the five fibers are 1, 2, 3, 4, and 5 inches, respectively. Suppose also
that the probability that any given fiber will be the first to break is proportional to
the length of that fiber. Since the lengths sum to 15, the probability that the fiber
of length i will be the first to break is pi = i/15.
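
A short numerical check of this assignment (an added sketch): the probabilities
are the lengths divided by their total, and they sum to 1 as the axioms require.

    lengths = [1, 2, 3, 4, 5]
    total = sum(lengths)                               # 15
    probs = [length / total for length in lengths]     # p_i = i/15
    print(probs)
    print(sum(probs))                                  # 1.0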

A sample space S containing n outcomes s1, . . . , sn is called a simple sample
space if the probability assigned to each of the outcomes s1, . . . , sn is 1/n. If an
event A in this simple sample space contains exactly m outcomes, then

P(A) = m/n.

Example 5.2 (Tossing coins) Suppose that three fair coins are tossed
simultaneously. We shall determine the probability of obtaining exactly two
heads. The simple sample space contains 2^3 = 8 equally likely outcomes, of
which exactly three (HHT, HTH, THH) contain two heads, so the probability is 3/8.
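
The same answer can be obtained by enumerating the simple sample space
(an added sketch):

    from itertools import product

    outcomes = list(product("HT", repeat=3))   # 8 equally likely outcomes
    favorable = [s for s in outcomes if s.count("H") == 2]
    print(len(favorable) / len(outcomes))      # 3/8 = 0.375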

6 Counting Methods

In many experiments, the number of outcomes in S is so large that a complete
listing of these outcomes is too expensive, too slow, or too likely to be incorrect
to be useful. In such an experiment, it is convenient to have a method of
determining the total number of outcomes in the space S and in various events
in S without compiling a list of all these outcomes.
6.1 The multiplication rule

If an experiment is performed in two parts, where the first part has m possible
outcomes and, for each of these, the second part has n possible outcomes, then
the experiment has mn possible outcomes. For example, suppose that there are
three different routes from city A to city B and five different routes from city B
to city C. Then the number of different routes from A to C that pass through B
is 3 × 5 = 15.
6.2 Permutations

Suppose that k cards are to be selected one at a time and removed from a deck
of n cards (k = 1, 2, . . . , n). Then each possible outcome of this experiment will
be a permutation of k cards from the deck, and the total number of these
permutations will be Pn,k = n(n − 1) · · · (n − k + 1). This number Pn,k is called the
number of permutations of n elements taken k at a time.

If k = n, then Pn,n = n(n − 1) · · · 1 = n!. Note that 0! = 1.

Example 6.1 (Arranging books) Suppose that six different books are to be
arranged on a shelf. The number of possible permutations of the books is 6! =
720.

Example 6.2 (Ordered sampling without replacement) Suppose that a club
consists of 25 members, and that a manager and a secretary are to be chosen
from the membership. The number of all possible choices is

P25,2 = 25 × 24 = 600.
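
Both examples can be checked with math.perm (available in Python 3.8 and
later), which computes Pn,k directly; this is an added sketch:

    import math

    print(math.perm(6, 6))    # 720 = 6!, matching Example 6.1
    print(math.perm(25, 2))   # 600 = 25 × 24, matching Example 6.2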

Example 6.3 (Ordered sampling with replacement) Consider the following
experiment: A box contains n balls numbered 1, . . . , n. First, one ball is selected at
random from the box and its number is noted. This ball is then put back in the
box and another ball is selected. In this way, any desired number k of balls can
be selected. This process is called ordered sampling with replacement. The
sample space of this experiment will contain all vectors of the form (x1, . . . , xk),
where xi is the outcome of the ith selection. The total number of vectors in S is
n^k. The probability of the event that each of the k balls that are selected will
have a different number is Pn,k / n^k.
Example 6.4 (The birthday problem) The problem is to determine the probability p
that at least two people in a group of k people (2 ≤ k ≤ 365) will have the same
birthday. Under appropriate assumptions (independence and uniformity), the
probability that all k persons will have different birthdays is P365,k / 365^k. The
probability that at least two of the people will have the same birthday is
therefore

p = 1 − P365,k / 365^k.
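
A small sketch (added here) evaluates this probability for several group sizes;
k = 23 is the smallest group for which it exceeds 1/2:

    import math

    def p_shared_birthday(k):
        # 1 minus the probability that all k birthdays are different.
        return 1.0 - math.perm(365, k) / 365**k

    for k in (10, 22, 23, 50):
        print(k, round(p_shared_birthday(k), 4))   # 23 -> 0.5073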

7 Combinatorial Methods

7.1 Combinations

Suppose that there is a set of n distinct elements from which it is desired to
choose a subset containing k elements (1 ≤ k ≤ n). The number of different
subsets that can be chosen is

Cn,k = Pn,k / k! = n! / (k!(n − k)!).

The number Cn,k is also denoted by the symbol (n choose k). When this notation
is used, this number is called a binomial coefficient because it appears in the
binomial theorem.

Binomial theorem. For all numbers x and y and each positive integer n,

(x + y)^n = Σ_{k=0}^{n} Cn,k x^k y^(n−k).
Examples

Example 7.1 (Unordered sampling without replacement) Suppose that a class
contains 15 boys and 30 girls, and that 10 students are to be selected for a
special assignment. We shall determine the probability that exactly 3 boys will
be selected.

The desired probability is

C15,3 C30,7 / C45,10 ≈ 0.2904.
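
The arithmetic can be confirmed with math.comb (an added sketch):

    import math

    p = math.comb(15, 3) * math.comb(30, 7) / math.comb(45, 10)
    print(round(p, 4))   # 0.2904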

In summary, we have the following table for counting the number of ways to
draw k items from n:

                    With replacement      Without replacement
Ordered             n^k                   Pn,k = n!/(n − k)!
Unordered           Cn+k−1,k              Cn,k = n!/(k!(n − k)!)

Example 7.3 (Unordered sampling with replacement) A restaurant has n items on
its menu. During a particular day, k customers will arrive and each one will
choose one item. The manager wants to count how many different collections of
customer choices are possible without regard to the order in which the choices
are made. The number of such collections is Cn+k−1,k.

1 Conditional Probability

A major use of probability in statistical inference is the updating of probabilities
when certain events are observed. The updated probability of event A after we
learn that event B has occurred is the conditional probability of A given B. The
notation for this conditional probability is P(A | B). If P(B) > 0, it is defined as

P(A | B) = P(AB) / P(B).

Example 1.1 (Rolling dice) Suppose that two dice were rolled and it was observed
that the sum T of the two numbers was odd. We shall determine the probability
that T was less than 8. Of the 18 outcomes with odd sum, 12 have sum less than
8, so the probability is 12/18 = 2/3.
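
Enumeration confirms the answer (an added sketch):

    from itertools import product

    rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes
    odd = [r for r in rolls if sum(r) % 2 == 1]    # the event that T is odd
    small = [r for r in odd if sum(r) < 8]         # T odd and T < 8
    print(len(small) / len(odd))                   # 12/18 = 2/3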
1.1 The multiplication rule for conditional probability

The definition of conditional probability implies the following multiplication rule:

P(AB) = P(B)P(A | B)

or

P(AB) = P(A)P(B | A).

Example 1.3 Suppose that two balls are to be selected at random, without
replacement, from a box containing r red balls and b blue balls. The probability
that the first ball will be red and the second ball will be blue is

(r / (r + b)) × (b / (r + b − 1)).
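
As a numerical sketch (the values r = 5 and b = 3 are illustrative choices, not
from the original):

    def p_red_then_blue(r, b):
        # Multiplication rule: P(first red) times P(second blue | first red).
        return (r / (r + b)) * (b / (r + b - 1))

    print(p_red_then_blue(5, 3))   # 15/56, approximately 0.268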

2 Independent Events

Definition 2.1 Two events A and B are independent if P(AB) = P(A)P(B).

Suppose that P(A) > 0 and P(B) > 0. Then it follows from the definitions of
independence and conditional probability that A and B are independent if and
only if:
P(A | B) = P(A) and
P(B | A) = P(B);
that is, learning that B (or A) occurred does not change the probability of A (or B).

Example 2.1 (Machine operations) Suppose that two machines 1 and 2 in a factory
are operated independently of each other. Let A be the event that machine 1 will
become inoperative during a given 8-hour period; let B be the event that
machine 2 will become inoperative during the same period; and suppose that
P(A) = 1/3 and P(B) = 1/4. What is the probability that both machines will become
inoperative during the period? By independence, P(AB) = P(A)P(B) = (1/3)(1/4) = 1/12.

Theorem 2.1 If two events A and B are independent, then the events A and B^c
are also independent.

Definition 2.2 The k events A1, . . . , Ak are independent if, for every subset Ai1, . . . , Aij
of j of these events (j = 2, 3, . . . , k), P(Ai1 ∩ · · · ∩ Aij) = P(Ai1) · · · P(Aij).

In particular, in order for three events A, B, and C to be independent, the
following four relations must be satisfied:
P(AB) = P(A)P(B)
P(AC) = P(A)P(C)
P(BC) = P(B)P(C)
P(ABC) = P(A)P(B)P(C).
Example 2.3

Suppose that a machine produces a defective item with probability p (0 < p < 1)
and produces a nondefective item with probability q = 1 − p. Suppose further that
six items produced by the machine are selected at random and inspected, and
that the outcomes for these six items are independent. Let Di be the event that
the ith item is nondefective. The probability that at least one of the six items will
be defective is

1 − P(D1D2D3D4D5D6) = 1 − q^6.

The probability that exactly two of the six items are defective is

C6,2 p^2 q^4 = 15 p^2 q^4.

Example 2.5 Suppose that the probability p that the machine produces a
defective item takes one of two values, either 0.01 or 0.4, the first corresponding
to normal operation and the second corresponding to a need for maintenance.
Let B1 be the event that p = 0.01, and let B2 be the event that p = 0.4. If we knew
that B1 had occurred, then we would proceed under the assumption that the six
items were produced independently, each being defective with probability 0.01.
Let A be the event that we observe exactly two defectives in a random sample
of six items. Then

P(A | B1) = C6,2 (0.01)^2 (0.99)^4 ≈ 0.0014 and P(A | B2) = C6,2 (0.4)^2 (0.6)^4 ≈ 0.3110.
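
These two conditional probabilities are easy to compute (an added sketch):

    import math

    def p_exactly_two(p):
        # Binomial probability of exactly 2 defectives among 6 independent items.
        return math.comb(6, 2) * p**2 * (1 - p)**4

    print(p_exactly_two(0.01))   # P(A | B1), approximately 0.0014
    print(p_exactly_two(0.4))    # P(A | B2), approximately 0.3110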

1 Random variables and discrete distributions

Definition 1.1 Consider an experiment for which the sample space is S. A random
variable is a function from the sample space into the real numbers. In other
words, in a particular experiment a random variable X would be some function
that assigns a real number X(s) to each possible outcome s ∈ S.
Example 1.1 Consider an experiment in which a coin is tossed 10 times. In this
experiment the sample space can be regarded as the set of outcomes consisting
of the 2^10 different sequences of 10 heads and tails that are possible, and the
random variable X could be the number of heads obtained on the 10 tosses. For
each possible sequence s consisting of 10 heads and tails, this random variable
would assign a number X(s) equal to the number of heads in the sequence. Thus,
if s is the sequence HHTTTHTTTH, then X(s) = 4.
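
In code, this random variable is just a function on outcomes (an added sketch):

    def X(s):
        # Number of heads in a sequence of H/T tosses.
        return s.count("H")

    print(X("HHTTTHTTTH"))   # 4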

The distribution of a random variable

When a probability distribution has been specified on the sample space of an
experiment, we can determine a probability distribution for the possible values of
each random variable X. Let A be a subset of the real line, and let P(X ∈ A)
denote the probability that the value of X will belong to the subset A. Then P(X ∈
A) is equal to the probability that the outcome s of the experiment will be such
that X(s) ∈ A. In symbols,

P(X ∈ A) = P({s : X(s) ∈ A}).
Example 1.3

Consider again an experiment in which a coin is tossed 10 times, and let X be
the number of heads that are obtained. In this experiment the possible values of
X are 0, 1, 2, . . . , 10. Then

P(X = x) = C10,x (1/2)^10, for x = 0, 1, . . . , 10.

Discrete distributions (Discrete Random Variable)

It is said that a random variable X has a discrete distribution or that X is a
discrete random variable if X can take only a finite number k of different values
x1, . . . , xk or, at most, an infinite sequence of different values x1, x2, . . ..

Continuous distributions (Continuous Random Variable)

Random variables that can take every value in an interval are called continuous
random variables.

The probability function (abbreviated p.f.)

If a random variable X has a discrete distribution, the probability function
(abbreviated p.f.) of X is defined as the function f such that for every
real number x, f(x) = P(X = x). For every point x that is not one of the possible
values of X, it is evident that f(x) = 0. Also, if the sequence x1, x2, . . . includes all
the possible values of X, then

Σ_i f(xi) = 1.

The uniform distribution on integers

Suppose that the value of a random variable X is equally likely to be each of the
k integers 1, 2, . . . , k. Then the p.f. of X is as follows:

f(x) = 1/k for x = 1, . . . , k, and f(x) = 0 otherwise.

The Binomial distribution

Suppose that a certain machine produces a defective item with probability p
(0 < p < 1) and produces a nondefective item with probability q = 1 − p.
Suppose further that n independent items produced by the machine are
examined, and let X denote the number of these items that are defective. Then
the random variable X will have a discrete distribution, and the possible values
of X will be 0, 1, . . . , n. The p.f. is

P(X = x) = Cn,x p^x q^(n−x), for x = 0, 1, . . . , n.
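
The binomial p.f. is a one-line function (an added sketch); evaluated at n = 10,
p = 1/2, x = 4 it also gives P(X = 4) for the coin-tossing Example 1.3 above:

    import math

    def binomial_pf(x, n, p):
        # P(X = x) = C(n, x) p^x (1 - p)^(n - x)
        return math.comb(n, x) * p**x * (1 - p)**(n - x)

    print(binomial_pf(4, 10, 0.5))   # 210/1024, approximately 0.2051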
2 Continuous Distributions

The probability density function. It is said that a random variable X has a
continuous distribution or that X is a continuous random variable if there exists a
nonnegative function f, defined on the real line, such that for every subset A of
the real line, the probability that X takes a value in A is the integral of f over the
set A. In this course, we shall be concerned primarily with subsets that are
intervals. For an interval (a, b],

P(a < X ≤ b) = ∫_a^b f(x) dx.

The function f is called the probability density function (pdf) of X. Every pdf f
must satisfy the following two requirements:

f(x) ≥ 0, for all x,

and

∫_{−∞}^{∞} f(x) dx = 1.
The uniform distribution on an interval

Let a and b be two given real numbers such that a < b, and consider an
experiment in which a point X is selected from the interval S = {x : a ≤ x ≤ b} in
such a way that the probability that X will belong to each subinterval of S is
proportional to the length of that subinterval. This distribution of the random
variable X is called the uniform distribution on the interval [a, b].

Here X represents the outcome of an experiment that is often described by
saying that a point is chosen at random from the interval [a, b]. The pdf is
constant on [a, b]: f(x) = 1/(b − a) for a ≤ x ≤ b, and f(x) = 0 otherwise.

For example, if a random variable X has a uniform distribution on the interval
[1, 4], then the pdf of X is

f(x) = 1/3 for 1 ≤ x ≤ 4, and f(x) = 0 otherwise.

Example 2.2

Suppose that the pdf of a certain random variable X has the following form
3 The Distribution Function

Definition and basic properties. The cumulative distribution function (cdf) F of a
random variable X is a function defined for each real number x as follows:

F(x) = P(X ≤ x), for −∞ < x < ∞.

It should be emphasized that the cdf is defined in this way for every random
variable X, regardless of whether the distribution of X is discrete, continuous, or
mixed. The cdf of every random variable X must have the following three
properties:

Property 3.1 The function F(x) is nondecreasing as x increases; that is, if x1 < x2,
then F(x1) ≤ F(x2).

1 The Expectation of a Random Variable

Expectation for a discrete distribution

Suppose that a random variable X has a discrete distribution for which the p.f. is
f. The expectation of X, denoted by E(X), is the number

E(X) = Σ_x x f(x).

Example 1.1 Suppose that a random variable X can have only the four different
values −2, 0, 1, and 4, and that P(X = −2) = 0.1, P(X = 0) = 0.4, P(X = 1) = 0.3,
and P(X = 4) = 0.2. Then

E(X) = −2(0.1) + 0(0.4) + 1(0.3) + 4(0.2) = 0.9.
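
A quick numerical check of Example 1.1 (an added sketch):

    values = [-2, 0, 1, 4]
    probs = [0.1, 0.4, 0.3, 0.2]
    print(sum(v * p for v, p in zip(values, probs)))   # approximately 0.9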
Expectation for a continuous distribution

If a random variable X has a continuous distribution for which the pdf is f, then
the expectation E(X) is defined as

E(X) = ∫_{−∞}^{∞} x f(x) dx.

Example 1.2 Suppose that the pdf of a random variable X is

Interpretation of the expectation

The number E(X) is also called the expected value of X or the mean of X; and the
terms expectation, expected value, and mean can be used interchangeably.

3 Variance

Definitions of the variance and the standard deviation

Suppose that X is a random variable with mean μ = E(X). The variance of X,
denoted by Var(X), is defined as

Var(X) = E[(X − μ)^2].

The standard deviation of X is the nonnegative square root of Var(X).

5 The Median

Definition 5.1 For each random variable X, a median of the distribution of X is
defined as a number m such that P(X ≤ m) ≥ 1/2 and P(X ≥ m) ≥ 1/2.
Gaussian pdf

If X represents the sum of N independent random components, and if each
component makes only a small contribution to the sum, then the CDF of X
approaches a Gaussian CDF as N becomes large, regardless of the distribution of
the individual components (this is the content of the central limit theorem).
Random Processes (Stochastic Processes)

A random process is a random variable that changes with time.

Example: the set of voltages generated by thermal electron motion in a large
number of identical resistors.

Ergodicity

A random process is ergodic if all time averages of sample functions equal the
corresponding ensemble averages. All ensemble averages of an ergodic process
are independent of time.
