Module 2 Introduction To Probability and Statistics
2.1 Introduction
Probability and statistics are concerned with events that occur by chance. Examples
include the occurrence of accidents, errors of measurement, the production of defective and
non-defective items on a production line, and various games of chance, such as drawing a
card from a well-mixed deck, flipping a coin, or throwing a symmetrical six-sided die. In
each case we may have some knowledge of the likelihood of various possible results, but
we cannot predict with any certainty the outcome of any particular trial. Probability and
statistics are used throughout engineering. In electrical engineering, signals and noise are
analyzed by means of probability theory. Civil, mechanical, and industrial engineers use
statistics and probability to test and account for variations in materials and goods.
Chemical engineers use probability and statistics to assess experimental data and control
and improve chemical processes. It is essential for today’s engineer to master these tools.
Chance
- A necessary part of any process to be described by probability or
statistics. Sometimes that element of chance is due partly or even perhaps
entirely to our lack of knowledge of the details of the process. For
example, if we had complete knowledge of the composition of every part
of the raw materials used to make bolts, and of the physical processes and
conditions in their manufacture, in principle we could predict the
diameter of each bolt. But in practice we generally lack that complete
knowledge, so the diameter of the next bolt to be produced is an
unknown quantity described by a random variation. Under these
conditions the distribution of diameters can be described by probability
and statistics. If we want to improve the quality of those bolts and to
make them more uniform, we will have to look into the causes of the
variation and make changes in the raw materials or the production
process. But even after that, there will very likely be a random variation
in diameter that can be described statistically.
Probability
- An area of study which involves predicting the relative likelihood of
various outcomes. It is a mathematical area which has developed over the
past three or four centuries. One of the early uses was to calculate the
odds of various gambling games. Its usefulness for describing errors of
scientific and engineering measurements was soon realized. Engineers
study probability for its many practical uses, ranging from quality control
Statistics
- A word with a variety of meanings. To the layperson it most often
means simply a collection of numbers, such as the number of people
living in a country or city, a stock exchange index, or the rate of inflation.
These all come under the heading of descriptive statistics, in which items
are counted or measured and the results are combined in various ways to
give useful results.
- A branch of applied mathematics dealing with the collection,
organization, analysis, interpretation, and presentation of data.
- Descriptive statistics summarizes data. It involves using numbers to
describe features of data. For example, the average height of women in
the United States is a descriptive statistic: it describes a feature (average
height) of a population (women in the United States).
- Inferential statistics draws conclusions and makes predictions from data.
As an example, the size of an animal depends on many factors. Some of
these factors are controlled by the environment, but others are inherited.
A biologist might therefore build a model that says that offspring are
likely to be small if the parents were small. Such a model will probably
predict size better than guessing at random. Likewise, whether a certain
drug can cure a certain condition or disease is usually tested by comparing
the results for people who are given the drug with the results for those
who are given a placebo.
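As a small illustration of descriptive statistics, the sketch below summarizes a sample with a few numbers. The bolt diameters are hypothetical values invented for this example, not data from the module, and Python's standard statistics module is only one of several ways to compute these summaries.

```python
import statistics

# Hypothetical sample of bolt diameters in millimetres (illustrative values only).
diameters_mm = [9.98, 10.02, 10.01, 9.97, 10.03, 10.00, 9.99]

# Descriptive statistics: numbers that summarize features of the sample.
mean_mm = statistics.mean(diameters_mm)    # typical (central) diameter
stdev_mm = statistics.stdev(diameters_mm)  # sample standard deviation (spread)
smallest_mm = min(diameters_mm)
largest_mm = max(diameters_mm)

print(f"mean  = {mean_mm:.3f} mm")
print(f"stdev = {stdev_mm:.3f} mm")
print(f"range = {smallest_mm:.2f} mm to {largest_mm:.2f} mm")
```

The mean and standard deviation describe only this sample. Using them to say something about all bolts from the production line would be a step into inferential statistics.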
In the study of statistics, we are concerned basically with the presentation and
interpretation of chance outcomes that occur in a planned study or scientific investigation.
For example, we may record the number of accidents that occur monthly at the intersection
of Driftwood Lane and Royal Oak Drive, hoping to justify the installation of a traffic light;
we might classify items coming off an assembly line as “defective” or “non-defective”; or we
may be interested in the volume of gas released in a chemical reaction when the
concentration of an acid is varied. Hence, the statistician is often dealing with either
numerical data, representing counts or measurements, or categorical data, which can be
classified according to some criterion.
The set of all possible outcomes of a statistical experiment is called the sample space and is
represented by the symbol S.
Each outcome in a sample space is called an element or a member of the sample space, or
simply a sample point. If the sample space has a finite number of elements, we may list the
members separated by commas and enclosed in braces. Thus, the sample space S, of
possible outcomes when a coin is flipped, may be written:
S = {H, T}
Sample spaces with a large or infinite number of sample points are best described
by a statement or rule method. For example, if the possible outcomes of an experiment are
the set of cities in the world with a population over 1 million, our sample space is written:
S = {x | x is a city with a population over 1 million}
which reads “S is the set of all x such that x is a city with a population over 1 million.” The
vertical bar is read “such that.”
Similarly, if S is the set of all points (x, y) on the boundary or the interior of a circle
of radius 2 with center at the origin, we write the rule:
S = {(x, y) | x^2 + y^2 ≤ 4}
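A sample space given by a rule cannot be listed element by element, but the rule itself can be written as a membership test. The following is a minimal sketch for the circle of radius 2; the function name in_sample_space is an assumption made for this illustration.

```python
def in_sample_space(x: float, y: float) -> bool:
    """Return True if (x, y) lies on the boundary or in the interior
    of the circle of radius 2 centred at the origin."""
    return x**2 + y**2 <= 4


# Three test points: one inside, one on the boundary, one outside.
for point in [(1.0, 1.0), (2.0, 0.0), (2.0, 1.0)]:
    print(point, in_sample_space(*point))
```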
S1 = {1, 2, 3, 4, 5, 6}
If we are interested only in whether the number is even or odd, the sample space is simply:
S2 = {even, odd}
Example 1-1 illustrates the fact that more than one sample space can be used to describe
the outcomes of an experiment. In this case, S1 provides more
information than S2. If we know which element in S1 occurs, we can
tell which outcome in S2 occurs; however, knowledge of what happens
in S2 is of little help in determining which element in S1 occurs. In
general, it is desirable to use the sample space that gives the most
information concerning the outcomes of the experiment. In some
experiments, it is helpful to list the elements of the sample space
systematically by means of a tree diagram.
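The relationship between S1 and S2 can be sketched as a many-to-one mapping: each element of S1 determines exactly one element of S2, but not the other way round. The helper name to_s2 is an assumption made for this illustration.

```python
S1 = {1, 2, 3, 4, 5, 6}
S2 = {"even", "odd"}


def to_s2(outcome: int) -> str:
    """Map an outcome in S1 to the corresponding outcome in S2."""
    return "even" if outcome % 2 == 0 else "odd"


# For each outcome in S2, collect every outcome in S1 that maps to it.
preimages = {
    label: sorted(x for x in S1 if to_s2(x) == label)
    for label in sorted(S2)
}

for label, outcomes in preimages.items():
    print(label, "<-", outcomes)
```

Each outcome in S2 corresponds to three outcomes in S1, which is why knowing the S2 outcome does not identify the S1 outcome.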
Example 2: Two gas stations are located at a certain intersection. Each one has six gas
pumps. Consider the experiment in which the number of pumps in use at a particular time
of day is determined for each of the stations. An experimental outcome specifies how many
pumps are in use at the first station and how many are in use at the second one. One
possible outcome is (2, 2), another is (4, 1), and yet another is (1, 4). The 49 outcomes are
displayed in the accompanying table.
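The 49 outcomes can be listed systematically. The sketch below assumes that each station can have anywhere from 0 to 6 pumps in use, which is what gives 7 x 7 = 49 ordered pairs.

```python
from itertools import product

# Each station can have 0, 1, ..., 6 pumps in use.
pumps_in_use = range(7)

# An outcome is an ordered pair (pumps in use at station 1, pumps in use at station 2).
S = list(product(pumps_in_use, repeat=2))

# Print the outcomes as a table: one row per value for the first station.
for first in pumps_in_use:
    print(" ".join(f"({first}, {second})" for second in pumps_in_use))

print("number of outcomes:", len(S))
```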
2.3 Events
For any given experiment, we may be interested in the occurrence of certain events
rather than in the occurrence of a specific element in the sample space. An event is any
collection (subset) of outcomes contained in the sample space. A simple event consists of
exactly one outcome and a compound event consists of more than one outcome. It is
conceivable that an event may be a subset that includes the entire sample space S or a
subset of S called the null set (∅), which contains no elements at all.
Example 3: Given the sample space S = {t | t ≥ 0}, where t is the life in years of a certain
electronic component, then the event A that the component fails before the end of the fifth
year is the subset A = {t | 0 ≤ t < 5}.
Example 4: If we let A be the event of detecting a microscopic organism by the naked eye in
a biological experiment, then A = ∅.
Suppose that when the experiment is performed, the outcome is LLL. Then the simple event
E1 has occurred, and so have the events B and C (but not A).
Example 6: If B = {x | x is an even factor of 7}, then B must be the null set, since the only
possible factors of 7 are the odd numbers 1 and 7.
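Example 6 can be checked directly by building the set from its rule. This is a minimal sketch using set comprehensions; the variable names are chosen for this illustration.

```python
n = 7

# All positive factors of n.
factors = {d for d in range(1, n + 1) if n % d == 0}

# B = {x | x is an even factor of 7}
B = {d for d in factors if d % 2 == 0}

print("factors of 7:", sorted(factors))
print("B is the null set:", B == set())
```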
The complement of an event A with respect to S is the subset of all elements of S that are not
in A. We denote the complement of A by the symbol A’.
Example 7: Let R be the event that a red card is selected from an ordinary deck of 52 playing cards,
and let S be the entire deck. Then R’ is the event that the card selected from the deck is not a red
card but a black card.
Example 8: Consider the sample space S = {book, cell phone, mp3, paper, stationery, laptop}. Let A
= {book, stationery, laptop, paper}. Then the complement of A is A’ = {cell phone, mp3}.
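The complement in Example 8 corresponds to a set difference: the elements of S that are not in A. A minimal sketch:

```python
S = {"book", "cell phone", "mp3", "paper", "stationery", "laptop"}
A = {"book", "stationery", "laptop", "paper"}

# The complement of A with respect to S: elements of S that are not in A.
A_complement = S - A

print(sorted(A_complement))
```

Taking the complement twice returns the original event, that is, S - (S - A) equals A whenever A is a subset of S.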
A diagram, called a tree diagram because of its appearance, is often used in connection
with the above principle.
By proceeding along all paths, we see that the sample space is:
S = {HH, HT, T1, T2, T3, T4, T5, T6}; 8 possible outcomes
Now, the various paths along the branches of the tree give the distinct sample
points. Starting with the first path, we get the sample point DDD, indicating the possibility
that all three items inspected are defective. As we proceed along the other paths, we see
that the sample space is:
S = {DDD, DDN, DND, DNN, NDD, NDN, NND, NNN}; 8 possible outcomes
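Following every path of the tree is the same as forming every sequence of three choices from D (defective) and N (non-defective). A minimal sketch using itertools.product from the Python standard library:

```python
from itertools import product

# Each inspected item is either defective (D) or non-defective (N).
labels = "DN"

# One sample point per path through a three-level tree.
S = ["".join(path) for path in product(labels, repeat=3)]

print(S)
print("number of sample points:", len(S))
```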
Example 11: A family has just moved to a new city and requires the services of both an
obstetrician and a pediatrician. There are two easily accessible medical clinics, each having
two obstetricians and three pediatricians. The family will obtain maximum health
insurance benefits by joining a clinic and selecting both doctors from that clinic. In how
many ways can this be done?
If an operation can be performed in n1 ways and if for each of these ways a second
operation can be performed in n2 ways, then the two operations can be performed together
in n1n2 ways. (Product Rule)
Example 12: How many sample points are there in the sample space
when a pair of dice is thrown once?
Solution: The first die can land face-up in any one of n1 = 6 ways. For
each of these 6 ways, the second die can also land face-up in n2 = 6
ways. Therefore, the pair of dice can land in n1n2 = (6) (6) = 36 possible ways.
Solution: If we denote the plumbers by P1, . . . , P12 and the electricians by Q1, . . . , Q9, then
we want the number of pairs of the form (Pi, Qj). With n1 = 12 and n2 = 9, the product rule
yields N = (12)(9) = 108 possible ways of choosing the two types of contractors.
In Example 1-7, the choice of the second element of the pair did not depend on which first
element was chosen or occurred. As long as there is the same number of choices of the
second element for each first element, the product rule is valid even when the set of
possible second elements depends on the first element.
Example 14 (Continuation of 1-12): Suppose the home remodeling job involves first
purchasing several kitchen appliances. They will all be purchased from the same dealer,
and there are five dealers in the area. In how many ways can we choose first an appliance
dealer, then a plumbing contractor, and finally an electrical contractor?
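The product rule extends to three stages by multiplying the number of choices at each stage. The sketch below assumes the 12 plumbers and 9 electricians from the preceding solution and the five dealers from Example 14; the dealer labels D1 to D5 and the helper count_by_product_rule are names invented for this illustration. It also checks the count by listing every triple.

```python
from itertools import product
from math import prod


def count_by_product_rule(*stage_sizes: int) -> int:
    """Number of ways to make one choice at each stage (product rule)."""
    return prod(stage_sizes)


# Hypothetical labels for the dealers; P and Q labels follow the earlier solution.
dealers = [f"D{i}" for i in range(1, 6)]        # 5 appliance dealers
plumbers = [f"P{i}" for i in range(1, 13)]      # 12 plumbing contractors
electricians = [f"Q{i}" for i in range(1, 10)]  # 9 electrical contractors

n_triples = count_by_product_rule(len(dealers), len(plumbers), len(electricians))

# Cross-check by enumerating every (dealer, plumber, electrician) triple.
enumerated = list(product(dealers, plumbers, electricians))

print("product rule:", n_triples)
print("enumeration: ", len(enumerated))
```

Both counts agree at (5)(12)(9) = 540. Enumeration is practical only for small cases, whereas the product rule gives the count without listing anything.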