Lecture Notes - Prob and Stat
1
Chapter 1
1. Introduction
2
Definition for Statistics
• Statistics (in the plural sense) refers to numerical facts or
measurements, collected systematically, that describe specific attributes
of a sample or population. Examples: birth rates, death rates, and imports
and exports of goods.
• It indicates information in terms of numbers or numerical data.
• Statistics (in the singular sense) is the science or discipline that deals
with the collection, organization, presentation, analysis, and
interpretation of data.
3
Classification of Statistics
• The body of knowledge called statistics is sometimes divided into two
main areas, depending on how data are used. The two areas are
1. Descriptive statistics
2. Inferential statistics
4
1. Descriptive Statistics
• Descriptive statistics involves summarizing and organizing data so it
can be easily understood.
• This type of statistics focuses on describing the characteristics of a
data set without making any inferences or predictions beyond the data
at hand.
• In descriptive statistics the statistician tries to describe a situation.
• Data can be described using charts, graphs, tables and mathematical
calculations.
5
2. Inferential Statistics
• Inferential statistics goes beyond simply describing the data and instead
makes predictions or generalizations about a population based on
sample data.
• This category uses probabilistic models to make inferences and test
hypotheses, allowing us to draw conclusions and make decisions based
on sample data.
• Statisticians also use statistics to determine relationships among
variables. For example, in studying smoking and health, statisticians
have found a relationship between smoking and lung cancer.
6
Class Activities
Determine whether descriptive or inferential statistics were used.
A. The average price of laptops sold in a recent year was $50.
B. The CSA predicts that the population of Ethiopia in 2030 will be 150 million
people.
C. A medical report stated that taking statins is proven to lower heart attacks, but
some people are at a slightly higher risk of developing diabetes when taking
statins.
D. A survey of 2234 people conducted by research centers found that 55% of the
respondents said that excessive complaining by adults was the most annoying
social media habit.
7
8
Basic Terms
• Population: the entire group or set of items that you are interested in
studying. For example, all residents in a city could be a population.
• Sample: a subset of the population chosen for analysis.
• Parameter: a numerical value that describes a characteristic of a
population. Example: population mean and population variance.
• Statistic: a numerical value that describes a characteristic of a
sample. Example: sample mean and sample variance.
9
Cont’d
• A variable is a characteristic or attribute that can assume different values. Examples include age,
height, income, and test scores.
• Qualitative variables represent categories or labels and describe characteristics that cannot be
quantified directly. Example: Gender, religious preference and geographic locations.
• Quantitative variables represent numerical values that quantify a characteristic. Example: Age,
heights, weights, and body temperatures.
• Quantitative variables can be further classified into two groups: discrete and continuous.
• Discrete variables: These variables take on a finite or countably infinite number of distinct values. They are
often whole numbers. Example: Number of children in a family, number of cars in a parking lot, the
number of students in a classroom.
• Continuous variables: These variables can take any value within a given
range. They are measured rather than counted and can include decimals. Example: Height, weight,
temperature, time.
10
Cont’d
• Data refers to collected information, measurements, or observations
that are used to analyze trends, make decisions, and draw conclusions.
• The levels of measurement refer to the ways in which data can be
categorized, ordered, and quantified.
• These levels define the types of statistical analyses that can be
performed on the data and the precision with which variables can be
measured.
• There are four main levels of measurement: Nominal, Ordinal,
Interval, and Ratio.
11
12
Summary
13
14
Sampling Techniques (Optional)
15
Probability Sampling
(Random Sampling)
16
1. Simple Random Sampling
• In a simple random sample, every member of the population has an
equal probability of inclusion in the sample.
• The method can be as easy as assigning numbers to the individuals in
the population and then randomly choosing from those numbers
through an automated process.
• The individuals whose numbers are chosen are the members included
in the sample.
• Simple random sampling can be done using the lottery method, a
table of random numbers, or number-generating software.
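A minimal Python sketch of simple random sampling. The population of 20 numbered units and the sample size of 5 are hypothetical values chosen for illustration, not figures from these notes. The standard-library call random.sample draws without replacement, so every unit is equally likely to be included; the seed is fixed only to make the run repeatable.

```python
import random

# Hypothetical population: units numbered 1..20 (illustrative only).
population = list(range(1, 21))

rng = random.Random(42)               # fixed seed so the draw is repeatable
sample = rng.sample(population, 5)    # 5 units drawn without replacement

print(sorted(sample))
```

Changing the seed gives a different, equally valid sample.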
17
2. Systematic Sampling
• Researchers use the systematic sampling method to choose the
sample members of a population at regular intervals, i.e. every
kth individual becomes part of the sample.
• For example, when surveying consumers, every fifth
consumer may be selected from the population.
• Let N = population size, n = sample size, and k = N/n = sampling interval.
Choose a random number j with 1 <= j <= k. The jth unit is selected
first, followed by the (j+k)th, (j+2k)th, etc.,
until the required sample size is reached.
• The advantage of this sampling technique is its simplicity.
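The selection rule above can be sketched in Python as follows. This is an illustration under stated assumptions: the function name systematic_sample and the population of 100 numbered units are hypothetical, and k is taken as the integer part of N/n.

```python
import random

def systematic_sample(units, n, rng=random):
    """Pick every k-th unit after a random start j, with k = N // n."""
    N = len(units)
    k = N // n                    # sampling interval (integer part of N/n)
    j = rng.randint(1, k)         # random start, 1 <= j <= k
    # 1-based positions j, j+k, j+2k, ... until n units are selected
    return [units[j - 1 + i * k] for i in range(n)]

units = list(range(1, 101))       # hypothetical population, N = 100
sample = systematic_sample(units, n=10, rng=random.Random(0))
print(sample)
```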
18
3. Stratified Sampling
• The researcher divides a larger population into smaller
subgroups (called strata) that do not overlap but together represent the entire population.
• While sampling, organize these groups and then draw a sample from each group separately.
• A standard method is to classify by sex, age, ethnicity, or similar characteristics: subjects
are split into mutually exclusive groups, and simple random sampling is then used to choose
members from each group.
• Stratified sampling is often used where there is a great deal of variation within a population.
• Elements in the same stratum should be more or less homogeneous, while elements in
different strata should differ.
• Its purpose is to ensure that every stratum is adequately represented.
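A short Python sketch of stratified sampling. The notes do not say how many units to draw from each stratum; this sketch assumes proportional allocation (the same fraction from every stratum), and the population of 60 female and 40 male respondents is hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng=random):
    """Simple random sample of the same fraction from every stratum."""
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    sample = []
    for members in strata.values():
        size = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, size))
    return sample

# Hypothetical population: 60 female and 40 male respondents.
people = [("F", i) for i in range(60)] + [("M", i) for i in range(40)]
sample = stratified_sample(people, stratum_of=lambda p: p[0],
                           fraction=0.1, rng=random.Random(1))
print(len(sample))
```

Because each stratum is sampled separately, both groups are guaranteed to appear in the sample.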
19
4. Cluster Sampling
• Cluster sampling is a way to randomly select participants who are spread out geographically. For
example, if you wanted to choose 100 participants from the entire population of Ethiopia, it is
likely impossible to get a complete list of everyone. Instead, the researcher randomly selects
clusters (e.g., cities or regions), and all the sampling units in the selected clusters are
surveyed.
• Clusters are formed in a way that elements within a cluster are heterogeneous, i.e. observations in
each cluster should be more or less dissimilar.
• In cluster sampling each sampling unit consists of more than one element, for example a city,
a family or a university. Researchers divide the population into such smaller sections and then
select clusters from among them.
20
5. Multistage Sampling
• Multistage sampling divides large populations into stages to make the
sampling process more practical.
• A combination of stratified sampling or cluster sampling and simple
random sampling is usually used.
21
Example
22
23
Non-Probability Sampling
(Non-Random Sampling)
24
1. Convenience Sampling
• Convenience sampling is a non-probability sampling technique where
samples are selected from the population only because they are
conveniently available to the researcher.
• Researchers choose these samples just because they are easy to recruit,
without trying to select a sample that represents
the entire population.
25
2. Judgmental or Purposive Sampling
• In the judgmental sampling method, researchers select the samples
based purely on their own knowledge and judgment.
• In other words, researchers choose only those people whom they
believe to be fit to participate in the research study.
• Judgmental or purposive sampling is not a scientific method of
sampling, and the downside to this sampling technique is that the
preconceived notions of a researcher can influence the results.
• Thus, this research technique involves a high amount of ambiguity.
26
3. Quota Sampling
• In quota sampling, members are selected according to a pre-set standard.
• Because the sample is formed on the basis of specific attributes, it is intended to
have the same qualities as the total population. It is a rapid method of collecting
samples.
• As a hypothetical example, consider a researcher who wants to study the career goals of male and female
employees in an organization.
• The organization has 500 employees, who make up the population.
• To understand the population better, the researcher needs only a sample, not the
entire population.
• Further, the researcher is interested in particular strata within the population.
• This is where quota sampling helps: it divides the population into strata or groups.
27
4. Snowball Sampling
• It is a sampling method that researchers apply when the subjects are difficult to
trace or locate. For example, it is extremely challenging to survey shelterless people or illegal
immigrants.
• Researchers also use this sampling method when the topic is highly sensitive
and not openly discussed, for example surveys gathering information about HIV/AIDS.
• Not many victims will readily respond to the questions. Still, researchers can contact people they
might know or volunteers associated with the cause to get in touch with the victims and collect
information.
• This sampling system works like a referral program.
• Researchers use this technique when the sample size is small and subjects are not easily available.
• Once the researchers find suitable subjects, they ask them for help in finding similar subjects to
form a sample of reasonable size.
28
Methods of Data Collection
• Interview
• Questionnaire
• Schedules through enumerators
• Observation
• Experiment
• Registration methods
• Documentary sources
• Etc
29
Functions of Statistics
• To organize and summarize data.
• To compare two or more groups of data.
• To understand the variability or spread of data.
• To formulate and test hypotheses.
• To forecast future trends or outcomes.
• To measure uncertainty.
30
Limitations of Statistics
• Statistics primarily deals with quantitative data and may not be as effective
for analysing qualitative or subjective data without additional methods.
• Statistical analysis focuses on aggregated data; it does not deal with
individual items.
• Statistical laws and results are true only on average.
• Statistical results can be misinterpreted, especially if the data is not
collected, analyzed or presented properly.
• Errors are possible in statistical decisions. Inferential statistics in
particular involves a risk of error, and we usually cannot know whether an
error has been committed.
31
Chapter 2
2. Descriptive Statistics
32
Definition
• Descriptive statistics is a branch of statistics that deals with the
summarization and organization of data in a meaningful way.
• Unlike inferential statistics, which attempts to draw conclusions and
make predictions based on data, descriptive statistics focuses on
providing a clear, concise overview of the characteristics of a dataset.
• Descriptive statistics uses the data to provide descriptions of the
population, either through tables, graphs/diagrams or numerical
calculations.
33
Frequency Distributions (FD)
• A frequency distribution is a statistical tool that organizes and summarizes a set of data
by showing the frequency (or number of occurrences) of each distinct value or range of
values in a dataset.
• It provides a clear picture of how the data is distributed across different values or
intervals, making it easier to identify patterns, trends, and the overall distribution of the
data.
• There are two types: categorical and numerical FDs.
34
1. Categorical Frequency Distributions
• In this type of frequency distribution the data represents categories
or labels (nominal or ordinal) rather than numerical values.
• For example, data such as political affiliation, religious affiliation, or
major field of study would use categorical frequency distributions.
35
Solution
36
2. Numerical Frequency Distribution
• This type of frequency distribution is used for numerical data (interval
or ratio level), where the data consist of numbers that can be
measured.
• There are two types:
A. Ungrouped Frequency Distribution
B. Grouped Frequency Distribution
37
A. Ungrouped (Simple) Frequency Distribution
• A distribution where each distinct value of a variable is listed along
with its corresponding frequency.
• This is used for discrete data where there are not too many distinct
values.
• Example: Twenty students were asked about their test scores in a
statistics course. Their responses are listed below. Arrange the data in
table form.
5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3
38
Solution:
Test score    No. of students
2             3
3             5
4             3
5             6
6             2
7             1
Total         20
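The tally above can be reproduced with a few lines of Python. This is a sketch using collections.Counter from the standard library; the variable names are illustrative.

```python
from collections import Counter

scores = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]

freq = Counter(scores)            # maps each distinct score to its frequency
for score in sorted(freq):
    print(score, freq[score])
print("Total", sum(freq.values()))
```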
39
B. Grouped Frequency Distributions
• Instead of listing each distinct value, data is grouped into intervals (or
classes), and the frequency of each interval is recorded.
• This is used for continuous data or when there are many distinct
values, making the data easier to interpret.
40
Basic Terms
• Class limits define the range of values within each class.
• Class boundaries are the adjusted limits that are used to
eliminate gaps between consecutive class intervals and ensure
continuity in the data.
• The class mark is the midpoint of a class interval.
• The class width (class size) is the difference between the lower
(or upper) limits of two consecutive classes.
• Unit of measurement: the smallest difference of any two values of
the given data set.
41
Constructing a Grouped Frequency
Distribution
1. Determine the unit of measurement.
2. Find the range of the data set.
3. Decide the number of classes (K) using Sturges' rule: K = 1 + 3.322 log10(N),
where N is the number of observations.
4. Find the class width by dividing the range by the number of classes and rounding
(normally up) to a convenient value.
5. Generate class limits, class boundaries. Usually select a starting point as the
lowest value of the data.
6. Tally the data and find the numerical frequencies from the tallies.
42
Summary
• There should be between 5 and 20 classes.
• The classes must be continuous.
• The classes must be exhaustive.
• The classes must be mutually exclusive.
• The classes must be equal in width
43
Example
These data represent the record high temperatures in degrees Fahrenheit
for each of the 50 states. Construct a grouped frequency distribution for
the data.
44
Solution
1. Unit of measurement: U = 1
2. Range: R = 134 - 100 = 34
3. Number of classes: K = 1 + 3.322 log10(50) = 6.64 ≈ 7
4. Class width: w = R/K = 34/7 = 4.9 ≈ 5
5. Lowest value = lowest class limit = 100
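The same steps can be sketched in Python. The raw temperatures are not reproduced in these notes, so the sketch uses only the minimum (100), maximum (134) and count (50) quoted in the solution. The function name grouped_classes is hypothetical, and rounding K and w up is an assumption that happens to agree with the values above.

```python
import math

def grouped_classes(minimum, maximum, n_obs, unit=1):
    """Class limits and boundaries via Sturges' rule (rounding up)."""
    k = math.ceil(1 + 3.322 * math.log10(n_obs))    # number of classes
    width = math.ceil((maximum - minimum) / k)       # class width
    classes = []
    for i in range(k):
        lower = minimum + i * width                  # lower class limit
        upper = lower + width - unit                 # upper class limit
        boundaries = (lower - unit / 2, upper + unit / 2)
        classes.append(((lower, upper), boundaries))
    return k, width, classes

k, width, classes = grouped_classes(100, 134, 50)
for limits, boundaries in classes:
    print(limits, boundaries)
```

With these inputs the first class is 100-104 with boundaries 99.5-104.5, and the seventh class ends at 134, so every observation is covered.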
45
Homework
• A distribution has constant class width with 6 classes, and the second
class mark is 8. If the class mark of the fourth class is 18, find the
class width, the class limits and class boundaries of the distribution.
Assume the unit of measurement is 1.
46
Relative Frequency Distribution
• It shows the proportion or percentage of the total number of
observations that fall within each class interval.
47
Cumulative frequency distribution
• Definition: This distribution shows the cumulative total of frequencies up to each class or value. It
can be presented in two forms:
• Less than cumulative frequency: Cumulative frequency up to the upper boundary of each
class. It shows how many data points fall below the upper boundary of each class interval.
• More than cumulative frequency: Cumulative frequency accumulated from the last class back to
the lower boundary of each class. It shows how many data points lie above the lower boundary
of each class interval.
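A brief Python sketch of relative and cumulative frequencies, reusing the test-score table from the ungrouped example. Since that table lists single values rather than class intervals, the "less than" column here counts observations at or below each value and the "more than" column counts observations at or above it; that adaptation is an assumption of this sketch.

```python
from itertools import accumulate

values = [2, 3, 4, 5, 6, 7]
freqs = [3, 5, 3, 6, 2, 1]
n = sum(freqs)

relative = [f / n for f in freqs]                  # proportion for each value
less_than = list(accumulate(freqs))                # running total from the top
more_than = list(accumulate(freqs[::-1]))[::-1]    # running total from the bottom

for row in zip(values, freqs, relative, less_than, more_than):
    print(row)
```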
48
49
Measures of Central Tendency
• Measures of central tendency are statistical metrics used to
summarize and describe the centre or typical value of a dataset.
• They provide a single value that represents the central point around
which the data tends to cluster.
50
Cont’d
• A measure of central tendency is also called a summary statistic, a
measure of central location, or an average.
51
Characteristics
• It should be simple to understand and easy to compute.
• It should be rigidly defined.
• It should be based on all the observations.
• It should be suitable for further algebraic treatments.
• It should not be affected by extreme values.
52
Mean
53
Class Activity
54
Challenging Questions
1. Let the mean of x1, x2, x3, …, xn be A. What is the mean of:
A. (x1+k), (x2+k), (x3+k), …, (xn+k)?
B. (x1-k), (x2-k), (x3-k), …, (xn-k)?
C. kx1, kx2, kx3, …, kxn?
2. The mean of 5 numbers is 18. If one number is excluded, the mean of
the remaining numbers is 16. Find the excluded number.
55
Mean for FD
• Mean for FD is computed as follows.
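The formula itself did not survive in these notes. A sketch, assuming the usual definition mean = (sum of f × x) / (sum of f), applied to the test-score table from the ungrouped example; for a grouped distribution x would be the class mark.

```python
values = [2, 3, 4, 5, 6, 7]     # test scores (x)
freqs = [3, 5, 3, 6, 2, 1]      # number of students (f)

total = sum(f * x for f, x in zip(freqs, values))   # sum of f * x
n = sum(freqs)                                       # sum of f
mean = total / n

print(total, n, mean)
```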
56
Example
57
Solution
58
Median
• The median is the middle value of a dataset when the values are arranged in
ascending or descending order.
• Useful for ordinal, interval, and ratio data. The median is less sensitive to outliers
and skewed data compared to the mean.
59
Example
Consider the data: 4, 4, 6, 3, 2. Find the median.
60
Cont’d
• Consider the data: 50, 67, 24, 34, 78, 43. What is the
median?
Solution
• Arranging in ascending order, we get: 24, 34, 43, 50, 67, 78.
Here, n (no. of observations) = 6, which is even, so the median is
the average of the two middle values.
• Median = (43+50)/2 = 46.5
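Both median examples can be checked with Python's standard statistics module, which sorts the data and, for an even number of observations, averages the two middle values.

```python
import statistics

odd_data = [4, 4, 6, 3, 2]             # n = 5: the middle value after sorting
even_data = [50, 67, 24, 34, 78, 43]   # n = 6: average of the two middle values

median_odd = statistics.median(odd_data)
median_even = statistics.median(even_data)
print(median_odd, median_even)
```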
61
Median for FD
62
Class Activity
• Consider the following data and compute the median for the data.
Class    Frequency
51-55    2
56-60    7
61-65    8
66-70    4
63
Solution
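The worked solution is not reproduced in these notes. A sketch, assuming the standard grouped-data formula Median = L + ((n/2 - cf) / f) × w, where L is the lower boundary of the median class, cf the cumulative frequency before it, f its frequency and w the class width. The function name grouped_median is hypothetical.

```python
def grouped_median(classes, unit=1):
    """classes: ordered list of (lower_limit, upper_limit, frequency)."""
    n = sum(f for _, _, f in classes)
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= n / 2:             # first class that reaches n/2
            boundary = lower - unit / 2          # lower class boundary L
            width = upper - lower + unit         # class width w
            return boundary + (n / 2 - cumulative) / f * width
        cumulative += f

data = [(51, 55, 2), (56, 60, 7), (61, 65, 8), (66, 70, 4)]
median = grouped_median(data)
print(median)
```

Here n = 21, so n/2 = 10.5 falls in the class 61-65 (L = 60.5, cf = 9, f = 8, w = 5).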
64
Mode
• Definition: The mode is the value (or values) that occurs most frequently in a
dataset.
• A dataset may have one mode, more than one mode (bimodal or
multimodal), or no mode when no data value occurs more than once.
• Usage: Can be used with nominal, ordinal, interval, and ratio data.
The mode is particularly useful for categorical data to identify the
most common category.
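A small Python sketch of the three cases (one mode, several modes, no mode). The helper name modes is hypothetical, and the second and third datasets are made up for illustration; the first is the test-score data from the ungrouped frequency example.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []                         # nothing repeats: no mode
    return sorted(v for v, c in counts.items() if c == top)

scores = [5, 6, 3, 3, 2, 4, 7, 5, 2, 3, 5, 6, 5, 4, 4, 3, 5, 2, 5, 3]
print(modes(scores))             # unimodal
print(modes([1, 2, 2, 3, 3]))    # hypothetical bimodal data
print(modes([1, 2, 3]))          # hypothetical data with no mode
```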
65
Class Activity 1
• The data show the number of public libraries in a sample of eight
states. Find the mode.
66
Class Activity 2
• Since the category of soft drinks has the largest frequency, 52, we can
say that the mode or most typical drink is a soft drink.
67
Class Activity 4
68
Mode for FD
69
Class Activity 1
Consider the following data and compute the mode for the data.
70
Solution
71
Weighted Mean
72
Example: Weighted Mean
• A student received an A in English Composition I (3 credits), a C in Introduction to Psychology (3
credits), a B in Biology I (4 credits), and a D in Physical Education (2 credits). Assuming A = 4 grade
points, B = 3 grade points, C = 2 grade points, D = 1 grade point, and F = 0 grade points, find the
student’s grade point average.
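The grade point average is a weighted mean in which the credits act as weights: sum of (weight × value) divided by the sum of the weights. A Python sketch of the calculation, with illustrative variable names:

```python
grade_points = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}

# (course, grade, credits) as given in the example
courses = [
    ("English Composition I", "A", 3),
    ("Introduction to Psychology", "C", 3),
    ("Biology I", "B", 4),
    ("Physical Education", "D", 2),
]

total_points = sum(grade_points[g] * c for _, g, c in courses)  # sum of w * x
total_credits = sum(c for _, _, c in courses)                   # sum of w
gpa = total_points / total_credits

print(total_points, total_credits, round(gpa, 2))
```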
73
Quantiles
• Quantiles are values that divide an ordered dataset into equal-sized, contiguous parts.
• If the position of a quantile is an integer, its value is the data point at that position in the ordered list. If the position is not
an integer, interpolate between the two nearest ranks.
• The main types of quantiles include:
  • Quartiles: Divide the data into four equal parts.
    • First Quartile (Q1): 25th percentile
    • Second Quartile (Q2): 50th percentile (also the median)
    • Third Quartile (Q3): 75th percentile
  • Deciles: Divide the data into ten equal parts.
    • First Decile (D1): 10th percentile
    • Second Decile (D2): 20th percentile
    • And so on, up to the Ninth Decile (D9): 90th percentile
  • Percentiles: Divide the data into one hundred equal parts.
    • 1st Percentile: The value below which 1% of the data falls
    • 50th Percentile: The median, where 50% of the data falls below it
    • 99th Percentile: The value below which 99% of the data falls
74
Dataset: 5, 7, 8, 12, 15, 18, 22, 25, 28, 30
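A Python sketch that computes quartiles for this dataset. The position formula is not given in the surviving text; the sketch assumes the common convention position = i(n + 1)/parts with linear interpolation between neighbouring ranks, and other textbooks or software may use slightly different conventions and give slightly different values.

```python
def quantile(sorted_data, i, parts):
    """i-th quantile when the ordered data are split into `parts` parts."""
    n = len(sorted_data)
    pos = i * (n + 1) / parts        # 1-based position (assumed convention)
    pos = min(max(pos, 1), n)        # keep the position inside the data
    lower = int(pos)                 # rank at or just below the position
    frac = pos - lower
    if frac == 0:
        return sorted_data[lower - 1]
    below, above = sorted_data[lower - 1], sorted_data[lower]
    return below + frac * (above - below)    # interpolate between ranks

data = [5, 7, 8, 12, 15, 18, 22, 25, 28, 30]
q1, q2, q3 = (quantile(data, i, 4) for i in (1, 2, 3))
print(q1, q2, q3)
```

The same function gives deciles with parts=10 and percentiles with parts=100.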
75
76
77
78
79
Measures of Variation (Dispersion)
• Measures of variation (or dispersion) describe the spread or
variability of data values around the central measure (mean, median,
or mode).
• They provide insights into how much the data values differ from each
other and from the central value.
80
Importance/ Purpose of Measuring
Variation
• To test the reliability of an average
• To serve as a basis for control of variability
• To compare two or more series with regard to their variability
• To serve as a basis for further statistical analysis
81
Types
• Absolute measures of variation provide the amount of spread in the
same units as the data.
• They give a direct sense of how data values are dispersed around the
central value. Example: Range, Variance, Standard Deviation,
Interquartile range, Mean absolute deviation, etc.
• Relative measures of variation provide a standardized way to
compare dispersion across datasets with different units or scales.
• They express dispersion relative to a central value, often as a
percentage. Example: Coefficient of variation, Z-score, etc
82
Range
• The difference between the maximum and minimum values in a
dataset.
• Range = Maximum value − Minimum value
• Characteristics: Simple to compute but sensitive to outliers, as it only
considers the extreme values.
83
Variance
84
85
Example
86
87
Variance and Standard Deviation for Grouped Data
88
Example
• Find the sample variance and the sample standard deviation for the
frequency distribution of the data shown. The data represent the
number of miles that 20 runners ran during one week.
89
Solution
Class Activities
• Consider a sample of the marks of students randomly selected from a
class. Find
A. The average mark of the students.
B. The sample variance and standard deviation of the marks.
Mark     Number of students
0-5      2
5-10     4
10-15    6
15-20    4
20-25    2
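A Python sketch for this activity, assuming the usual grouped-data formulas: each class is represented by its midpoint x, the mean is (sum of f × x) / n, and the sample variance is (sum of f × (x - mean)^2) / (n - 1). These formulas are not spelled out in the surviving text.

```python
import math

# (lower limit, upper limit, frequency) from the table above
classes = [(0, 5, 2), (5, 10, 4), (10, 15, 6), (15, 20, 4), (20, 25, 2)]

mids = [(lo + hi) / 2 for lo, hi, _ in classes]    # class marks
freqs = [f for _, _, f in classes]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, mids)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / (n - 1)
std_dev = math.sqrt(variance)

print(mean, round(variance, 3), round(std_dev, 3))
```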
91
Coefficient of Variation
The distribution with the greater C.V. is considered more variable than the
other; the distribution with the smaller C.V. shows greater consistency,
homogeneity and uniformity.
92
Class Activity 1
The mean of the number of sales of cars over a 3-month period is 87,
and the standard deviation is 5. The mean of the commissions is $5225,
and the standard deviation is $773. Compare the variations of the two.
93
94
Class Activity 2
The mean speed for the five fastest wooden roller coasters is 69.16 miles per hour, and the variance is
2.76. The mean height for the five tallest roller coasters is 177.80 feet, and the variance is 157.70.
Compare the variations of the two data sets.
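A Python sketch for the two class activities, assuming the usual definition C.V. = (standard deviation / mean) × 100%, which is not spelled out in the surviving text. In the second activity the variances are given, so the standard deviation is taken as the square root first.

```python
import math

def coefficient_of_variation(mean, std_dev):
    """C.V. as a percentage of the mean."""
    return std_dev / mean * 100

# Class Activity 1: standard deviations are given directly
cv_sales = coefficient_of_variation(87, 5)
cv_commissions = coefficient_of_variation(5225, 773)

# Class Activity 2: variances are given, so take square roots
cv_speed = coefficient_of_variation(69.16, math.sqrt(2.76))
cv_height = coefficient_of_variation(177.80, math.sqrt(157.70))

print(round(cv_sales, 2), round(cv_commissions, 2))
print(round(cv_speed, 2), round(cv_height, 2))
```

In each pair the larger C.V. marks the more variable data set: commissions in the first activity and heights in the second.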
95
Homework
96
Standard Score (Z-Score)
• The standard score or Z-score measures how many standard
deviations a data point (or observation) is from the mean of its data
set.
• It standardizes individual data points in terms of their distance from
the mean, providing a way to compare scores across different
distributions or datasets.
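A minimal Python sketch, assuming the usual definition z = (x - mean) / standard deviation. The two exam results are hypothetical numbers invented for illustration; they are not the examples from these notes.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

# Hypothetical student: 65 on a test with mean 50 and s = 10,
# and 30 on another test with mean 25 and s = 5.
z_first = z_score(65, 50, 10)
z_second = z_score(30, 25, 5)

print(z_first, z_second)
```

The larger z-score indicates the better relative position, even though the raw scores are on different scales.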
97
Example:
98
Example:
99
Chapter 3
3. Elementary Probability
100
Definition
• Probability is defined as a measure of the likelihood (or chance) that a specific
event will occur.
Example:
• Meteorologists use weather patterns to predict the probability of rain.
• In epidemiology, probability theory is used to understand the relationship
between exposures and the risk of health effects.
101
Some Important Terms & Concepts
• A Random Experiment is an action or process that leads to one or more
outcomes where the result is uncertain and cannot be predicted in advance.
• Each time the experiment is conducted, it may produce a different result.
E.g., tossing a coin, rolling a die, or drawing a card from a deck.
• An outcome is a single possible result of a random experiment.
• Sample Space (S): The set of all possible outcomes of an experiment. For
example, if you roll a die, the sample space is S={1,2,3,4,5,6}
• Event (E): A subset of the sample space.
• An event is a single outcome or a set of outcomes from the sample space. For
example, getting an even number when rolling a die is an event E={2,4,6}.
102
Cont’d
• Mutually Exclusive Events: Two or more events that cannot happen at the same
time. For example, when flipping a coin, getting heads and getting tails are
mutually exclusive events.
• Exhaustive Events: A set of events is exhaustive if they cover all possible
outcomes of the experiment. For example, in rolling a die, the events "getting an
odd number" and "getting an even number" are exhaustive, as one of these must
occur.
• Independent Events: Events are independent if the occurrence of one does not
affect the occurrence of the other. For example, flipping two coins is independent:
getting heads on the first flip does not affect the result of the second flip.
• Dependent Events: Events are dependent if the outcome or occurrence of one
affects the other. For example, drawing cards from a deck without replacement
makes the events dependent, because the sample space changes after each draw.
103
Cont’d
• Equally Likely Outcomes: Outcomes are said to be equally likely if each outcome
has the same chance of occurring. This means that no outcome is favored over any
other.
Examples:
• In rolling a fair die, the outcomes {1,2,3,4,5,6} are equally likely, each with a
probability of 1/6.
• In flipping a fair coin, the outcomes "heads" and "tails" are equally likely, each with a
probability of 1/2.
• Null Event (Impossible Event): A null event (or impossible event) is an event that
has no chance of occurring. The probability of a null event is always 0.
104
Class Activity 1
• Find the sample space for rolling two dice.
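The sample space can be enumerated in Python with itertools.product. The event "the sum is 7" is a hypothetical illustration added here; it is not part of the activity.

```python
from itertools import product

faces = range(1, 7)
sample_space = list(product(faces, repeat=2))    # ordered pairs (die 1, die 2)
print(len(sample_space))

# Hypothetical event: the two faces add up to 7
sum_is_seven = [pair for pair in sample_space if sum(pair) == 7]
print(sum_is_seven)
```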
105
Class Activity 2
106
Class Activity 3
107
Tree Diagram
108
Examples
109
110
Counting Techniques
111
1. Addition Rule
112
Example
1. In a certain class a class representative is to be chosen from 3 female and 4
male students. Count the ways in which a class representative can be chosen.
• Here, a female representative is to be chosen in 3 ways and a male
representative is to be chosen in 4 ways. Therefore the number of ways in
which a class representative can be chosen will be 3+4=7ways.
2. If there are 3 types of sandwiches and 2 types of drinks, and you want to
choose either a sandwich or a drink, the total number of options is: 3+2=5.
113
2. Multiplication Rule
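• The addition rule counts the ways of doing one task or another when the two cannot be done together: with m ways for the first and n ways for the second, there are m + n ways in total.
• The multiplication rule counts the ways of doing one task and then another: with m ways for the first and n ways for the second, there are m × n ways in total.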
114
Example
• A coin is tossed and a die is rolled. Find the number of outcomes for
the sequence of events.
Solution: Since the coin can land either heads up or tails up and since
the die can land with any one of six numbers showing face up, there
are 2 · 6 = 12 possibilities.
• If you have 3 shirts and 4 pants, the number of different outfits you
can form is: 3×4=12 (You can choose any shirt with any pair of pants.)
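Both counts can be confirmed by listing the outcomes in Python. The shirt and pants labels are hypothetical placeholders.

```python
from itertools import product

coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]
outcomes = list(product(coin, die))     # every (coin, die) pair
print(len(outcomes))

shirts = ["shirt1", "shirt2", "shirt3"]             # hypothetical labels
pants = ["pants1", "pants2", "pants3", "pants4"]
outfits = list(product(shirts, pants))  # every (shirt, pants) pair
print(len(outfits))
```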
115
3. Permutations
116
Class Activities 1
Solution
• 4!=24 different ways
117
Example
• It is required to seat 5 men and 4 women in a row so that the women occupy the even
places. How many such arrangements are possible?
Solution:
There are 5 men and 4 women, so there are 9 positions.
The even positions are the 2nd, 4th, 6th and 8th places.
The number of ways to arrange 4 women in the 4 even places is the number of
permutations of 4 items: 4! = 4 × 3 × 2 × 1 = 24 ways.
The remaining 5 positions can be occupied by the 5 men in 5! = 5 × 4 × 3 × 2 × 1 = 120 ways.
Total number of arrangements: multiply the number of arrangements for the women by
the number of arrangements for the men: 24 × 120 = 2880.
118
119
Examples
1. A business owner wishes to rank the top 3 locations selected from 5
locations for a business. How many different ways can she rank
them?
2. How many permutations of the letters can be made from the word
STATISTICS?
120
Solutions
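The worked solutions are not reproduced in these notes. A Python sketch, assuming the standard formulas: nPr = n!/(n - r)! for ranking r items out of n, and n!/(n1! × n2! × …) for arranging n letters in which some letters repeat.

```python
import math
from collections import Counter

# 1. Rank the top 3 of 5 locations: ordered selection, 5P3
rankings = math.perm(5, 3)

# 2. Distinct arrangements of the letters of STATISTICS:
#    10! divided by the factorial of each letter's repeat count
word = "STATISTICS"
arrangements = math.factorial(len(word))
for count in Counter(word).values():
    arrangements //= math.factorial(count)

print(rankings, arrangements)
```

STATISTICS has 10 letters with S and T appearing three times each and I twice, which is why the divisor is 3! × 3! × 2!.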
121
4. Combination
• A combination is a selection of objects where the order does not
matter.
122
Example
Homework: How many ways can you choose 3 students from a group of 10?
123
Example
124
Approaches to Define Probability
1. Classical Probability (or Theoretical Probability)
2. Empirical (or Relative Frequency) Probability
3. Subjective Probability
4. Axiomatic Probability (or Modern Approach)
125
1. Classical Probability
It refers to the approach where the probability of an event is determined by the ratio
of favorable outcomes to the total number of possible outcomes.
126
2. Empirical Probability
• This approach defines probability based on past data or experimental results.
• It estimates the probability of an event occurring by looking at the number of
times (or frequency) the event has occurred in the past relative to the total number
of trials or experiments (n).
127
Example
128
129
Class Activity
• In a sample of 50 people, 21 had type O blood, 22 had type A blood, 5
had type B blood, and 2 had type AB blood. Set up a frequency
distribution and find the following probabilities.
A. A person has type O blood.
B. A person has type A or type B blood.
C. A person has neither type A nor type O blood.
D. A person does not have type AB blood.
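A Python sketch for this activity, treating each probability as an empirical relative frequency (count divided by the total of 50). Exact fractions are used so that no rounding is involved; the helper name prob is hypothetical.

```python
from fractions import Fraction

counts = {"O": 21, "A": 22, "B": 5, "AB": 2}    # frequency distribution
n = sum(counts.values())

def prob(*types):
    """Relative frequency of the listed (mutually exclusive) blood types."""
    return Fraction(sum(counts[t] for t in types), n)

p_o = prob("O")                        # A. type O
p_a_or_b = prob("A", "B")              # B. type A or type B
p_neither_a_nor_o = prob("B", "AB")    # C. neither A nor O
p_not_ab = 1 - prob("AB")              # D. complement of type AB

print(p_o, p_a_or_b, p_neither_a_nor_o, p_not_ab)
```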
130
131
3. Subjective Probability
• Subjective probability is based on personal judgment, experience, intuition, or
belief about how likely an event is to occur.
• It is often used when there is no clear mathematical or experimental data available
to calculate probability.
Example:
• If a meteorologist predicts a 70% chance of rain tomorrow based on weather patterns, experience,
and judgment, this is a subjective probability.
• A sportswriter may say that there is a 70% probability that Bahir Dar Kenema will win its
next game against Coffee Club.
• A physician might say that, on the basis of her diagnosis, there is a 30% chance the patient will
need an operation.
• A seismologist might say there is an 80% probability that an earthquake will occur in a certain
area.
132
4. Axiomatic Probability (Modern Approach)
• It is based on set theory and establishes a set of fundamental rules (called axioms)
that probability must follow.
133
134
Cont’d
135
Cont’d
136
Example(Addition Rule)
In a hospital unit there are 8 nurses and 5 physicians; 7 nurses and 3 physicians are
female. If a staff person is selected at random, find the probability that the subject is a nurse or
a male.
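A Python sketch of this example, assuming the addition rule P(A or B) = P(A) + P(B) - P(A and B) and a staff member chosen at random. The male counts (1 nurse, 2 physicians) are derived from the totals given in the example.

```python
from fractions import Fraction

staff = {
    ("nurse", "female"): 7,
    ("nurse", "male"): 1,          # 8 nurses - 7 female nurses
    ("physician", "female"): 3,
    ("physician", "male"): 2,      # 5 physicians - 3 female physicians
}
total = sum(staff.values())

p_nurse = Fraction(sum(c for (job, _), c in staff.items() if job == "nurse"), total)
p_male = Fraction(sum(c for (_, sex), c in staff.items() if sex == "male"), total)
p_nurse_and_male = Fraction(staff[("nurse", "male")], total)

p_nurse_or_male = p_nurse + p_male - p_nurse_and_male    # addition rule
print(p_nurse_or_male)

# Cross-check by counting directly
direct = Fraction(sum(c for (job, sex), c in staff.items()
                      if job == "nurse" or sex == "male"), total)
```

Subtracting P(nurse and male) avoids counting the one male nurse twice.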
137
Example (Multiplication Rule)
• A bag contains 10 balls: 4 red, 3 blue, and 3 green. What is the probability of
drawing 3 balls and getting exactly 1 red, 1 blue, and 1 green ball?
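A Python sketch of this example. The problem does not say whether the balls are replaced; the sketch assumes they are drawn without replacement. It computes the answer twice: by counting combinations, and by the multiplication rule for dependent events applied to one colour order and multiplied by the 3! = 6 possible orders.

```python
from fractions import Fraction
from math import comb, factorial

red, blue, green = 4, 3, 3
total = red + blue + green

# Counting approach: choose 1 of each colour out of all ways to choose 3 balls
p_counting = Fraction(comb(red, 1) * comb(blue, 1) * comb(green, 1),
                      comb(total, 3))

# Multiplication rule: P(red, then blue, then green) without replacement
one_order = Fraction(red, total) * Fraction(blue, total - 1) * Fraction(green, total - 2)
p_sequential = factorial(3) * one_order    # 6 equally likely colour orders

print(p_counting, p_sequential)
```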
138
Examples(Dependent and Independent Events)
139
Example(Conditional Probability)
140
141
Total Probability Theorem
• The Total Probability Theorem is a fundamental rule in probability theory used to
compute the probability of an event based on a partition of the sample space.
• It is particularly useful when the event of interest can occur in multiple mutually
exclusive ways.
142
143
Bayes Theorem
• Bayes’ theorem is a way to compute a conditional probability.
• Conditional probability is the probability of an event occurring given that
another event has occurred.
• Let B1, B2, …, Bk be a partition of the sample space S and let A be an event
associated with S. Then the probability of the ith partition, Bi, given event A can
be found using Bayes’ theorem:
P(Bi | A) = P(Bi) P(A | Bi) / [P(B1) P(A | B1) + … + P(Bk) P(A | Bk)]
144
Example
145
Example
1. Three machines A, B, and C produce 50%, 30% and 20% of the total
output, respectively. The percentages of defective output of
these machines are 3%, 4% and 5%, respectively. An item is selected
at random.
A. What is the probability that the selected item is defective?
B. If the selected item is defective, what is the probability that it was
produced by machine B?
146
Solution
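The worked solution is not reproduced in these notes. A Python sketch, reading part B as the probability that a defective item came from machine B: part A uses the total probability theorem and part B uses Bayes' theorem. Exact fractions avoid rounding.

```python
from fractions import Fraction

share = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}
defect_rate = {"A": Fraction(3, 100), "B": Fraction(4, 100), "C": Fraction(5, 100)}

# A. Total probability: P(D) = sum over machines of P(machine) * P(D | machine)
p_defective = sum(share[m] * defect_rate[m] for m in share)

# B. Bayes' theorem: P(machine | D) = P(machine) * P(D | machine) / P(D)
posterior = {m: share[m] * defect_rate[m] / p_defective for m in share}

print(p_defective, float(p_defective))
print(posterior["B"], round(float(posterior["B"]), 4))
```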
147
Homework
1. A bag contains 6 white and 4 red balls. Another bag contains 5 white
and 7 red balls. Two balls are transferred from the first bag to the
second bag and then one ball is taken from the second bag. What is
the probability that the ball drawn from the second bag is red?
2. For three persons A, B and C, the chances of being selected as
manager of a firm are in the ratio 4:1:2, respectively. The
respective probabilities that they introduce a radical change in
marketing strategy are 0.4, 0.8 and 0.6. If the change does take place,
find the probability that it is due to the appointment of person A.
148
Chapter 4
4. Probability Distribution & Random Variables
149
Random Variables
• A random variable is a function that assigns a numerical value to each outcome of a
random experiment, enabling us to quantify uncertainty and randomness.
• There are two types of random variables:
• Discrete Random Variables: These can take on a finite or countably infinite
number of distinct values.
• Examples: the number of heads when flipping three coins, the number of cars
passing a street light in an hour, the result of rolling a die (values: 1, 2, 3, 4, 5, 6).
• Continuous Random Variables: These can take on any value within a given
interval, so the set of possible values is uncountably infinite.
• For example, the height of people is a continuous random variable, as it can take
on any real value within a range.
150
Probability Distribution
• A probability distribution shows how probabilities are assigned to the values of
a random variable.
• Discrete Probability Distribution: gives the probability of each possible value of
a discrete random variable, described by a probability mass function (PMF).
Example: Binomial Distribution, Poisson Distribution, etc.
• Continuous Probability Distribution: gives the likelihood of a continuous
random variable falling within a range, described by a probability density function
(PDF). Example: Normal Distribution
151
• Let X represent the number of heads obtained when tossing a coin 3 times. The
possible values for X are 0, 1, 2, or 3.
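• The distribution of X can be obtained by listing all outcomes. The sketch below assumes a fair coin, so that the 8 outcomes of three tosses are equally likely.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 2^3 = 8 equally likely outcomes of three tosses
outcomes = list(product("HT", repeat=3))

# Accumulate the probability of each possible number of heads
pmf = {}
for outcome in outcomes:
    heads = outcome.count("H")
    pmf[heads] = pmf.get(heads, 0) + Fraction(1, len(outcomes))

for x in sorted(pmf):
    print(x, pmf[x])
# 0 1/8
# 1 3/8
# 2 3/8
# 3 1/8
```

The probabilities sum to 1, as required of any probability mass function.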
152
153
Class Activities
154
155
Expected Value and Variance for Discrete RVs
• Expected Value (E[X]): The expected value is the weighted average of all
possible values of the random variable, where the weights are the probabilities of
those values: E[X] = Σ x · P(X = x).
• Variance (Var[X]): Variance measures how spread out the values of the random
variable are around the expected value: Var[X] = E[(X − E[X])²] = E[X²] − (E[X])².
156
Example
• Let X represent the number of heads obtained when tossing a coin 3 times.
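• A short sketch of the expected value and variance for this example, assuming a fair coin so that P(X = 0), …, P(X = 3) are 1/8, 3/8, 3/8, 1/8.

```python
from fractions import Fraction

# PMF of X = number of heads in three tosses of a fair coin (assumed fair)
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

# E[X] = sum of x * P(X = x)
mean = sum(x * p for x, p in pmf.items())

# Var[X] = E[X^2] - (E[X])^2
second_moment = sum(x**2 * p for x, p in pmf.items())
variance = second_moment - mean**2

print(mean, variance)  # 3/2 3/4
```

On average 1.5 heads are expected, with a variance of 0.75.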
157
158
Example
Calculate the constant c, the mean, and the variance.
159
160
Binomial Distribution
• The binomial distribution is a discrete probability distribution that models the
number of successes in a fixed number of independent Bernoulli trials.
• It is characterized by two parameters: the number of trials (n) and the probability of
success (p) on each trial.
Properties:
• Fixed Number of Trials (n): The experiment is conducted a specific number of
times.
• Two Outcomes: Each trial results in a "success" or "failure."
• Independent Trials: The outcome of one trial does not affect the others.
• Constant Probability (p): The probability of success remains the same for each
trial.
161
162
Class Activities
163
164
Class Activities 1
165
Example
A company sends out 10 promotional emails to its customers and past
data shows that 20% of recipients typically respond.
A. What is the probability that exactly 5 recipients respond?
B. What is the probability that at least 2 recipients respond?
C. Find the expected value and variance of the number of recipients who respond.
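• A sketch of this example, treating the number of responses as binomial with n = 10 and p = 0.2 and using P(X = k) = C(n, k) p^k (1 − p)^(n − k). The helper name binomial_pmf is illustrative, not part of any library.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.2

# A. Exactly 5 responses
p_exactly_5 = binomial_pmf(5, n, p)

# B. At least 2 responses = 1 - P(X = 0) - P(X = 1)
p_at_least_2 = 1 - binomial_pmf(0, n, p) - binomial_pmf(1, n, p)

# C. Mean and variance of a binomial random variable
mean = n * p
variance = n * p * (1 - p)

print(round(p_exactly_5, 4))               # 0.0264
print(round(p_at_least_2, 4))              # 0.6242
print(round(mean, 4), round(variance, 4))  # 2.0 1.6
```

Using the complement in part B avoids summing nine separate terms.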
166
167
Class Activity
168
169
Poisson Distribution
• The Poisson distribution is a discrete probability distribution that
expresses the probability of a given number of events occurring in a
fixed interval of time or space, given that these events occur with a
known constant mean rate and independently of the time since the last
event.
170
171
Class Activities 1
1. A sales firm receives, on average, 2 calls per hour on its toll-free number. Find
the probability that it will receive the following.
A. Exactly 1 call in an hour
B. At most 2 calls in an hour
C. At least 3 calls in 2 hours
D. 1 or more calls in 30 minutes
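• One way to approach this activity with the Poisson formula P(X = k) = e^(−λ) λ^k / k!. The mean λ is scaled to the length of the interval (2 per hour, so 4 per 2 hours and 1 per 30 minutes). The helper names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

rate_per_hour = 2

p_a = poisson_pmf(1, rate_per_hour)              # exactly 1 call in an hour
p_b = poisson_cdf(2, rate_per_hour)              # at most 2 calls in an hour
p_c = 1 - poisson_cdf(2, rate_per_hour * 2)      # at least 3 calls in 2 hours
p_d = 1 - poisson_pmf(0, rate_per_hour * 0.5)    # 1 or more calls in 30 minutes

print(round(p_a, 4), round(p_b, 4), round(p_c, 4), round(p_d, 4))
# 0.2707 0.6767 0.7619 0.6321
```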
172
173
Class Activities
1. If there are 200 typographical errors randomly distributed in a 500-page
manuscript, find the probability that a given page contains
exactly 3 errors.
174
175
Normal Distributions
• The normal distribution is a continuous probability distribution characterized by
its bell-shaped curve, which is symmetric about the mean. It is defined by two
parameters:
• Mean (μ): The average of the distribution, which determines the center of the
curve.
• Standard Deviation (σ): Measures the spread or dispersion of the distribution.
176
177
Normal and Skewed Distribution
178
The Standard Normal Distribution
• The standard normal distribution is a normal distribution with a mean
of 0 and a standard deviation of 1.
• All normally distributed variables can be transformed into the standard
normally distributed variable by using the formula for the standard
score: z = (X − μ) / σ
179
Class Activities 1
1. Find the area under the standard normal distribution curve to the left
of z = 1.73.
2. Find the area under the standard normal distribution curve to the
right of z = −1.24.
3. Find the area under the standard normal distribution curve between z
= 1.62 and z = −1.35.
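• These areas are usually read from a z-table. As a cross-check, the sketch below computes them with NormalDist from Python's standard statistics module.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# 1. Area to the left of z = 1.73
area_left = std_normal.cdf(1.73)

# 2. Area to the right of z = -1.24 (complement of the left area)
area_right = 1 - std_normal.cdf(-1.24)

# 3. Area between z = -1.35 and z = 1.62 (difference of two left areas)
area_between = std_normal.cdf(1.62) - std_normal.cdf(-1.35)

print(round(area_left, 4), round(area_right, 4), round(area_between, 4))
# approximately 0.9582 0.8925 0.8589
```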
180
181
182
Homework
1. Find the probability for each. (Assume this is a standard normal
distribution.)
A. P(0 < z < 2.53)
B. P(z < 1.73)
C. P(z > 1.98)
183
Class Activities 2
1. Each month, an American household generates an average of 28
pounds of newspaper for garbage or recycling. Assume the variable
is approximately normally distributed and the standard deviation is 2
pounds. If a household is selected at random, find the probability of
its generating
A. Between 27 and 31 pounds per month
B. More than 30.2 pounds per month
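• A sketch of this activity: each value is first converted to a z score with z = (X − μ) / σ, and the area is then taken from the standard normal distribution. The helper name z_score is illustrative.

```python
from statistics import NormalDist

mu, sigma = 28, 2          # mean and standard deviation in pounds
std_normal = NormalDist()  # standard normal distribution

def z_score(x):
    """Standard score of x for the given mu and sigma."""
    return (x - mu) / sigma

# A. P(27 < X < 31) = P(-0.5 < z < 1.5)
p_between = std_normal.cdf(z_score(31)) - std_normal.cdf(z_score(27))

# B. P(X > 30.2) = P(z > 1.1)
p_more = 1 - std_normal.cdf(z_score(30.2))

print(round(p_between, 4), round(p_more, 4))  # approximately 0.6247 0.1357
```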
184
185
186
187
Example:
• A school wants to know the average test score of all its students. The test scores
have a population mean of 70 and a standard deviation of 20. A sample of 100
students is taken.
A. What is the probability that the average score in this sample is greater than 72?
B. What is the probability that the average score in this sample is less than 69?
C. What is the probability that the average score in this sample is between 68 and
73?
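• A sketch of this example, assuming that with n = 100 the sample mean is approximately normal with mean μ and standard error σ/√n (central limit theorem).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 70, 20, 100

# Standard error of the sample mean: sigma / sqrt(n) = 2
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean
sample_mean = NormalDist(mu, standard_error)

p_a = 1 - sample_mean.cdf(72)                  # P(mean > 72) = P(z > 1)
p_b = sample_mean.cdf(69)                      # P(mean < 69) = P(z < -0.5)
p_c = sample_mean.cdf(73) - sample_mean.cdf(68)  # P(68 < mean < 73)

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
# approximately 0.1587 0.3085 0.7745
```

Note that the standard error (2), not the population standard deviation (20), is used to standardize a sample mean.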
188
189
190
Chapter 5
191
Introduction
• Researchers are interested in answering many types of questions.
• For example,
Automobile manufacturers are interested in determining whether a new type of seat belt
will reduce the severity of injuries caused by accidents.
A physician might want to know whether a new medication will lower a person’s blood
pressure.
An educator might wish to see whether a new teaching technique is better than a
traditional one.
A pharmaceutical company wants to know whether a new drug is more effective than the
standard treatment.
A telecom company wants to predict which customers are likely to leave their service.
192
Cont’d
• These types of questions can be addressed through statistical techniques and
methods for making valid inferences.
• Statistical inference is the process of using data from a sample to make
generalizations (inferences) about a population.
• The two main types of statistical inference are estimation and hypothesis testing.
• Both rely on probability theory to draw conclusions about the population from a
sample.
193
5.1. Estimation
194
A. Point Estimation
195
Desirable Properties of Good Estimators
1. The estimator should be an unbiased estimator. A point estimator is unbiased if its
expected value equals the true value of the parameter being estimated.
2. The estimator should be a relatively efficient estimator. An estimator is efficient if it
has the smallest variance among all unbiased estimators for the parameter.
3. The estimator should be consistent. An estimator is consistent if it converges to the
true parameter value as the sample size increases.
4. The estimator should be sufficient. A statistic is sufficient if it uses all the
information in the data related to the parameter being estimated.
196
B. Confidence Interval Estimation
• A confidence interval provides a range of values within which the population
parameter is likely to lie.
• The interval is associated with a specific confidence level, typically 95% or 99%,
which is the long-run proportion of such intervals that contain the true parameter value.
• A 95% confidence interval means that if we were to take 100 different samples
and compute a confidence interval for each, we would expect about 95 of those intervals to
contain the true population parameter.
197
Cases in Confidence Interval
• When the population standard deviation (σ) is known and n is large (n ≥ 30), then the population
mean is estimated by: x̄ ± z_{α/2} · σ/√n
• When the population standard deviation (σ) is unknown and n is large (n ≥ 30), then the population
mean is estimated by: x̄ ± z_{α/2} · s/√n
• When the population standard deviation (σ) is unknown and n is small (n < 30), then the population
mean is estimated by: x̄ ± t_{α/2, n−1} · s/√n
Note: Even if the sample size is small (n < 30), use the Z-distribution when the population is
normally distributed and σ is known.
198
Cont’d
199
Example
• The Dean of the Business School wants to estimate the mean number of hours worked per week by
students. A sample of 49 students showed a mean of 24 hours with a standard deviation of 4 hours. What
is the population mean?
Solution:
• The value of the population mean is not known. Our best estimate of this value is the sample mean of 24.0
hours. This value is called a point estimate. Because n = 49 is large, the 95 percent confidence interval for the population
mean can be computed as x̄ ± 1.96 · s/√n = 24 ± 1.96 × 4/√49 = 24 ± 1.12.
• The confidence limits range from 22.88 to 25.12. About 95% of similarly constructed intervals
would include the population parameter.
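• The interval above can be reproduced with a few lines of Python. This is a minimal sketch that uses the large-sample z interval with the sample standard deviation in place of σ.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 49, 24.0, 4.0
confidence = 0.95

# Two-sided critical value z_{alpha/2}; about 1.96 for 95% confidence
z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Margin of error = critical value * standard error
margin = z_crit * sample_sd / sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(round(lower, 2), round(upper, 2))  # 22.88 25.12
```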
200
5.2. Hypothesis Testing
• A statistical hypothesis is an assumption or claim about a population parameter which
may or may not be true.
• Hypothesis testing provides a method to evaluate and test these assumptions/claims
regarding population parameters.
• There are two types of hypotheses:
A. The null hypothesis (H0) is the default or initial assumption about a population
parameter. Typically, it represents the idea of "no effect" or "no difference."
B. The alternative hypothesis, or research hypothesis (H1), is the statement the researcher
wants to find evidence for. It contradicts the null hypothesis and suggests there is an “effect” or
a “difference.”
201
Types of Hypothesis Tests
202
Illustration
• Situation A: A medical researcher is interested in finding out whether a
new medication will have any undesirable side effects. The researcher is
particularly concerned with the pulse rate of the patients who take the
medication. The researcher knows that the mean pulse rate for the
population under study is 82 beats per minute. Will the pulse rate
increase, decrease, or remain unchanged after a patient takes the
medication?
• The hypotheses for this situation are
H0: μ = 82 and H1: μ ≠ 82
• This test is called a two-tailed test since the possible side effects of the
medicine could be to raise or lower the pulse rate.
203
• Situation B: A chemist invents an additive to increase the life of an
automobile battery. If the mean lifetime of the automobile battery
without the additive is 36 months, then her hypotheses are
H0: μ = 36 and H1: μ > 36
• In this situation, the chemist is interested only in increasing the lifetime
of the batteries, so her alternative hypothesis is that the mean is greater
than 36 months.
• This test is called right-tailed, since the interest is in an increase only.
204
• Situation C: A contractor wishes to lower heating bills by using a
special type of insulation in houses. If the average of the monthly
heating bills is $78, her hypotheses about heating costs with the use of
insulation are
H0: μ = $78 and H1: μ < $78
• This test is a left-tailed test, since the contractor is interested only in
lowering heating costs.
205
Class Activities
206
207
Types of Errors in Hypothesis Testing
208
Basic Terms in Hypothesis testing
• The test statistic is a standardized value that is calculated from sample data during
a hypothesis test.
• The critical value is a threshold that defines the boundary between the nonrejection
region and the rejection region in hypothesis testing. It is determined based on the
significance level and the specific test being used.
• The critical region, also known as the rejection region, is the range of values for
the test statistic that leads to the rejection of the null hypothesis.
• The noncritical region, also known as the nonrejection region, is the range of
values for the test statistic where the null hypothesis cannot be rejected.
209
• A right-tailed test is a type of hypothesis test where the alternative hypothesis
suggests that the population parameter is greater than the value specified in the
null hypothesis. In this test, the critical region is located in the right tail of the
probability distribution.
• A left-tailed test is a type of hypothesis test where the alternative hypothesis
suggests that the population parameter is less than the value specified in the null
hypothesis. In this test, the critical region is located in the left tail of the
probability distribution.
210
• A two-tailed test is a type of hypothesis test where the alternative hypothesis
suggests that the population parameter is different from a specific value in either
direction.
• This means that the test checks for the possibility of an effect in two directions:
both greater than and less than the value specified in the null hypothesis.
211
Steps in Hypothesis Testing
212
213
214
215
Example
1. A telecom company wants to check if the average SMS delivery time has changed
(either improved or worsened) after a software upgrade. Historically, the average
time to deliver an SMS was 5 seconds. A sample of 100 SMS deliveries shows an
average delivery time of 4.8 seconds with a standard deviation of 0.5 seconds.
A. Construct a 95% CI for average delivery time for all SMS.
B. Test the claim at 5% level of significance.
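• A sketch of this example as a two-tailed z test of H0: μ = 5 against H1: μ ≠ 5. It assumes the large-sample z procedure (n = 100) with the sample standard deviation standing in for σ.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 5.0                     # historical mean delivery time (seconds)
n, x_bar, s = 100, 4.8, 0.5   # sample size, sample mean, sample SD
alpha = 0.05

standard_error = s / sqrt(n)                 # 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96

# A. 95% confidence interval for the mean delivery time
ci = (x_bar - z_crit * standard_error, x_bar + z_crit * standard_error)

# B. Two-tailed test: reject H0 when |z| exceeds the critical value
z_stat = (x_bar - mu0) / standard_error
reject_h0 = abs(z_stat) > z_crit

print(round(ci[0], 3), round(ci[1], 3))  # 4.702 4.898
print(round(z_stat, 2), reject_h0)       # -4.0 True
```

The interval does not contain 5 and |z| = 4 exceeds 1.96, so both parts point to a change in the average delivery time.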
216
217
218
Examples:
A telecom provider wants to test if its 5G network offers faster data
speeds than its 4G network. The average speed for 4G is known to be
100 Mbps. The provider collects data from 50 5G users, showing an
average speed of 110 Mbps with a standard deviation of 15 Mbps.
A. Construct a 99% CI for the average speed of 5G networks.
B. Test the claim at 1% level of significance.
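• A sketch of this example as a right-tailed z test of H0: μ = 100 against H1: μ > 100, decided here with a p-value rather than a critical value. It assumes the large-sample z procedure (n = 50) with s in place of σ.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 100.0                     # known 4G average speed (Mbps)
n, x_bar, s = 50, 110.0, 15.0   # 5G sample size, mean, SD
alpha = 0.01

std_normal = NormalDist()
standard_error = s / sqrt(n)

# A. 99% confidence interval uses the two-sided value z_{alpha/2} (about 2.576)
z_two_sided = std_normal.inv_cdf(1 - alpha / 2)
ci = (x_bar - z_two_sided * standard_error,
      x_bar + z_two_sided * standard_error)

# B. Right-tailed test: p-value is the area to the right of the test statistic
z_stat = (x_bar - mu0) / standard_error
p_value = 1 - std_normal.cdf(z_stat)
reject_h0 = p_value < alpha

print(round(ci[0], 2), round(ci[1], 2))  # 104.54 115.46
print(round(z_stat, 3), reject_h0)       # 4.714 True
```

The p-value is far below 0.01, so the sample supports the claim that 5G is faster than 100 Mbps on average.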
219
220
221
Examples:
• A factory claims the average production time of a device is 15
minutes. A sample of 10 devices shows a mean time of 14 minutes
with a standard deviation of 2 minutes. Test if the production time is
less than 15 minutes at a 5% significance level.
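• A sketch of this example as a left-tailed t test of H0: μ = 15 against H1: μ < 15, since n = 10 is small and σ is unknown. The standard library has no t distribution, so the critical value 1.833 (α = 0.05, 9 degrees of freedom) is taken from a standard t-table.

```python
from math import sqrt

mu0 = 15.0                    # claimed mean production time (minutes)
n, x_bar, s = 10, 14.0, 2.0   # sample size, sample mean, sample SD

# Test statistic: t = (x_bar - mu0) / (s / sqrt(n))
t_stat = (x_bar - mu0) / (s / sqrt(n))
degrees_of_freedom = n - 1    # 9

# One-tailed critical value t(0.05, 9) from a t-table (hard-coded assumption)
t_crit = 1.833

# Left-tailed test: reject H0 only if t falls below -t_crit
reject_h0 = t_stat < -t_crit

print(round(t_stat, 3), reject_h0)  # -1.581 False
```

Since −1.581 is not below −1.833, there is not enough evidence at the 5% level to conclude that the mean production time is less than 15 minutes.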
222
223
224
Class Activities
1. The mean life of a sample of 200 tyres taken from a lot is found to be 40,000 km. Past experience shows
that the standard deviation of tyre life in the lot is 3,200 km.
A. Construct a 95% confidence interval for the mean life of tyres in the lot.
B. Is it reasonable to suppose that the mean life of tyres in the lot is 41,000 km? (At 5% level of significance)
2. A soap manufacturing company was distributing a particular brand through a large number of retail
shops. Before a heavy advertising campaign, the mean sales per week per shop were 140 dozen. After the
campaign, a sample of 49 shops was taken and the mean sales were found to be 147 dozen with SD 16.
A. Construct a 95% confidence interval for the mean weekly sales per shop after the campaign.
B. Can you consider the advertisement effective?
3. An automobile tyre manufacturer claims that the average life of a particular grade of tyre is more than
20,000 km when used under normal driving conditions. A random sample of 16 tyres was tested, and a mean and
SD of 22,000 km and 5,000 km respectively were computed.
A. Construct a 95% confidence interval for the average life of this grade of tyre.
B. At 5% level of significance, decide whether the manufacturer’s claim is supported.
225
226
227
228
Homework
1. A researcher believes that the mean age of medical doctors in a large hospital system is greater than the
average age of doctors in the United States, which is 46. Assume the population standard deviation is 4.2
years. A random sample of 30 doctors from the system is selected, and the mean age of the sample is
48.6. Test the claim at α = 0.05.
2. The Medical Rehabilitation Education Foundation reports that the average cost of rehabilitation for
stroke victims is $24,672. To see if the average cost of rehabilitation is different at a particular hospital,
a researcher selects a random sample of 35 stroke victims at the hospital and finds that the average cost
of their rehabilitation is $26,343. The standard deviation of the population is $3251. At α = 0.01, can it
be concluded that the average cost of stroke rehabilitation at a particular hospital is different from
$24,672?
3. A researcher claims that the average wind speed in a certain city is 8 miles per hour. A sample of 32 days
has an average wind speed of 8.2 miles per hour. The standard deviation of the population is 0.6 miles per
hour. At α = 0.05, is there enough evidence to reject the claim?
229