STT205
GUIDE
STT 205
STATISTICS FOR MANAGEMENT SCIENCES 1
Lagos Office
14/16 Ahmadu Bello Way
Victoria Island, Lagos
e-mail: centralinfo@nou.edu.ng
URL: www.nou.edu.ng
ISBN: 978-978-786-034-2
STT 205 COURSE GUIDE
CONTENTS
Introduction
Course Competencies
Course Objectives
Working through this Course
Course Materials
Course Guide
Modules and Units
References and Further Readings
Presentation Schedule
Course Overview
Assessment
Portfolio
Application of Knowledge Gained
Mini Projects with Presentation
Assignment File
Tutor-Marked Assignments (TMAs)
Final Examination and Grading
How to Get the Most from the Course
Tutors and Tutorials
INTRODUCTION
COURSE COMPETENCIES
COURSE OBJECTIVES
The major aim of the course is to give students a detailed and
simplified, yet standard, approach to understanding the statistical
techniques used in the presentation, description and analysis of
statistical data. It will also expose students to decision-making
techniques in statistical inference. In addition, it will help students
become adequately conversant with the statistical techniques required to
complete their academic programmes successfully, especially the skills
needed to carry out their final-year research project and those required
in future research and development work.
The course is divided into modules and units. The modules are derived
from the course competencies and objectives. The competencies will
guide you on the skills you will gain at the end of this course. So, as you
work through the course, reflect on the competencies to ensure mastery.
The units are components of the modules. Each unit is sub-divided into
introduction, intended learning outcome(s), main content, self-
You are required to read the study units, textbooks and other materials
on the course.
COURSE MATERIALS
COURSE GUIDE
This Course Guide tells you what the course is about, what course
materials you will be using and how you can work your way through
these materials. It suggests some general guidelines for the amount of
time you are likely to spend on each unit of the course in order to
complete it successfully. It also gives you some guidance on your
tutor-marked assignments. Detailed information on the tutor-marked
assignments is found in the separate file.
STUDY UNITS
Each unit consists of a week's direction for study, reading material,
other resources and summaries of key issues and ideas. The units direct
you to work on exercises related to the required readings.
Recommended Textbooks
It is advisable that you have some of the following books:
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
PRESENTATION SCHEDULE
There are twenty-two units in this course. Each unit represents a
particular area to be covered in the study. The weekly activities are
presented in Table 1, while the required hours of study and the
activities are presented in Table 2. This will guide your study time.
You may spend more time on completing each module or unit.
COURSE OVERVIEW
Table 1 displays the units, the number of weeks you should take to
complete them and the assignments, as follows:
ASSESSMENT
There are two types of assessment in this course. The first is the
tutor-marked assignments and the second is a written examination.
At the end of the course, you will need to sit for a written examination
of three hours’ duration. This examination will account for 70% of
your total course mark.
PORTFOLIO
A portfolio has been created for you tagged “My Portfolio”. With the
use of Microsoft Word, state the knowledge you gained in every Module
and in not more than three sentences explain how you were able to apply
the knowledge to solve problems or challenges in your context or how
you intend to apply the knowledge. Use this Table format:
ASSIGNMENT FILE
In this file, you will find the details of the work you must submit to your
tutor for marking. The marks you obtain for these assignments will
count towards the final mark you obtain for this course. Further
information on assignments will be found in the Assignment File itself
and later in this Course Guide in the section on Assessment.
Take the assignment and click on the submission button to submit. The
assignment will be scored, and you will receive feedback.
There are seven tutor-marked assignments for this course, all of which
you are expected to submit. You are encouraged to work all the
questions thoroughly. Each assignment accounts for 8.333% of your
total course mark.
Assignment questions for the units in this course are contained in the
Assignment File. You will be able to complete your assignments from
the information and materials contained in your course material,
textbooks and further readings. However, it is desirable at all degree
levels of education to demonstrate that you have read and researched
more widely than the required minimum. You should use other
references to gain a broader viewpoint and a deeper understanding of
the subject.
The final examination will be of three hours' duration and have a value
of 70% of the total course grade. The examination will consist of
questions which reflect the types of self-testing, practice exercises and
tutor-marked problems you have previously encountered. All areas of
the course will be assessed.
You are advised to use the time between finishing the last unit and
sitting the examination to revise the entire course. You might find it
useful to review your self-tests, tutor-marked assignments and
comments on them before the examination as the final examination
covers information from all parts of the course.
Finally, the examination will help to test the cognitive domain. The test
items will be mostly application and evaluation items that will lead
to the creation of new knowledge or ideas.
The main body of the unit serves as a guide through the required reading
from other sources. This will usually be from either the recommended
textbooks or from a Readings section. Some units require the
undertaking of computer practical work. The purpose of the computing
work is twofold. Firstly, it will enhance the student’s understanding of
the material in the unit and secondly, it will endow students with
practical experience of using programs, which they could well encounter
in their work life.
IMPORTANT INFORMATION;
There are some hours of tutorials provided in support of this course. You
will be notified of the dates, times and location of these tutorials
together with the name and phone number of your tutor, as soon as you
are allocated a tutorial group.
Your tutor will mark and comment on your assignments, keep a close
watch on your progress and on any difficulties you might encounter,
and provide assistance to you during the course. You must mail your
tutor-marked assignments to your tutor before the due date (at least two
working days are required). They will be marked by your tutor and
returned to you as soon as possible.
You should try your best to attend the tutorials. This is the only chance
to have face to face contact with your tutor and to ask questions which
are answered instantly. You can raise any problem encountered in the
course of your study. To gain the maximum benefit from course
tutorials, prepare a question list before attending them. You will learn a
lot from participating in discussions actively.
ONLINE FACILITATION
There will be two hours of online real time contact per week
making a total of 26 hours for thirteen weeks of study time.
At the end of each video conferencing, the video will be
uploaded for view at your pace.
You are to read the course material and do other assignments as
may be given before video conferencing time.
The facilitator will concentrate on main themes.
The facilitator will take you through the course guide in the first
lecture at the start date of facilitation
Read all the comments and notes of your facilitator especially on your
assignments, participate in forum discussions. This will give you
opportunity to socialise with others in the course and build your skill for
teamwork. You can raise any challenge encountered during your study.
To gain the maximum benefit from course facilitation, prepare a list of
questions before the synchronous session. You will learn a lot from
participating actively in the discussions.
7 7 Unit 1 Unit 2
8 Revision
LEARNER SUPPORT
COURSE BLURB
MAIN
COURSE
CONTENTS
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Definition of Statistical Terms
1.3.2 Event
1.3.3 Classes of Event
1.3.4 Elementary Event
1.3.5 Composite Event
1.3.6 Branches of Statistics
1.3.6.1 Descriptive Statistics
1.3.6.2 Inferential Statistics
1.3.7 Statistical Variable
1.3.7.1 Discrete Variables
1.3.7.2 Continuous Variable
1.3.8 Types of Variables:
1.3.8.1 Discrete Variable
1.3.8.2 Continuous Variable
1.3.9 The Importance of Statistics
1.4 Summary
1.5 References/Further Reading/Web Resources
1.1 Introduction
The word Statistics has several uses and meanings; it can be used to
denote numerical data sets, information from experimental units, and so
on. It is also defined differently in various fields like Management
Science Statistics, Biostatistics, Agriculture Statistics, Industrial
By the end of this unit, you will be able to know the three main
objectives of statistics which are:
Estimation,
Prediction, and
Decision making from analysis of properly collected data.
Sample variance:
$$S^2 = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n - 1} \qquad (1.2)$$
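The sample variance formula can be checked numerically with a short sketch like the one below. The data set `scores` and the helper name `sample_variance` are assumptions made only for illustration; they do not come from the course material.

```python
from statistics import variance  # standard-library reference implementation


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical observations, chosen only to illustrate the formula.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
s2 = sample_variance(scores)
print(round(s2, 4))
```

Dividing by n - 1 rather than n is what distinguishes the sample variance from the population variance; the standard-library `variance` function uses the same divisor.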
STT 205 MODULE 1
1.3.2 Event
Both of these are employed in the scientific analysis of data and both
are equally important to students of statistics.
- discrete variables
- continuous variable
Self-Assessment Exercise(s)
1.4 Summary
Unit 1 introduces basic terms and definitions and discusses how and
when statistics are used in research and in real life.
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Sources and Methods of Statistical Data
2.3.1 Overview of Statistical data
2.3.1.1 Definition of Data Sets
2.3.1.2 Definition of terms in Data Sets
2.3.2 Types of Data and Variables
2.3.2.1 Numerical and Non-Numerical Data
2.3.2.2 Deductive and Inductive Statistics
2.3.2.3 Quantitative and Qualitative Data
2.3.3 Sources of Statistical Data
2.3.3.1 Primary Sources
2.3.3.2 Secondary Sources
2.3.4 Method of Collecting Data
2.3.4.1 Direct Observation
2.3.4.2 Mail Questionnaire
2.3.4.3 Personal Interviews
2.4 Summary
2.5 References/Further Reading/Web Resources
2.1 Introduction
Raw Data: Raw data is data that has not been processed for use.
It is important to note that information is the end product
of data processing.
i. numerical data
ii. non-numerical data
iii. Nominal data
Numerical Data: these are data reduced to numbers which are purely
quantitative depending on the type of variable.
Nominal data: They are strictly connected with names and cannot be
reduced to numbers.
Deductive Statistics
If the sample data are only analysed, summarized and presented
without making any inference about the population, we call this
phase of statistics deductive statistics.
Inductive Statistics
If statistical data, obtained from a sample selected from a
population of units of enquiry, are used to draw valid conclusions
about the entire population, we call this phase inductive
statistics or statistical inference.
Statistical data can also be divided into two: quantitative and qualitative
data.
may or may not have been published. Such sources of data are called
primary sources.
2.3.3.2 Secondary Sources: Data for which the investigator is not the
originator/ initiator but which was collected from other published
records, gazettes, books, journals, registers etc. are classified as
secondary sources.
For primary data, collection can be done by using any of the following
methods:
- Direct observation
- Mail questionnaire
- Personal interviews
Simulation of data.
Existing data: These can also be regarded as secondary data, since they
were originally collected and archived, or simply left behind, at an
earlier time and for some other purpose.
Self-Assessment Exercise(s)
2.4 Summary
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Definition of a Questionnaire:
3.3.2 Qualities of a Good Questionnaire
3.3.3 Types of Questionnaire
3.3.4 Parties to a Questionnaire
3.3.4.1 Enumerator
3.3.4.2 Respondent
3.3.4.3 Enumeration (Observation Unit)
3.4 Design of Questionnaire
3.5 Forms of Questionnaire
3.6 Component of a Good Questionnaire
3.7 Advantages and Disadvantages of the Use of Questionnaires
3.8 Summary
3.9 References/Further Reading/Web Resources
3.1 Introduction
By the end of unit 3, you will be able to translate the objectives of the
data collection process into a well conceptualized and methodologically
comprehensive study.
3.3.6 Respondent: This is the person that answers the questions in the
questionnaire.
Example:
1. Through which source did you hear about bank consolidation?
A. Radio
B. Television
C. Newspaper
D. Internet
E. Friends.
Open ended: Here a questionnaire is designed in such a way that
the respondent has to express his or her view on a particular issue
at hand.
3.8 Summary
Unit 3 discussed in detail how to translate the objectives of the data
collection process into a well-conceptualized and methodologically
comprehensive study using questionnaires.
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Statistical Tabulation
1.3.2 Kinds of Tabulation
1.3.3 Types of Tables
1.3.4 Characteristics of a Table
1.3.5 Preparation of a Table
1.3.6 Advantages and Disadvantages of Use of Tables
1.4 Summary
1.5 References/Further Reading/Web Resources
1.1 Introduction
STT 205 MODULE 2
A statistical table has some general features no matter the purpose for
which the table is constructed. Some of these are:
SELF-ASSESSMENT EXERCISE(S)
1.4 Summary
When data are collected and put into numerical form, they do not seem
to be meaningful until they are summarized, presented in tables, grouped
into categories or frequencies, or prepared as charts and diagrams, and
summary calculations are made, which is what we achieved in this unit.
The main focus of this unit is to simplify the details given in a mass
of data into such a form that the main features are brought out and
the assembled data are easily understood.
UNIT 2 FREQUENCY
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Frequency Distribution
2.3.2 Group Frequency Distribution
2.3.2.1 Class Interval and Class Limits
2.3.2.2 Class Boundary (True Class Limit)
2.3.2.3 Class Size (Class Width) of a Class Interval
2.3.2.4 The Class Mark
2.3.3 Relative Frequency
2.3.3.1 Cumulative Frequency
2.3.3.2 Cumulative Frequency Distribution
2.4 Histogram
2.5 Frequency Polygon
2.6 Cumulative Frequency Polygon (Ogive Curve)
2.7 Summary
2.8 References/Further Reading/Web Resources
2.1 Introduction
This unit covers one of the basic uses of statistics, which is
organizing raw data into a simpler, more useful and more understandable
form by creating a frequency distribution. The unit also introduces
how to graph statistical information.
Example: The following are the ages of children in Christ the King
Choir: 12, 8, 9, 12, 15, 10, 9, 12, 10, and 12.
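The ages above can be tallied into a frequency distribution with a few lines of code. The sketch below uses the data from the example; the variable names are chosen only for illustration.

```python
from collections import Counter

# Ages of the children in the choir, as listed in the example.
ages = [12, 8, 9, 12, 15, 10, 9, 12, 10, 12]

# Counter tallies how many times each age occurs (its frequency).
frequency = Counter(ages)

# Sort by age so the result reads like a frequency table.
table = sorted(frequency.items())

print("Age  Frequency")
for age, count in table:
    print(f"{age:>3}  {count:>9}")
```

The frequencies add up to 10, the number of children, which is a quick check that no observation was missed in the tally.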
Sometimes the data are so numerous that it becomes practically
impossible to put them in a simple table as in the example above. They
are then put into specified groups known as class intervals. The
frequencies can conveniently be found through this group arrangement,
in which sets of numbers belong to a particular group or class. Class
size is the number of values in a class interval.
Any symbol defining a class, such as 21-30, 31-40 and so on, as in table
2.2, is called the class interval. In the first class interval, the value
21 is the lower class limit while 30 is the upper class limit. A class
interval which lacks either a lower or an upper class limit is said to
be an open class interval.
They are chosen such that no observation in the raw data being grouped
takes their value. In practice, a class boundary is obtained by taking
the average of the upper limit of a class and the lower limit of the
next class, for example the average of the upper limit of the first
class and the lower limit of the second class. Once one class boundary
is obtained, the boundaries of the other classes can easily be obtained
by repeatedly adding or subtracting the class size. For data recorded in
whole numbers, this amounts to subtracting 0.5 from each lower class
limit and adding 0.5 to each upper class limit, as shown in table 3.3.
$$\text{class mark} = \frac{44 + 48}{2} = 46$$
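As a minimal sketch, class boundaries and class marks can be computed from the class limits as follows. The intervals 21-30 and 31-40 come from the text; the third interval 41-50 and the function names are assumptions added for illustration.

```python
def class_boundaries(limits, gap=1):
    """Return (lower boundary, upper boundary) for each (lower, upper) limit pair.

    `gap` is the distance between the upper limit of one class and the
    lower limit of the next (1 for whole-number data such as 30 and 31).
    """
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


def class_marks(limits):
    """Return the midpoint of each class: (lower limit + upper limit) / 2."""
    return [(lower + upper) / 2 for lower, upper in limits]


# 21-30 and 31-40 are from the text; 41-50 is an assumed continuation.
limits = [(21, 30), (31, 40), (41, 50)]
boundaries = class_boundaries(limits)
marks = class_marks(limits)

print(boundaries)
print(marks)
print(class_marks([(44, 48)]))  # the class mark worked out above
```

Note that the upper boundary of each class equals the lower boundary of the next, so no gap is left between adjacent classes.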
2.3.3 Relative Frequency
Using the data in table 3.3, we can generate the cumulative frequency
table.
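Relative and cumulative frequencies follow mechanically from the class frequencies. The sketch below uses hypothetical classes and frequencies, not the values of table 3.3, purely to show the calculation.

```python
from itertools import accumulate

# Hypothetical grouped data, assumed only for illustration.
classes = ["21-30", "31-40", "41-50", "51-60"]
frequencies = [4, 6, 7, 3]

total = sum(frequencies)

# Relative frequency: the share of all observations falling in each class.
relative = [f / total for f in frequencies]

# Cumulative frequency: the running total of the frequencies.
cumulative = list(accumulate(frequencies))

for name, f, r, c in zip(classes, frequencies, relative, cumulative):
    print(f"{name:>6} {f:>3} {r:>6.2f} {c:>3}")
```

Two checks are useful in practice: the relative frequencies should sum to 1, and the last cumulative frequency should equal the total number of observations.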
2.4 Histogram
be closed to the horizontal axis by considering the next lower and higher
class mark by having zero frequencies.
For this frequency distribution to be useful the class limits will have to
be changed to class boundaries and we make columns for the midpoints,
Figure 3.1: Frequency distribution plot for the number of kilometres that
the employees travelled to work each day (horizontal axis: distance,
with class boundaries from 0.5-3.5 to 15.5-18.5).
FREQUENCY POLYGON
[Figure: frequency polygon, "Commuting Distance of Employees"; horizontal axis: Distance; vertical axis: Number of Employees]
[Figure: cumulative frequency polygon; horizontal axis: Distance; vertical axis: Cumulative Number of Employees]
SELF-ASSESSMENT EXERCISE(S)
2.7 Summary
This unit covers one of the basic uses of statistics, which is organizing
raw data into a simpler, more useful and more understandable form by
creating a frequency distribution, histogram, frequency polygon and
cumulative frequency polygon/curve.
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Pictorial Diagram
3.3.2 Pictogram
3.3.1.2 Block Diagram
3.3.1.3 Scattered Diagram
3.3.3 Graph
3.3.2.1 Characteristics of a Graph
3.3.2.2 Line Graph
3.3.2.3 Series Graph
3.3.4 Charts
3.3.4.1 Pie Chart
3.3.4.2 Band Curve Chart
3.3.4.3 Bar Chart
3.3.5 Types of Bar Chart
3.3.6 Simple Bar Chart
3.3.7 Component Bar Chart
3.3.8 Percentage Component Bar Chart
3.3.8.1 Multiple Bar Chart
3.4 Summary
3.5 References/Further Reading/Web Resources
3.1 Introduction
3.3.2 Pictogram
A block broken down into the various sections that make up the data.
The steps involved are as follows:
Step I: Choose a convenient scale that can show clearly the total figure.
Step II: Use the scale chosen in step I to find the size of each section
and the total.
Step III: Draw a block of the size of the total figure and mark each
section of it.
Figure 3.4: Illustration of a scatter diagram
3.3.3 Graph
[Figure: line graph of Series1 for categories A, B, C and D; vertical axis from 0 to 300]
When several line graphs are drawn in one plot, as in the graph below,
the result is called a series graph.
[Figure: series graph, "Trends of Financial Inclusion Indicators"; horizontal axis: years 1992 to 2015; vertical axis from -100 to 450]
3.3.4 Charts
3.3.4.1 Pie Chart: A pie chart is simply a circle divided into sectors.
It is a circular diagram in which each member of the data is represented
by a sector. The circle represents the total of the data being
represented and each sector is drawn in proportion to its relative size.
The angles of the sectors are proportional to the frequencies of the
numbers, objects or classes.
Discussion
Solution:
To construct a pie chart, assign one sector of a circle to each category.
The angle of each sector should be proportional to the proportion of
measurements (or relative frequency) in that category. Since a circle
contains 360°, the angle of each sector is the relative frequency of its
category multiplied by 360°.
[Figure: pie chart with sectors A, B, C and D; sector shares 3%, 9%, 23% and 65%]
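The sector angles can be worked out with a short sketch such as the one below. The four percentages are those shown on the pie chart, but which category carries which percentage is an assumption made here for illustration, as is the function name.

```python
def sector_angles(frequencies):
    """Angle of each sector: relative frequency multiplied by 360 degrees."""
    total = sum(frequencies.values())
    return {name: f / total * 360 for name, f in frequencies.items()}


# Percentages from the chart; the pairing with A-D is assumed.
shares = {"A": 65, "B": 23, "C": 9, "D": 3}
angles = sector_angles(shares)

for name, angle in angles.items():
    print(f"{name}: {angle:.1f} degrees")
```

Because the relative frequencies sum to 1, the sector angles always sum to 360°, which is a convenient check before drawing the chart.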
Discussion
Simple Bar Chart: A simple bar chart is a block diagram in which simple
bars are used to denote magnitudes. It is another name for a block
diagram and there is no difference between the two, in the sense that the
same steps are involved in their construction. Simple bar charts are used
for comparison over many years. The diagram is drawn to scale, with one
bar for each year and the bars separated by gaps. The time is shown on
the horizontal axis and the other variable (quantity or value) is shown
on the vertical axis.
Figure 3.8: Bar Chart for the number of miles that the employees of a
large department store travelled to work each day (horizontal bars, one
per distance class from 0.5-3.5 to 15.5-18.5; bar length: number of
employees).
Discussion
Step I: From the raw data, obtain the highest and the lowest values and
then the range = the highest value – the lowest value.
Step II: Determine the number of classes for the data range at class
width C, for equal class intervals: Number of classes = Range / C.
Step III: Obtain the first class interval or class boundary which contains
the lowest value. The other classes are obtained by adding the class
size, C, to the lower and upper limits, starting from the first class,
until we get a class which will absorb the highest value.
Step IV: Tally the data into the classes they belong to, to obtain the
corresponding class frequencies.
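Steps I to IV can be sketched in code as follows. This is a minimal illustration for whole-number data, assuming the first class starts at the lowest value; the data set `distances` and the function name are hypothetical.

```python
def grouped_frequency(data, width):
    """Group whole-number data into equal classes of the given width."""
    # Step I: highest value, lowest value and range.
    lowest, highest = min(data), max(data)
    value_range = highest - lowest

    # Steps II and III: start at the lowest value and keep adding the
    # class width until a class absorbs the highest value.
    classes = []
    lower = lowest
    while True:
        upper = lower + width - 1  # inclusive upper limit for integers
        classes.append((lower, upper))
        if upper >= highest:
            break
        lower += width

    # Step IV: tally each observation into its class.
    counts = {c: 0 for c in classes}
    for x in data:
        counts[classes[(x - lowest) // width]] += 1
    return value_range, counts


# Hypothetical commuting distances, assumed only for illustration.
distances = [1, 2, 6, 7, 12, 13, 2, 6, 9, 5, 18, 7, 3, 15, 15, 4, 17, 1, 14, 5]
value_range, counts = grouped_frequency(distances, width=3)

for (lower, upper), f in counts.items():
    print(f"{lower}-{upper}: {f}")
```

With a class width of 3 the classes come out as 1-3, 4-6 and so on up to 16-18, whose class boundaries are 0.5-3.5 up to 15.5-18.5.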
In addition to the steps, the following must be considered;
This chart consists of bars which are sub-divided into two or
more parts.
The length of the bars is proportional to the totals.
The component bars are shaded or coloured differently.
Example:
(a) Component Bar Chart:
For example, the imports and exports of Nigeria can be shown in a
multiple bar chart.
Self-Assessment Exercise(s)
1. What is a Block Diagram and what are the steps for constructing
it?
2. Define the following: Line Graph, series graph, Pie Chart, Band
Curve Chart, Component Bar Chart, Percentage Component Bar
Chart, Multiple Bar Chart and Scattered Diagram.
3. What do you understand by pictorial diagram, Pictogram, Simple
Bar Diagram/Chart, Bar Chart for ungroup data, Bar Chart for
Group data, graph, and charts?
4. What are the Characteristics of a Graph?
5. What are the steps in constructing a Pie Chart, Band Curve Chart
and Bar Chart?
3.4 Summary
Unit Structure
4.1 Introduction
4.2 Intended Learning Outcomes (ILOs)
4.3 Main Content
4.3.1 Ratios
4.3.2 Percentages
4.3.3 Random Numbers
4.3.2.1 A Random Number Table
4.3.2.2 Use of Table of Random Numbers
4.3.2.3 Procedure for the Use of Table of Random Numbers
4.3.2.4Advantages and Disadvantages of Random Numbers
4.4 Summary
4.5 References/Further Reading/Web Resources
4.1 Introduction
Unit 4 compares quantities effectively and checks what the values that we
get can tell us about the magnitude of these quantities.
4.3.1 Ratios: Ratios are fractions which express variation in the data
irrespective of the actual or absolute size of the data, while
percentages are ratios expressed with hundred (100) as the denominator,
which is usually not written.
Example 3.1: If the total expenditure is N36, out of which N18 was spent
on food, then the ratio of food to total expenditure is
$$\text{Ratio} = \frac{18}{36} = \frac{1}{2}$$
This kind of ratio is usually referred to as a proportion.
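The same ratio can be reduced to its lowest terms and turned into a percentage with a short sketch. The figures are those of the example; the variable names are chosen for illustration.

```python
from fractions import Fraction

food = 18   # amount spent on food (N18)
total = 36  # total expenditure (N36)

# Fraction reduces 18/36 to its lowest terms automatically.
ratio = Fraction(food, total)

# A percentage is the same ratio expressed with 100 as the denominator.
percentage = float(ratio) * 100

print(ratio)       # the proportion of expenditure spent on food
print(percentage)  # the same proportion as a percentage
```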
4.3.2 Percentage
Recall:
the table are independent. A table of random digits can readily be used
to select a simple random sample by the following procedure. Consider
the frame of lawyers who were members of the state Bar Association as of
last month. For convenience, we assign the lawyers in the frame the
numbers 001, ..., 950. Since the frame contains 950 elements, we select
3-digit numbers. Starting from the leftmost column, suppose we use the
first 3 digits in each line as a 3-digit number; the procedure is then
as in the next section.
4.3.2.3 Procedure for the use of Table of Random Numbers:
Draw a sample of size n = 20 from a population of 950. Since the
population numbers have 3 digits (001, 002, 003, ..., 950), 3-digit
random numbers are read from the table. Lawyer number 231 is the 1st
element for the sample.
A number over 950 would be disregarded, thus each of the 950
lawyers has an equal probability of being a sample element.
Lawyer 005 is the second element for the sample.
Also, a repeat of number 231 would be disregarded since that lawyer
is already in the sample, thus each of the 948 remaining lawyers
has an equal probability of being the 3rd sample element.
The above procedure is repeated until the required number of
lawyers for the sample is selected.
The 20 lawyers are:
231, 055, 455, 070, 003, 949, 555, 647, 777, 090, 047, 777, 701, 694,
256, 901, 162, 295, 007, 706.
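The selection rule described above (discard numbers outside the frame, discard repeats, stop when the sample is full) can be sketched as follows. The stream of 3-digit numbers is hypothetical and stands in for values read from a random number table; the function name is also an assumption.

```python
def select_sample(random_numbers, population_size, sample_size):
    """Pick distinct labels in 1..population_size from a stream of numbers."""
    sample = []
    for number in random_numbers:
        # Disregard numbers that label nobody in the frame (000 or > 950).
        if number < 1 or number > population_size:
            continue
        # Disregard a number whose element is already in the sample.
        if number in sample:
            continue
        sample.append(number)
        if len(sample) == sample_size:
            break
    return sample


# Hypothetical 3-digit numbers, as if read line by line from a table.
stream = [231, 55, 989, 231, 455, 0, 70, 3]
chosen = select_sample(stream, population_size=950, sample_size=5)

# Print with leading zeros to match the 001..950 labelling.
print([f"{n:03d}" for n in chosen])
```

In this stream, 989 is skipped because it exceeds 950, the second 231 is skipped as a repeat, and 000 is skipped because no lawyer carries that number.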
4.4 Summary
Some calculations that can be carried out after tabulations of data are
usually division of a component of the table by another component of
the total. Some other commonly used comparative facts include ratios
and percentages.
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Measures of Location
1.3.2 Arithmetic Mean
1.3.3 Basic Arithmetic Mean
1.3.4 Mean of Group Data
1.3.5 Computation of Mean Using Change of Origin
1.4 Mode
1.5 Median
1.5.1 Median for Grouped Data
1.5.2 Median for Ungrouped Data
1.5.3 Median for a Set of Ungrouped Data
1.5.4 Median for a Set of Grouped Data
1.6 Weighted Mean (Xw)
1.7 Geometric Mean
1.8 Harmonic Mean
1.9 Summary
1.10 References/Further Reading/Web Resources
1.1 Introduction
These are useful in describing statistical data and comparing one group
of data with another. The most commonly known measures of central
tendency are the arithmetic mean, median, mode, quartiles, deciles,
percentiles, harmonic mean, and geometric mean. In all, we estimate the
values of measure of central tendency.
STT 205 MODULE 3
- Arithmetic Mean:
- Mode
- Median
Arithmetic mean may be defined as the sum of all the values of the
items divided by the number of items.
The population mean (theoretical mean) is represented by µ (mu),
which is given by
$$\mu = \frac{\sum x_i}{N} \qquad (3.1)$$
If a sample of size (n) is taken from the population then the sample
mean is represented by $\bar{X}$:
$$\bar{X} = \frac{\sum x_i}{n} \qquad (3.2)$$
$$\bar{X} = \frac{\sum_{i=1}^{N} X_i}{N} \qquad (3.3)$$
The mean of the numbers 1, 2, 3, 4 and 5 is $\frac{15}{5} = 3$.
Definition II: If the numbers X1, X2, ..., Xk occur with associated
frequencies f1, f2, ..., fk then
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i} \qquad (3.4)$$
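Both definitions of the arithmetic mean can be tried out with a short sketch. The first call uses the numbers 1 to 5 from the text; the values and frequencies in the second call are assumed for illustration, and the function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean: sum of the values divided by the number of values."""
    return sum(values) / len(values)


def frequency_mean(values, frequencies):
    """Mean when each value X_i occurs with frequency f_i."""
    weighted_sum = sum(f * x for x, f in zip(values, frequencies))
    return weighted_sum / sum(frequencies)


simple = mean([1, 2, 3, 4, 5])

# Assumed values and frequencies, used only to illustrate Definition II.
values = [8, 9, 10, 12, 15]
frequencies = [1, 2, 2, 4, 1]
weighted = frequency_mean(values, frequencies)

print(simple)
print(weighted)
```

The frequency form gives the same answer as writing out every observation and applying the basic definition, since multiplying by f is just repeated addition.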
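$$\bar{X} = A + \left(\frac{\sum_{i=1}^{k} f_i U_i}{\sum_{i=1}^{k} f_i}\right) \times C, \quad \text{where } U_i = \frac{X_i - A}{C} \qquad (3.5)$$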
Suppose that a given set of data are grouped such that each item $x_i$
(where $i = 1, 2, 3, \dots, n$) occurs with frequency $f_i$ (where
$i = 1, 2, \dots, k$); then the mean is
$$\bar{X} = \frac{\sum f_i x_i}{\sum f_i} \qquad (3.6)$$
5. Add the correction factor to the mean in order to obtain the exact
mean for the distribution using the formula:
$$\bar{X} = A + \frac{\sum f_i (x_i - A)}{\sum f_i} \qquad (3.10)$$
$$\bar{X} = A + \frac{\sum f_i V_i}{\sum f_i} \qquad (3.11)$$
where $U_i$ = 0, +1 or -1, +2 or -2, and so on, if the middle class mark
is taken as the assumed mean.
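A quick way to convince yourself that the assumed-mean (change of origin) method gives the exact mean is to compute both and compare. The sketch below applies the coding formula with U = (X - A)/C; the class marks, frequencies, assumed mean and function name are all assumptions made for illustration.

```python
def coded_mean(class_marks, frequencies, assumed_mean, class_width):
    """Mean by change of origin and scale: A + (sum(f*U) / sum(f)) * C."""
    # U_i = (X_i - A) / C measures each class mark in class widths from A.
    coded = [(x - assumed_mean) / class_width for x in class_marks]
    correction = sum(f * u for f, u in zip(frequencies, coded)) / sum(frequencies)
    return assumed_mean + correction * class_width


# Hypothetical grouped data: class marks of classes 1-3, 4-6, ..., 16-18.
class_marks = [2, 5, 8, 11, 14, 17]
frequencies = [5, 5, 3, 1, 4, 2]

# Direct calculation for comparison.
direct = sum(f * x for x, f in zip(class_marks, frequencies)) / sum(frequencies)

# Any class mark may serve as the assumed mean A; 11 is chosen here.
coded = coded_mean(class_marks, frequencies, assumed_mean=11, class_width=3)

print(direct, coded)
```

The choice of assumed mean affects only the size of the intermediate numbers, not the final answer, which is why a middle class mark is usually the most convenient choice for hand calculation.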
1.3.6 Mode
The mode for an ungrouped set of numbers is that value which occurs
with the highest frequency. The mode, when it exists, may not always be
unique.
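For a set of grouped data, the modal class (which is the class with the
highest frequency) can be clearly ascertained, but the value of the mode
is not so obvious. It can be calculated by:
i. Interpolation within the modal class.
ii. Graphic interpolation from the histogram
The mode is located within the modal class by considering the frequencies
of the two classes adjoining the modal class:
$$X_0 = L + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times w \qquad (3.16)$$
where L is the lower boundary of the modal class, $\Delta_1$ and
$\Delta_2$ are the differences between the frequency of the modal class
and the frequencies of the classes immediately before and after it, and
w is the width of the modal class.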
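As an illustration, interpolation within the modal class can be sketched as below. The grouped data, the parameter names and the function name are hypothetical, and the formula is the usual interpolation form, with the differences taken against the classes just before and just after the modal class.

```python
def grouped_mode(lower_boundary, width, f_modal, f_before, f_after):
    """Mode by interpolation within the modal class."""
    d1 = f_modal - f_before  # excess over the class before the modal class
    d2 = f_modal - f_after   # excess over the class after the modal class
    return lower_boundary + d1 / (d1 + d2) * width


# Hypothetical distribution: classes 21-30, 31-40, 41-50, 51-60 with
# frequencies 4, 6, 7, 3. The modal class is 41-50 (frequency 7),
# whose lower boundary is 40.5 and whose width is 10.
mode = grouped_mode(lower_boundary=40.5, width=10, f_modal=7, f_before=6, f_after=3)

print(mode)
```

The mode is pulled towards the neighbouring class with the larger frequency; here the class before the modal class is the larger neighbour, so the mode lies in the lower part of the modal class.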
1.4 Median
For a set of ungrouped data, the median is the value in the middle or the
mean of the two middle values (depending whether the number of values
is odd or even) when the values are arranged in their order of magnitude.
In other words, median is that value which divides the distribution into
two equal parts.
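Symbolically, when the sample size (n) is an odd number,
$$X_m = X_{\frac{n+1}{2}} \qquad (3.15)$$
When the sample size (n) is an even number,
$$X_m = \frac{1}{2}\left(X_{\frac{n}{2}} + X_{\frac{n}{2}+1}\right), \text{ when } n \text{ is even} \qquad (3.16)$$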
$\frac{N}{2}$ = median term
C = cumulative frequency of the class preceding the median class
F = frequency of the median class
W = width of the median class
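Using the quantities just defined, the usual interpolation for the median of grouped data is L + ((N/2 - C) / F) x W, where L, the lower boundary of the median class, is a symbol assumed here. The sketch below applies it to hypothetical data; the function name is also an assumption.

```python
def grouped_median(boundaries, frequencies):
    """Median of grouped data by interpolation within the median class."""
    half = sum(frequencies) / 2  # the median term N / 2
    cumulative = 0               # C: cumulative frequency before the class
    for (lower, upper), f in zip(boundaries, frequencies):
        if cumulative + f >= half:
            # This is the median class: F = f, W = upper - lower.
            return lower + (half - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("the distribution is empty")


# Hypothetical class boundaries and frequencies, for illustration only.
boundaries = [(20.5, 30.5), (30.5, 40.5), (40.5, 50.5), (50.5, 60.5)]
frequencies = [4, 5, 8, 3]
med = grouped_median(boundaries, frequencies)

print(med)
```

Here N = 20, so the median term is 10; the cumulative frequencies are 4, 9, 17 and 20, which places the median in the third class.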
W i
3.20
X GPA
X G i i
3.21
W i
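$$\bar{X}_G = (x_1 \times x_2 \times \dots \times x_n)^{\frac{1}{n}} \qquad (3.22)$$
When log is used in computation, the formula below is used.
$$\log \bar{X}_G = \frac{1}{n} \sum \log x_i \qquad (3.23)$$
$$\bar{X}_G = \text{Antilog}\left(\frac{1}{n} \sum \log X_i\right) \qquad (3.24)$$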
n fi 3.25
$$\bar{X}_G = \text{Antilog}\left(\frac{\sum f_i \log x_i}{\sum f_i}\right) \qquad (3.27)$$
XH
f i
3.29
1
( x )
i
$$\bar{X}_H = \frac{\sum f_i}{\sum \frac{f_i}{x_i}} \qquad (3.30)$$
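The logarithm route to the geometric mean and the frequency form of the harmonic mean can be sketched as follows. The data and function names are assumptions for illustration; base-10 logarithms are used to mirror the log and antilog wording, although any base gives the same result.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the logarithms of the values."""
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log  # antilog in base 10


def harmonic_mean(values, frequencies):
    """Sum of frequencies divided by the sum of f_i / x_i."""
    return sum(frequencies) / sum(f / x for x, f in zip(values, frequencies))


# Hypothetical positive data; both means require values greater than zero.
g = geometric_mean([2, 8])
h = harmonic_mean([2, 4, 8], [1, 2, 1])

print(round(g, 6), round(h, 6))
```

For the same positive data the harmonic mean never exceeds the geometric mean, which in turn never exceeds the arithmetic mean.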
Self-Assessment Exercise(s)
1.7 Summary
In this unit, measures that are useful in describing statistical data and
in comparing one group of data with another were discussed, such as the
arithmetic mean, median, mode, quartiles, deciles, percentiles, harmonic
mean, and geometric mean.
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Definition of Fractiles
2.3.2 Quartiles
2.3.3 Percentiles
2.3.4 Deciles
2.3.5 Moments
2.3.6 Skewness
2.3.7 Kurtosis
2.4 Summary
2.5 References/Further Reading/Web Resources
2.1 Introduction
Definition of Fractiles
Fractiles are the measures of partition and the three measures of partition
are the Quartile, Decile and the Percentile.
i. First or lower quartile (Q1) is the value below which 25% of the data
can be found.
ii. Second quartile (Q2) is the value below which 50% of the data lie.
iii. Third or upper quartile (Q3) is the value below which 75%
of the data can be found.
2.3.2 Quartiles
These measures are obtained by adjusting the median formula into four
equal parts to locate the appropriate quartile. The quartile is obtained by
dividing the set of data into four equal parts.
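For ungrouped data, the three quartiles can be obtained with the standard-library `statistics.quantiles` function, as in the sketch below. The data set is hypothetical. Note that several conventions exist for interpolating quartiles, so other software or a hand method may give slightly different values for the same data.

```python
from statistics import median, quantiles

# Hypothetical ungrouped data, assumed only for illustration.
data = [1, 2, 3, 4, 5, 6, 7]

# n=4 asks for the cut points that divide the data into four equal parts.
q1, q2, q3 = quantiles(data, n=4)

print(q1, q2, q3)
```

The second quartile coincides with the median, since both divide the distribution into two equal parts.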
2.3.3 Percentiles
The percentiles divide the data into one hundred parts and we have
P1, P2, ..., P99.
The Kth percentile class is located where we have the cumulative
frequency of
$$\frac{K \times N}{100} \qquad (3.2)$$
2.3.4 Deciles
The deciles divide the set into ten equal parts, giving D1, D2, ..., D9.
The jth decile is located at the point where the cumulative frequency
is
\[ \frac{j \times N}{10} \]
The formula then becomes
\[ D_j = L_{D_j} + \left( \frac{\frac{j \times N}{10} - \sum f_{D_j - 1}}{f_{D_j}} \right) \times C \qquad (3.3) \]
i. The first moment is the mean, which is also known as the average.
ii. The second moment (about the mean) is the variance. It is
pertinent to note that the standard deviation is the square root of the
variance: an indication of how closely the values are spread about
the mean. A small standard deviation means the values are all
similar. If the distribution is normal, about 68% of the values will be
within 1 standard deviation of the mean.
Self-Assessment Exercise(s)
1. What are fractiles?
2. Explain the difference between the first (lower) quartile, the second
quartile and the third (upper) quartile.
3. What do you understand by quartiles, percentiles and moments?
4. Write the difference between skewness, kurtosis and deciles.
5. Write the formulas for quartiles, percentiles, deciles, moments,
skewness and kurtosis, and explain their parameters.
2.4 Summary
Deciles and the Percentile. We also have the moments, skewness and
kurtosis, which describe the spread and shape of statistical distributions.
In summary,
Quartile: divides the data into 4 equal parts;
Deciles: divides the data into 10 equal parts;
Percentile: divides the data into 100 equal parts.
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Definition of Dispersion
3.3.2 Properties of a Good Measure of Dispersion
3.3.3 Types of Dispersion
3.3.4 Methods of Dispersion
3.3.5 Mathematical Methods
3.4 Summary
3.5 References/Further Reading/Web Resources
3.1 Introduction
3.3.3.2 Relative
- Quartile Deviation
- Average Deviation
- Standard deviation and coefficient of variation.
Coefficient of Range:
This is a relative measure of dispersion and is based on the value
of the range. It is also called the range coefficient of dispersion. It is
defined as:
\[ \text{Coefficient of Range} = \frac{L - S}{L + S} \qquad (3.2) \]
where L represents the largest value in a distribution and
S represents the smallest value in a distribution.
We can understand the computation of the range with the help of examples
of different series.
In the example, the highest mark is 8 and the lowest
mark is 1. Therefore, we can calculate the range:
L = 8 and S = 1
Absolute Range = L – S = 7 marks
Discrete Series
Continuous Series
Table 3.2: Frequencies
Marks of the Students in Statistics (X)    No. of students (F)
10-15 (Smallest = 10)                      4
15-20                                      10
20-25                                      26
25-30 (Largest = 30)                       8
Absolute Range = L – S = 30 – 10 = 20 marks
The concept of range is useful in the field of quality control and to study
the variations in the prices of the shares etc.
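The absolute range and the coefficient of range can be reproduced with a few lines of code. This is a minimal sketch using the largest and smallest marks quoted in the examples; the function names are illustrative only.

```python
def absolute_range(largest, smallest):
    """Absolute range: L - S."""
    return largest - smallest

def coefficient_of_range(largest, smallest):
    """Relative measure: (L - S) / (L + S)."""
    return (largest - smallest) / (largest + smallest)

# Continuous series: smallest class limit 10, largest class limit 30
rng = absolute_range(30, 10)
coeff = coefficient_of_range(30, 10)

# Discrete series: highest mark 8, lowest mark 1
rng_discrete = absolute_range(8, 1)
```

Because the coefficient is a ratio, it has no units and can be used to compare the spread of series measured on different scales.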
The concept of quartile deviation takes into account only the
values of the upper quartile (Q3) and the lower quartile (Q1).
It is half of the inter-quartile range (Q3 – Q1) and is therefore also
called the semi-inter-quartile range. It is a better
method when we are interested in knowing the range within which
a certain proportion of the items fall.
Case Studies
Calculation of Inter-quartile Range, Semi-quartile Range and
Coefficient of Quartile Deviation in case of Raw Data
Suppose the values of X are : 20, 10, 18, 25, 32, 10
In the case of quartile deviation, it is necessary to calculate the values of Q1
and Q3.
Number of items = 6
Therefore,
Inter-quartile range = Q3 – Q1 = 26.75 – 10.50 = 16.25
\[ \text{Semi-quartile range} = \frac{Q_3 - Q_1}{2} = 8.125 \]
\[ \text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \]
More examples on the Coefficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1):
Coefficient of Quartile Deviation = (94.5 – 57.25) / (94.5 + 57.25) = 0.2455
Coefficient of Quartile Deviation = (31.25 – 13.50) / (31.25 + 13.50) = 0.397
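The three quartile-based measures follow directly from Q1 and Q3. The sketch below takes the quartile values quoted in the case study as given (it does not recompute them from the raw data); the function name is hypothetical.

```python
def quartile_measures(q1, q3):
    """Return inter-quartile range, semi-quartile range and coefficient of quartile deviation."""
    iqr = q3 - q1
    return {
        "inter_quartile_range": iqr,
        "semi_quartile_range": iqr / 2,
        "coefficient": iqr / (q3 + q1),
    }

case_study = quartile_measures(10.50, 26.75)
example_two = quartile_measures(57.25, 94.5)
example_three = quartile_measures(13.50, 31.25)
```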
cases.
- Absolute percentile range = P90 – P10.
- Coefficient of percentile range = (P90 – P10) / (P90 + P10).
This method of calculating dispersion can be applied generally in the case of
open-end series where the importance of extreme values is not
considered.
Average Deviation
Standard Deviation
The standard deviation, which is denoted by the Greek letter σ (read as sigma),
is extremely useful in judging the representativeness of the mean. The
concept of standard deviation, which was introduced by Karl Pearson,
has practical significance because it is free from the defects that
exist in the range, quartile deviation and average deviation.
Standard deviation is calculated as the square root of the average of the squared
deviations taken from the actual mean. It is also called the root mean square
deviation. The square of the standard deviation, i.e. σ², is called the ‘variance’.
There are four ways of calculating standard deviation for raw data:
- When actual values are considered;
- When deviations are taken from actual mean;
- When deviations are taken from assumed mean; and
- When ‘step deviations’ are taken from assumed mean.
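Two of the methods listed above can be compared in code: deviations taken from the actual mean, and deviations taken from an assumed mean. The sketch divides by n, matching the "average of squared deviations" definition given in this unit; the function names and the assumed mean of 20 are illustrative choices, and the data are the raw values used in the quartile case study.

```python
import math
from statistics import pstdev

def sd_from_actual_mean(values):
    """Square root of the average squared deviation from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

def sd_from_assumed_mean(values, assumed_mean):
    """Same quantity using deviations d = x - A from an assumed mean A."""
    n = len(values)
    d = [x - assumed_mean for x in values]
    return math.sqrt(sum(di ** 2 for di in d) / n - (sum(d) / n) ** 2)

marks = [20, 10, 18, 25, 32, 10]
sd_actual = sd_from_actual_mean(marks)
sd_assumed = sd_from_assumed_mean(marks, assumed_mean=20)
```

Both routes give the same value, because shifting every observation by a constant does not change the spread; the assumed-mean method only simplifies hand arithmetic.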
Disadvantages
- It is difficult to compute.
- It assigns more weight to extreme items and less weight to
items that are nearer to the mean. This is because the squares of
large deviations are proportionately greater than the squares
of comparatively small deviations.
3.4 Summary
The concept of central tendency, which is identifying a single value that
can be used to represent an entire data set, together with where a distribution
of data lies and how the variability or differences among scores in a set
of data influence the shape and spread of the distribution, was
considered in unit 4.
This unit handles the basic property of dispersion as a value that
indicates the extent to which all other values are dispersed about the
central value in a particular distribution.
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Set
1.3.2 Method of Representation a Set
1.3.3 Set Theory
1.3.4 Types of set
1.4 Some Laws of Sets of Algebra
1.5 Summary
1.6 References/Further Reading/Web Resources
1.1 Introduction
STT 205 MODULE 4
1.3.1 Set
A set may be defined by listing the members in brace brackets e.g. the
set of positive even numbers less than 10 is denoted by the symbol {2, 4,
6, 8}. The set of all positive even integers is denoted by {2, 4, 6........}.
Equal set
Two sets A and B are said to be equal or identical if they consist of
exactly the same elements, irrespective of how many times elements
are repeated, and we write
A = B. Thus the set A = {2, 4, 6, 8} is equal to B = {2, 2, 4, 6, 6, 8}.
Also, A = {a, b, c, d} is equal to
B = {a, d, c, b, a, d, c}; it doesn’t matter if the elements are repeated.
Super Set
For any two sets A and B, A is said to be a subset of B, written
$A \subseteq B$, or B is a super set of A, written $B \supseteq A$, if and only if all
elements of A are also elements of B. The empty set is written $\emptyset = \{\ \}$.
Proper Subset
A set A is said to be a proper subset of B if A is a subset
of B and A is not equal to B, that is, A is a subset of B but B contains at
least one element which does not belong to A. If $A \subseteq B$
and there exists at least one element of B which is not contained
in A, i.e. $B \nsubseteq A$, then we say A is a proper subset of B and we
write $A \subset B$.
Improper Subset
Set A is called an improper subset of B if and only if A = B.
Every set is an improper subset of itself. If $A \subseteq B$
and every element in B is also contained in A, that is, $B \subseteq A$,
then A is called an improper subset of B and we write $A \subseteq B$.
In this case
A = B.
Universal Set
In all applications of set theory, when sets of elements are
discussed we can have a larger encompassing set which contains
all sets under discussion, i.e. all the sets will be subsets of this
larger set. For example, if the sets under discussion are the students
offering courses A, B and C, a larger set could be the set of all students
in the school. Such a set is called the universal set U, and it varies from
one application to another. It can also be defined as any set which is a
superset of all sets under consideration.
Union of Sets
Given two or more sets A, B, ..., the union of the sets is
defined as the set of elements found in A or B or ..., i.e.
the set of all elements that can be found in any of the sets, and we
write
$A \cup B \cup \dots = \{x : x \in A \text{ or } x \in B \text{ or } x \in \dots\}$.
It combines the elements of the given sets without repetition.
Intersection of Sets
Given two or more sets A, B, ..., the intersection of the sets is the
set of all elements found in A, B and ..., that is, the set of
elements belonging to every one of those sets, and we write:
$A \cap B \cap \dots = \{x : x \in A, x \in B, x \in \dots\}$
For example, given that A = {1, 2, 3, 4, 5} and B = {2, 4, 6, 8},
then $A \cap B$ = {2, 4}.
It involves sorting out the common elements in the given sets.
Complement
Given a universal set U, and a set A which is a subset of the
universal set, the complement of the set A with respect to U,
written as A′ or A^c, is the set of all elements not found in A but
in the universal set, that is, $A' = \{x : x \notin A, x \in U\}$.
Product Set
Show that $A \times B \neq B \times A$, where A = {1, 2} and B = {a, b}.
Here $A \times B$ = {(1, a), (1, b), (2, a), (2, b)} and
$B \times A$ = {(a, 1), (b, 1), (a, 2), (b, 2)}.
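The set operations defined above map directly onto Python's built-in `set` type. In this sketch the sets A and B are the ones from the intersection example; the universal set U (the integers 1 to 10) is an assumption made only so that a complement can be shown.

```python
from itertools import product

U = set(range(1, 11))          # assumed universal set, for illustration only
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8}

union = A | B                  # elements in A or B, without repetition
intersection = A & B           # elements common to A and B
complement_A = U - A           # elements of U that are not in A
difference = A - B             # elements of A that are not in B

# Repeated elements do not create a different set
equal_sets = {2, 4, 6, 8} == set([2, 2, 4, 6, 6, 8])

# Product sets are sets of ordered pairs, so the order of the factors matters
P, Q = {1, 2}, {"a", "b"}
PxQ = set(product(P, Q))
QxP = set(product(Q, P))
```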
Partition of Sets
This means separating the elements of the set into subsets, each written in
its own braces, depending on your objective(s).
Example: to separate the six even numbers of A = {1, 2, 3,
4, 5, …, 13} as unit sets, we have {2}, {4}, {6}, {8}, {10}, {12}.
Case Studies
We can demonstrate, the concept of subsets, intersection, union,
complement, disjoint, and difference of set in Venn diagrams as follows:
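[Figure: Venn diagrams illustrating a subset, the union $A \cup B$, the intersection $A \cap B$, the complement A′ within the universal set U, disjoint sets ($A \cap B = \emptyset$) and the difference A − B]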
Case Study 2
If A = {a, b}, the subsets are { }, {a}, {b}, {a, b}. Therefore the number
of subsets is 4. If A = {a, b, c},
all the subsets of A are:
{a}, {b}, {c}, {a, b}, {b, c}, {a, c}, {a, b, c} and { }.
We thus set up a table as follows
Therefore, for any set having n distinct elements, the number
of subsets it has is $2^n$. The set of all these subsets is called the power set.
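The count of subsets can be confirmed by listing them. The following sketch builds every subset of a small set by taking combinations of each possible size; `power_set` is a hypothetical helper name.

```python
from itertools import chain, combinations

def power_set(elements):
    """Return all subsets of the given elements, from the empty set up to the full set."""
    items = list(elements)
    all_sizes = (combinations(items, r) for r in range(len(items) + 1))
    return [set(c) for c in chain.from_iterable(all_sizes)]

subsets_of_two = power_set(["a", "b"])
subsets_of_three = power_set(["a", "b", "c"])
```

For n = 2 this yields 4 subsets and for n = 3 it yields 8, in line with the rule that a set of n distinct elements has 2 to the power n subsets.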
SELF-ASSESSMENT EXERCISE(S)
1.4 Summary
This unit covers set theory: methods of representing a set, types of sets,
combining the elements of given sets without repetition (union), sorting
out the common elements of given sets (intersection), and some laws of
the algebra of sets.
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Fundamental Principles of Counting Techniques
2.3.2 Axioms of Probability
2.3.3 Partition of Sample Space
2.3.4 Bayes Theorem
2.3.5 Probability Density Functions
2.3.6 Cumulative Density Function
2.3.7 Step Function Graph
2.4 Summary
2.5 References/Further Reading/Web Resources
2.1 Introduction
Case Study 1
1. Assuming there are two ways of travelling from Nsukka to Enugu
and subsequently three ways of travelling from Enugu to Aba,
then there are total of 6 ways of travelling from Nsukka to Aba.
2. If you have 3 shirts and 4 pants, this gives 3×4=12 different
outfits.
3. If there are 6 flavors of ice-cream and 3 different cones, there
will be 6×3=18 different single-scoop ice-creams you could
order.
Multiplicative Rule
If a procedure can be performed in n1 ways, and if following this
procedure a second one can be performed in n2 ways, and so on,
then the number of ways the procedures can be performed in the
order indicated is the product n1 × n2 × ….
Case Study 2
The Multiplication Principle applies when we are making more than one
selection. Suppose we are choosing an appetizer, an entrée, and a
dessert. If there are 2 appetizer options, 3 entrée options, and 2 dessert
options on a fixed-price dinner menu, there are a total of 12 possible
choices of one each as shown in the tree diagram below;
Diane packed 2 skirts, 4 blouses, and a sweater for her business trip. She
will need to choose a skirt and a blouse for each outfit and decide
whether to wear the sweater. Use the Multiplication Principle to find the
total number of possible outfits.
Solution
To find the total number of outfits, find the product of the number of
skirt options, the number of blouse options, and the number of sweater
options (wear it or not): 2 × 4 × 2 = 16 possible outfits.
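The Multiplication Principle can be verified by enumerating every possible selection. This sketch covers the fixed-price menu (2 appetizers, 3 entrées, 2 desserts) and Diane's outfits; the item labels are placeholders invented for the illustration.

```python
from itertools import product
from math import prod

# Fixed-price dinner menu: 2 appetizers, 3 entrees, 2 desserts
appetizers = ["appetizer 1", "appetizer 2"]
entrees = ["entree 1", "entree 2", "entree 3"]
desserts = ["dessert 1", "dessert 2"]
menus = list(product(appetizers, entrees, desserts))

# Diane's outfits: 2 skirts, 4 blouses, sweater worn or not worn
skirts = ["skirt 1", "skirt 2"]
blouses = ["blouse 1", "blouse 2", "blouse 3", "blouse 4"]
sweater = ["with sweater", "without sweater"]
outfits = list(product(skirts, blouses, sweater))

# The counts agree with the products n1 x n2 x n3
menu_count = prod([len(appetizers), len(entrees), len(desserts)])
outfit_count = prod([len(skirts), len(blouses), len(sweater)])
```

Each tuple produced by `product` corresponds to one path through a tree diagram, so the length of the list is the number of leaves of the tree.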
Additive Rule-
The additive principle states that if event A can occur in m ways,
and event B can occur in n disjoint ways, then the event “A or B”
can occur in m+n ways. Likewise, if one procedure can be done in m1
ways and another in m2 ways, then the number of ways one procedure
or the other can be done is (m1+m2). It is
important that the events be disjoint, i.e., that there is no way for
A and B to both happen at the same time.
Case Studies 5:
Assume that a bag contains 12 white and 8 black balls.
(a) What is the probability of picking a white ball, given that picking a white
ball excludes picking a black one (the events are mutually exclusive)?
(b) What is the probability of picking a black ball?
Solution;
a) P(A) = 12/20
b) P(B) = 8/20
Bayes' theorem states that
\[ P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} \qquad (3.1) \]
where;
P(A) is the probability of event A
P(B) is the probability of event B
P(A|B) is the probability of observing event A if B is true
P(B|A) is the probability of observing event B if A is true.
Case Studies 6;
Table 3.1 shows the number of persons who were reported dead in a
certain community during 2020, classified by sex and age.
Table 3.1: Dead in a certain community during 2021 COVID 19
pandemic
Age/sex    A1 (<50 yrs)    A2 (50-79)    A3 (80+)    Total
Male       27              79            22          128
Female     18              23            20          61
Total      45              102           42          189
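Solution
Let M denote the event that the deceased is male. By Bayes' theorem, the
probability that a deceased male belonged to age group A2 is
\[ P(A_2|M) = \frac{P(A_2)\,P(M|A_2)}{\sum_{i} P(A_i)\,P(M|A_i)} = \frac{\frac{102}{189}\times\frac{79}{102}}{\frac{45}{189}\times\frac{27}{45} + \frac{102}{189}\times\frac{79}{102} + \frac{42}{189}\times\frac{22}{42}} \]
\[ = \frac{\frac{79}{189}}{\frac{27 + 79 + 22}{189}} = \frac{79}{189}\times\frac{189}{128} = \frac{79}{128} \]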
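The Bayes computation for Table 3.1 can be checked with exact fractions. This sketch rests on two assumptions that should be read with care: that the question asks for the probability that a deceased male was in age group A2, and that the male count for the 80+ group is 22 (the value that makes the row total 128 and the column total 42 agree). The function name is hypothetical.

```python
from fractions import Fraction

# Counts by age group and sex (male 80+ taken as 22 so that the totals agree)
deaths = {
    "A1": {"male": 27, "female": 18},
    "A2": {"male": 79, "female": 23},
    "A3": {"male": 22, "female": 20},
}
total = sum(sum(group.values()) for group in deaths.values())

def posterior(age_group, sex):
    """P(age group | sex) via Bayes' theorem with priors P(A_i) and likelihoods P(sex | A_i)."""
    prior = {k: Fraction(sum(v.values()), total) for k, v in deaths.items()}
    likelihood = {k: Fraction(v[sex], sum(v.values())) for k, v in deaths.items()}
    evidence = sum(prior[k] * likelihood[k] for k in deaths)
    return prior[age_group] * likelihood[age_group] / evidence

p_a2_given_male = posterior("A2", "male")
```

The result equals the direct ratio of the A2 male count to the male row total, which is what Bayes' theorem reduces to when the full table is available.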
From the table above, we can also see that given a positive test (subjects
in the Test + row), the probability of disease is 99/198 = 0.50 = 50%.
https://sphweb.bumc.bu.edu/otlt/mph-
modules/bs/bs704_probability/BS704_Probability6.html
Case Studies 8:
Suppose a patient exhibits symptoms that make her physician concerned
that she may have a particular disease. The disease is relatively rare in
this population, with a prevalence of 0.2% (meaning it affects 2 out of
every 1,000 persons). The physician recommends a screening test that
costs $250 and requires a blood sample. Before agreeing to the
screening test, the patient wants to know what will be learned from the
test, specifically she wants to know the probability of disease, given a
positive test result, i.e., P(Disease | Screen Positive).
The physician reports that the screening test is widely used and has a
reported sensitivity of 85%. In addition, the test comes back positive 8%
of the time and negative 92% of the time.
information this test would produce the results summarized in the table
12.2;
Another important question that the patient might ask is, what is the
chance of a false positive result? Specifically, what is P(Screen Positive|
No Disease)? We can compute this conditional probability with the
available information using Bayes Theorem.
Complementary Events
Note that if P(Disease) = 0.002, then P(No Disease)=1-0.002. The
events, Disease and No Disease, are called complementary events. The
"No Disease" group includes all members of the population not in the
"Disease" group. The sum of the probabilities of complementary events
must equal 1 (i.e., P(Disease) + P(No Disease) = 1). Similarly, P(No
Disease | Screen Positive) + P(Disease | Screen Positive) = 1.
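Both questions the patient asks can be answered from the three figures quoted in the case study (prevalence 0.2%, sensitivity 85%, 8% of tests positive). The sketch below applies Bayes' theorem and the complement rule; the variable names are illustrative.

```python
prevalence = 0.002        # P(Disease)
sensitivity = 0.85        # P(Screen Positive | Disease)
p_positive = 0.08         # P(Screen Positive)

# Bayes' theorem: P(Disease | Screen Positive)
p_disease_given_positive = sensitivity * prevalence / p_positive

# Complementary events
p_no_disease = 1 - prevalence
p_no_disease_given_positive = 1 - p_disease_given_positive

# Bayes' theorem again: P(Screen Positive | No Disease), the false positive chance
p_false_positive = p_no_disease_given_positive * p_positive / p_no_disease
```

Under these figures a positive screen raises the probability of disease from 0.2% to roughly 2%, and close to 8% of disease-free people would still screen positive.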
Case Studies 9:
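The entries of a coin tossed twice
X       0      1      2      Total
F       1      2      1      4
f(x)    1/4    2/4    1/4    1
F(x)    1/4    3/4    4/4
Each frequency is a binomial coefficient, for example
\[ \binom{n}{x} = \binom{2}{0} = \frac{2!}{(2-0)!\,0!} = 1 \]
Recall that $F(x) = P(X \le x)$.
\[ F(0) = P(X \le 0) = \frac{1}{4} \]
\[ F(1) = P(X \le 1) = P(X = 0, 1) = \frac{1}{4} + \frac{2}{4} = \frac{3}{4} \]
\[ F(2) = P(X \le 2) = P(X = 0, 1, 2) = \frac{1}{4} + \frac{2}{4} + \frac{1}{4} = \frac{4}{4} = 1 \]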
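The probability function f(x) and the cumulative function F(x) for the two-toss example can be generated rather than tabulated by hand. The sketch assumes a fair coin and takes X as the number of heads; both are assumptions, since the text does not state them.

```python
from fractions import Fraction
from math import comb

n = 2  # number of tosses

# f(x) = C(n, x) / 2^n for a fair coin
pmf = {x: Fraction(comb(n, x), 2 ** n) for x in range(n + 1)}

# F(x) = P(X <= x), the running total of f(x)
cdf = {}
running_total = Fraction(0)
for x in range(n + 1):
    running_total += pmf[x]
    cdf[x] = running_total
```

Plotting `cdf` against x gives the step function graph: the value jumps at each of x = 0, 1, 2 and stays flat in between.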
$^3C_0 = 1$; $^3C_1 = 3$; $^3C_3 = 1$
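Consider
\[ f(x) = \begin{cases} c x^2, & 0 \le x \le 3 \\ 0, & \text{elsewhere} \end{cases} \]
If f(x) is a pdf,
i) find c
ii) P(X < 0)
iii) P(0 ≤ X ≤ 1)
Solution
i) Since f(x) is a pdf, its total integral must equal 1:
\[ \int_0^3 f(x)\,dx = \int_0^3 c x^2\,dx = c\left[\frac{x^3}{3}\right]_0^3 = 9c = 1 \qquad \text{eq (1)} \]
so c = 1/9. Substituting c = 1/9 into eq (1):
\[ \int_0^3 \frac{x^2}{9}\,dx = \left[\frac{x^3}{3 \times 9}\right]_0^3 = \frac{27}{27} = 1, \]
so it is a pdf.
ii) P(0 ≤ X ≤ 1)
\[ \int_0^1 \frac{x^2}{9}\,dx = \left[\frac{x^3}{3 \times 9}\right]_0^1 = \frac{1}{27} - \frac{0}{27} = \frac{1}{27} \]
iii) P(2 ≤ X ≤ 3)
\[ \int_2^3 \frac{x^2}{9}\,dx = \left[\frac{x^3}{27}\right]_2^3 = \frac{3^3}{27} - \frac{2^3}{27} = \frac{27}{27} - \frac{8}{27} = \frac{19}{27} \]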
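The probabilities for this density can be checked with exact arithmetic by using its antiderivative. The sketch assumes the density is c·x² on 0 ≤ x ≤ 3 and zero elsewhere, as in the example; `cdf` and `prob` are hypothetical helper names.

```python
from fractions import Fraction

# The integral of x^2 from 0 to 3 is 3^3 / 3 = 9, so c must be 1/9
c = 1 / Fraction(3 ** 3, 3)

def cdf(x):
    """F(x) = c * x^3 / 3 on [0, 3]; 0 below the range and 1 above it."""
    x = Fraction(x)
    if x <= 0:
        return Fraction(0)
    if x >= 3:
        return Fraction(1)
    return c * x ** 3 / 3

def prob(a, b):
    """P(a <= X <= b) as a difference of cumulative values."""
    return cdf(b) - cdf(a)

total_area = prob(0, 3)
p_0_to_1 = prob(0, 1)
p_2_to_3 = prob(2, 3)
```

Because the density is zero below 0, the same function also gives P(X < 0) = 0.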
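Given $f(x) = 3e^{-3x}$, $x \ge 0$, check if it is a proper pdf.
Solution
To check if the function above is a proper pdf, integrate it over its range:
\[ \int_0^\infty 3e^{-3x}\,dx = \left[-e^{-3x}\right]_0^\infty = 0 - (-1) = 1 \]
Since $f(x) \ge 0$ and the total area is 1, it is a proper pdf.
Using $f(x) = 3e^{-3x}$, $x \ge 0$,
\[ P(X > 3) = \int_3^\infty 3e^{-3x}\,dx = \left[-e^{-3x}\right]_3^\infty = e^{-9} = 1.23 \times 10^{-4} \]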
SELF-ASSESSMENT EXERCISE(S)
2.4 Summary
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Permutation
3.3.2 Combination
3.4 Summary
3.5 References/Further Reading/Web Resources
3.1 Introduction
3.3.1 Permutation
Case Study 1
If you have a set containing 3 elements {a, b, c} and you are to arrange
2 objects from the 3 objects without replacing an element once it is
selected, and order matters, then you will have the following results:
(a, b), (a, c), (b, c), (b, a), (c, a), (c, b), which is equal to 6
permutations. This is called an ordered arrangement without
replacement.
i. It is denoted by $^nP_r$ or P(n, r) and evaluated in drawing without
replacement as
\[ ^nP_r = n(n-1)(n-2)\cdots(n-r+1) = \frac{n!}{(n-r)!} \qquad (3.1) \]
ii. If the arrangement is with replacement, then the number of ways is
given by
\[ n^r \qquad (3.2) \]
Case Study 2
Assuming you have a set containing 3 elements {a, b, c} and you are to
arrange 2 objects from the 3 objects, replacing each element after it is
selected so that a particular element can be selected more than once, and
order matters, then you will have the following results:
(a, b), (a, c), (b, c), (b, a), (c, a), (c, b), (a, a), (b, b), (c, c),
which is equal to 9 permutations. This is called an ordered
arrangement with replacement.
iii. If n objects are arranged taking all n at the same time, and
if we observe that n1 are alike, n2 are alike, and so on, then
$n_1 + n_2 + \dots + n_r = n$, and the number of arrangements is given by
\[ \frac{n!}{n_1!\,n_2!\cdots n_r!} \qquad (3.3) \]
Case Study 3
If an urn contains 10 balls, find the number of ordered arrangements
i. Of size 3 with replacement
ii. Of size 5 without replacement
Solution
i. n = 10 and r = 3; with replacement the number of ways is $n^r$:
\[ n^r = 10^3 = 1000 \]
ii. n = 10 and r = 5; without replacement, the required number of ways is
\[ ^nP_r = {}^{10}P_5 = \frac{10!}{5!} = 10 \times 9 \times 8 \times 7 \times 6 = 30240 \]
Case Study 4
Find the number of permutations that can be formed using the letters of the
words (i) queue and (ii) statistics
Solution:
i. Queue: n = 5, e = u = 2, q = 1
\[ \text{No. of ways} = \frac{n!}{n_1!\,n_2!\,n_3!} = \frac{5!}{2!\,2!\,1!} = 30 \]
ii. Statistics: n = 10, t = s = 3, i = 2, a = c = 1
\[ \text{No. of ways} = \frac{n!}{n_1!\,n_2!\,n_3!\,n_4!\,n_5!} = \frac{10!}{3! \times 3! \times 2! \times 1! \times 1!} = 50400 \]
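The three permutation counts used in the case studies (without replacement, with replacement, and with repeated letters) can be computed as follows. This is a minimal sketch; `arrangements_of_word` is a hypothetical helper that applies the formula n! / (n1! n2! ... nr!) to the letter counts of a word.

```python
from collections import Counter
from math import factorial, perm

def arrangements_of_word(word):
    """Distinct arrangements of all letters: n! divided by the factorial of each letter count."""
    counts = Counter(word.lower())
    result = factorial(len(word))
    for count in counts.values():
        result //= factorial(count)
    return result

# Urn with 10 balls
ordered_with_replacement = 10 ** 3          # size 3, with replacement: n^r
ordered_without_replacement = perm(10, 5)   # size 5, without replacement: n! / (n - r)!

# Words with repeated letters
queue_ways = arrangements_of_word("queue")
statistics_ways = arrangements_of_word("statistics")
```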
3.3.2 Combination
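i. It is denoted by
\[ ^nC_r \quad \text{or} \quad \binom{n}{r} \qquad (3.4) \]
ii. In sampling without replacement, the evaluation becomes
\[ \binom{n}{r} = \frac{n!}{r!\,(n-r)!} \qquad (3.5) \]
iii. In sampling with replacement, the number of unordered selections
becomes
\[ \binom{n+r-1}{r} = \frac{(n+r-1)!}{r!\,(n-1)!} = \binom{n+r-1}{n-1} \qquad (3.6) \]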
Case Study 5
Consider the four letters A,B,C,D. How many unordered selections are
possible taking 3 letters at a time if
i. Replacement is allowed?
ii. Replacement is not allowed?
iii. List samples in (i) and (ii)
Solution
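i. With replacement, the total number of unordered samples is
\[ \binom{n+r-1}{r}, \text{ but } n = 4 \text{ and } r = 3; \text{ therefore } \binom{n+r-1}{r} = \binom{6}{3} = 20 \]
ii. Without replacement,
\[ \binom{n}{r} = \binom{4}{3} = 4 \]
iii. The twenty samples in (i) are:
AAA AAB AAC AAD ABB
ABC ABD ACC ACD ADD
BBB BBC BBD BCC BCD
BDD CCC CCD CDD DDD
The four samples in (ii) are:
ABC ABD ACD BCD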
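The listings for the four letters A, B, C, D can be generated automatically, which also confirms the counts from the combination formulas. The sketch uses the standard library's combination generators; the variable names are illustrative.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

letters = "ABCD"
r = 3

# Unordered selections of 3 letters with replacement allowed
with_replacement = ["".join(s) for s in combinations_with_replacement(letters, r)]

# Unordered selections of 3 letters without replacement
without_replacement = ["".join(s) for s in combinations(letters, r)]

# Counts from the formulas C(n + r - 1, r) and C(n, r)
n = len(letters)
count_with = comb(n + r - 1, r)
count_without = comb(n, r)
```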
Case Study 6:
Repeat example 3.3 using 3 letters A, B and C for both (a) ordered and
(b) unordered selections of 2 letters out of the three.
Solution
Case Study 6
If you have a set containing 3 elements {a, b, c} and you are to select
2 objects from the 3 objects without replacing an element once it is
selected, and order does not matter, then you will have the following
results: {a, b}, {a, c}, {b, c}, which is equal to 3 combinations. This is
called an unordered selection without replacement.
SELF-ASSESSMENT EXERCISE(S)
3.4 Summary
Unit Structure
4.1 Introduction
4.2 Intended Learning Outcomes (ILOs)
4.3 Main Content
4.3.1 Introduction to probability
4.3.2 Unconditional Probability
4.3.3 Conditional Probability
4.3.4 Types of Probabilities
4.3.5 Probability Model
4.3.6 Reasons for using probability
4.4 Summary
4.5 References/Further Reading/Web Resources
4.1 Introduction
https://sphweb.bumc.bu.edu/otlt/mph-
modules/bs/bs704_probability/BS704_Probability3.html
                 Age (years)
         5     6     7     8     9     10    Total
Boys     432   379   501   410   420   418   2,560
Girls    408   513   412   436   461   500   2,730
Totals   840   892   913   846   881   918   5,290
Case Study 2
If we select a child at random (by simple random sampling), then each
child has the same probability (equal chance) of being selected, and the
probability is 1/N, where N=the population size. Thus, the probability
that any child is selected is 1/5,290 = 0.0002. In most sampling
situations we are generally not concerned with sampling a specific
individual but instead we concern ourselves with the probability of
sampling certain types of individuals. For example, what is the
probability of selecting a boy or a child 7 years of age?
Case Study 3
Each of the probabilities computed in the previous section (e.g., P(boy),
P(7 years of age)) is an unconditional probability, because the
denominator for each is the total population size (N=5,290) reflecting
the fact that everyone in the entire population is eligible to be selected.
However, sometimes it is of interest to focus on a particular subset of
the population (e.g., a sub-population).
For example, suppose we are interested just in the girls and ask the
question, what is the probability of selecting a 9 year old from the sub-
population of girls?
What is the probability of selecting a boy from among the 6 year olds?
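The difference between unconditional and conditional probability comes down to the choice of denominator. The sketch below uses the counts from the age-by-sex table; the variable names are illustrative.

```python
ages = [5, 6, 7, 8, 9, 10]
boys = dict(zip(ages, [432, 379, 501, 410, 420, 418]))
girls = dict(zip(ages, [408, 513, 412, 436, 461, 500]))

N = sum(boys.values()) + sum(girls.values())     # whole population

# Unconditional probabilities: the denominator is the whole population
p_boy = sum(boys.values()) / N
p_age_7 = (boys[7] + girls[7]) / N

# Conditional probabilities: the denominator is the sub-population
p_age_9_given_girl = girls[9] / sum(girls.values())
p_boy_given_age_6 = boys[6] / (boys[6] + girls[6])
```

Restricting attention to the girls (2,730) or to the 6 year olds (892) shrinks the denominator, which is why the conditional values differ from the unconditional ones.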
https://sphweb.bumc.bu.edu/otlt/mph-
modules/bs/bs704_probability/BS704_Probability3.html
SELF-ASSESSMENT EXERCISE(S)
4.4 Summary
STT 205 MODULE 5
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Normal Distribution
1.3.2 Skewed Distributions
1.3.3 Characteristics of Normal Distributions
1.3.4 Z Scores are Standardized Scores
1.3.5 The Standard Normal Distribution
1.3.6 Probabilities of the Standard Normal Distribution Z
1.3.7 Distribution of BMI and Standard Normal Distribution
1.3.8 Computing Percentiles
1.3.9 Evaluation of Probabilities for a Normal Distribution
1.3.10 Students t- Distribution
1.3.11 Fitting a Normal Curve to a Data
1.3.12 Difference between z test and t test
1.3.13 Difference between the t- distribution and the normal
distribution?
1.3.14 Application of t-distribution
1.4 Summary
1.5 References/Further Reading/Web Resources
1.1 Introduction
Note that the horizontal or x-axis displays the scale of the characteristic
being analyzed (in this case weight), while the height of the curve
reflects the probability of observing each value. The fact that the curve
is highest in the middle suggests that the middle values have higher
probability or are more likely to occur, and the curve tails off above and
below the middle suggesting that values at either extreme are much less
likely to occur. There are different probability models for continuous
outcomes, and the appropriate model depends on the distribution of the
outcome of interest.
The mean (μ = 29) is in the center of the distribution, and the horizontal
axis is scaled in increments of the standard deviation (σ = 6) and the
distribution essentially ranges from μ - 3 σ to μ + 3σ. It is possible to
have BMI values below 11 or above 47, but extreme values occur very
infrequently.
Note that with the normal distribution the probability of having any
exact value is 0 because there is no area at an exact BMI value, so in this
case, the probability that his BMI = 29 is 0, but the probability that his
BMI is < 29 or the probability that his BMI is > 29 is 50%.
What is the probability that a 60 year old man has BMI less than 35?
The probability is displayed graphically and represented by the area
under the curve to the left of the value 35 in the figure below.
Note that BMI = 35 is 1 standard deviation above the mean. For the
normal distribution we know that approximately 68% of the area under
the curve lies between the mean plus or minus one standard deviation.
Therefore, 68% of the area under the curve lies between 23 and 35. We
also know that the normal distribution is symmetric about the mean,
therefore P (29 < X < 35) = P (23 < X < 29) = 0.34. Consequently, P(X
< 35) = 0.5 + 0.34 = 0.84. In other words, 68% of the area is between 23
and 35, so 34% of the area is between 29 and 35, and 50% is below 29.
If the total area under the curve is 1, then the area below 35 is therefore,
0.50 + 0.34 = 0.84 or 84%.
Case Study 2
What is the probability that a 60 year old man has BMI less than 41?
[Hint: A BMI of 41 is 2 standard deviations above the mean.] Try to
figure this out on your own before looking at the answer.
Solution
It is easy to figure out the probabilities for values that are increments of
the standard deviation above or below the mean, but what if the value
isn't an exact multiple of the standard deviation? For example, suppose
we want to compute the probability that a randomly selected man has a
BMI less than 30 (which is the threshold for classifying someone as
obese).
Case Study 3
Assume that the body mass index (BMI) in a population of 60 year old
men is normally distributed with mean = 29 and standard
deviation = 6.
https://sphweb.bumc.bu.edu/otlt/mph-
modules/bs/bs704_probability/BS704_Probability8.html
To this point, we have been using "X" to denote the variable of interest
(e.g., X=BMI, X=height, X=weight). However, when using a standard
normal distribution, "Z" is used to denote a variable in the context of a
standard normal distribution. After standardization, the BMI=30
discussed on the previous page is shown below lying 0.16667 units
above the mean of 0 on the standard normal distribution on the right.
Since the area under the standard curve = 1, it is easy to obtain the
probabilities of specific observations. For any given Z-score we can
compute the area under the curve to the left of that Z-score. It is pertinent
to note that a "Z" score of 0.0 lists a probability of 0.50 or 50%, and a
"Z" score of 1, meaning one standard deviation above the mean, lists a
probability of 0.8413 or 84%. That is because one standard
deviation above and below the mean encompasses about 68% of the
area, so one standard deviation above the mean represents half of that, or
34%. So, the 50% below the mean plus the 34% above the mean gives
us 84%.
The statistical normal table is organized to provide the area under the
curve to the left of (less than) a specified value or "Z value". In this case,
because the mean is zero and the standard deviation is 1, the Z value is
the number of standard deviation units away from the mean, and the area
is the probability of observing a value less than that particular Z value.
Note also that the normal table shows probabilities to two decimal
places of Z. The units place and the first decimal place are shown in the
left hand column, and the second decimal place is displayed across the
top row.
But let's get back to the question about the probability that the BMI is
less than 30, that is, P(X<30). We can answer this question using the
standard normal distribution. The figures below show the distributions
of BMI for men aged 60 and the standard normal distribution side-by-
side.
The area under each curve is one but the scaling of the X axis is
different. Note, however, that the areas to the left of the dashed line are
the same. The BMI distribution ranges from 11 to 47, while the
standardized normal distribution, Z, ranges from -3 to 3. We want to
compute P(X < 30). To do this we can determine the Z value that
corresponds to X = 30 and then use the standard normal distribution
table above to find the probability or area under the curve. The
following formula converts an X value into a Z score, also called
a standardized score:
\[ Z = \frac{X - \mu}{\sigma} \]
where μ is the mean and σ is the standard deviation of X.
Case Study 4
In order to compute P(X < 30) we convert X = 30 to its corresponding
Z score (this is called standardizing):
\[ Z = \frac{30 - 29}{6} = 0.17 \]
Thus, P(X < 30) = P(Z < 0.17). We can then look up the corresponding
probability for this Z score from the standard normal distribution table,
which shows that P(X < 30) = P(Z < 0.17) = 0.5675. Thus, the
probability that a male aged 60 has BMI less than 30 is 56.75%.
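The table lookup can be cross-checked with the standard library's `NormalDist`. This sketch uses the BMI parameters from the case study (mean 29, standard deviation 6); note that the table value 0.5675 comes from rounding Z to 0.17, while the unrounded Z of about 0.1667 gives a slightly smaller probability.

```python
from statistics import NormalDist

bmi = NormalDist(mu=29, sigma=6)

# Standardize X = 30
z = (30 - bmi.mean) / bmi.stdev

# Probability using Z rounded to two decimals, as a printed table requires
p_table = NormalDist().cdf(round(z, 2))

# Probability without rounding Z
p_exact = bmi.cdf(30)

# One standard deviation above the mean
p_below_35 = bmi.cdf(35)
```

The small gap between the two values of P(X < 30) is purely a rounding effect of the two-decimal table, not a difference in method.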
Case Study 5
Using the same distribution for BMI, what is the probability that a male
aged 60 has BMI exceeding 35? In other words, what is P(X > 35)?
Again we standardize:
\[ Z = \frac{35 - 29}{6} = 1 \]
so P(X > 35) = P(Z > 1) = 1 - 0.8413 = 0.1587.
Case Study 6
The mean BMI for men aged 60 is 29 with a standard deviation
of 6.
The mean BMI for women aged 60 is 28 with a
standard deviation of 7.
To compute the 90th percentile, given the formula X=μ + Zσ, and
standard normal distribution table, unlike the previous example on BMI,
computation will be done in the opposite direction. Previously we
started with a particular "X" and used the table to find the probability.
However, in this case we want to start with a 90% probability and find
the value of "X" that represents it.
When we go to the table, we find that the value 0.90 is not there exactly,
however, the values 0.8997 and 0.9015 are there and correspond to Z
values of 1.28 and 1.29, respectively (i.e., 89.97% of the area under the
standard normal curve is below 1.28). The exact Z value holding 90% of
the values below it, is 1.282 which was determined from a table of
standard normal probabilities with more precision.
Using Z=1.282 the 90th percentile of BMI for men is: X = 29 + 1.282(6)
= 36.69.
Case Study 7
What is the 90th percentile of BMI among women aged 60? Recall that
the mean BMI for women aged 60 is 28 with a standard
deviation of 7.
Solution
Using Z = 1.282, the 90th percentile of BMI for women is
X = 28 + 1.282(7) = 36.97.
The table below shows Z values for commonly used percentiles.
Percentile    Z
1st           -2.326
2.5th         -1.960
5th           -1.645
10th          -1.282
25th          -0.675
50th          0
75th          0.675
90th          1.282
95th          1.645
97.5th        1.960
99th          2.326
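Percentiles run the table lookup in reverse: start from a probability and recover the Z value, then apply X = μ + Zσ. The sketch below does this with `NormalDist.inv_cdf` for the BMI figures quoted for men and women aged 60; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()          # mean 0, standard deviation 1

# Z value with 90% of the area below it
z_90 = standard_normal.inv_cdf(0.90)

# X = mu + Z * sigma
men_p90 = 29 + z_90 * 6
women_p90 = 28 + z_90 * 7

# Z values for a few of the commonly used percentiles
z_table = {p: round(standard_normal.inv_cdf(p / 100), 3) for p in (1, 2.5, 5, 10, 25)}
```

By symmetry of the normal curve, the Z value for the 10th percentile is the negative of the Z value for the 90th.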
Case Study 8
If a child's weight for age is extremely low it might be an indication of
malnutrition.
1. For infant girls, the mean body length at 10 months is 72
centimeters with a standard deviation of 3 centimeters. Suppose a
girl of 10 months has a measured length of 67 centimeters. How
does her length compare to other girls of 10 months?
2. A complete blood count (CBC) is a commonly performed test.
One component of the CBC is the white blood cell (WBC) count,
which may be indicative of infection if the count is high. WBC
counts are approximately normally distributed in healthy people
with a mean of 7550 WBC per mm3 (i.e., per microliter) and a
standard deviation of 1085. What proportion of subjects have
WBC counts exceeding 9000?
3. Using the mean and standard deviation in the previous question,
what proportion of patients have WBC counts between 5000 and
7000?
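The three questions in Case Study 8 all follow the same standardize-then-look-up pattern. The sketch below is one way to check answers, assuming the normal model stated in the questions; it uses `statistics.NormalDist` in place of the printed table, so results may differ from table-based answers in the last decimal place.

```python
from statistics import NormalDist

# Question 1: body length of girls at 10 months, mean 72 cm, SD 3 cm.
girls = NormalDist(mu=72, sigma=3)
z_length = (67 - 72) / 3          # standardized length of the 67 cm girl
share_shorter = girls.cdf(67)     # proportion of girls shorter than 67 cm

# Questions 2 and 3: WBC counts, mean 7550, SD 1085.
wbc = NormalDist(mu=7550, sigma=1085)
p_above_9000 = 1 - wbc.cdf(9000)
p_5000_to_7000 = wbc.cdf(7000) - wbc.cdf(5000)

print(round(z_length, 2), round(share_shorter, 4))
print(round(p_above_9000, 4), round(p_5000_to_7000, 4))
```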
The table of areas under the normal curve shows the area between the
mean and a given number of standard deviations. Recall that if X is a
continuous random variable, X is said to be normally distributed with
parameters μ and σ², written X ~ N(μ, σ²), with pdf given as
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}} \]
where μ = population mean, σ = standard deviation, π ≈ 3.1416 and
e = the base of the natural logarithm.
The total area bounded by the normal curve and the X axis is 1. Hence, the
area under the curve between two ordinates X = a and X = b is denoted by
Pr(a < X < b). When the variable X is expressed in standardized form as
\[ Z = \frac{X-\mu}{\sigma} \]
the equation above is replaced by
\[ f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}} \]
In such a case, we say that Z is normally distributed with mean zero and
variance 1, that is, Z ~ N(0,1), where Z is the standardized form of X.
Since the pdf of the standard normal distribution is symmetric with
respect to the point Z = 0, it follows that
\[ P(Z \le -z) = P(Z \ge z) \]
for any real number z (−∞ < z < ∞). The mean of Z is zero, and the area
to the left of Z = 0 is 0.5, as is the area to the right.
Case Study 9:
i. p ( t Z S )
p (t Z S ) A(t ) A( S ) is shaded area.
ii. p( Z S )
iii. p( Z S )
p( Z S ) 0.5 A( S )
Solution
i. P(Z>-0.5)
P(Z>-0.5)=0.5+0.1915=0.6915
Solution
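The probability that a baby born has a weight of less than 3100 is
P(X < 3100). Standardizing with \( Z = \frac{X-\mu}{\sigma} \), where
μ = 3500 and σ = 500:
\[ P(X < 3100) = P\left(\frac{X-3500}{500} < \frac{3100-3500}{500}\right) = P\left(Z < \frac{-400}{500}\right) = P(Z < -0.8) \]
\[ P(Z < -0.8) = 0.5 - 0.2881 = 0.2119 \]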
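Both table look-ups above can be verified with a few lines of code. This is a sketch only, assuming the mean of 3500 and standard deviation of 500 that appear in the standardization step.

```python
from statistics import NormalDist

std_normal = NormalDist()               # Z ~ N(0, 1)

# P(Z > -0.5): area to the right of -0.5.
p_z_above = 1 - std_normal.cdf(-0.5)

# P(X < 3100) for birth weights with mean 3500 and SD 500.
weights = NormalDist(mu=3500, sigma=500)
p_below_3100 = weights.cdf(3100)

print(round(p_z_above, 4), round(p_below_3100, 4))
```

The cumulative distribution function returns the area to the left of a value, so the "0.5 plus or minus a table area" step used with the printed table is not needed here.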
Case Study 11
The table below is a frequency distribution of heights, recorded to the
nearest inch, of 100 male students at XYZ University.
Height   No. of students (f)   X    fX     (X − X̄)   (X − X̄)²   f(X − X̄)²
60-62     5                    61    305   -6.45      41.603      208.02
63-65    18                    64   1152   -3.45      11.903      214.25
66-68    42                    67   2814   -0.45       0.203        8.53
69-71    27                    70   1890    2.55       6.503      175.58
72-74     8                    73    584    5.55      30.803      246.44
Total   100                         6745   -2.25                  852.82
Recall that
\[ Z = \frac{X - \bar{X}}{S} \]
where X̄ is the mean and S is the standard deviation:
\[ \bar{X} = \frac{\sum fX}{\sum f}, \qquad S = \sqrt{\frac{\sum f(X-\bar{X})^{2}}{\sum f}} \]
\[ \bar{X} = \frac{6745}{100} = 67.45 \]
\[ S = \sqrt{\frac{852.82}{100}} = \sqrt{8.5282} = 2.92 \]
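The grouped mean and standard deviation can be reproduced directly from the class midpoints and frequencies. The sketch below assumes the midpoints 61, 64, 67, 70 and 73 used in the worksheet; because it does not round the intermediate squares, the sum of squares differs slightly from the worksheet total while the standard deviation still rounds to 2.92.

```python
from math import sqrt

midpoints = [61, 64, 67, 70, 73]     # class marks X
freqs = [5, 18, 42, 27, 8]           # number of students f

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Population-style variance: sum of f * (X - mean)^2 divided by sum of f.
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints)) / n
sd = sqrt(variance)

print(n, round(mean, 2), round(sd, 2))
```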
Hint:
a. Expected frequencies = Area for each class × 100.
b. Use the Chi-Square goodness-of-fit test to determine the goodness of
fit of the data:
\[ \chi^{2} = \sum_{i=1}^{n} \frac{(O_i - e_i)^{2}}{e_i} \]
c. Hypothesis:
Null Hypothesis Ho – the fit is good for the data.
Alternative Hypothesis H1 – the fit is not good for the data.
d. If Chi-Square calculated is less than Chi-Square tabulated, we
accept Ho.
If Chi-Square calculated is greater than Chi-Square tabulated, we
accept H1.
\[ \chi^{2} = \frac{(5-4.13)^{2}}{4.13} + \frac{(18-20.68)^{2}}{20.68} + \frac{(42-38.92)^{2}}{38.92} + \frac{(27-27.71)^{2}}{27.71} + \frac{(8-7.43)^{2}}{7.43} \]
\[ = 0.183 + 0.347 + 0.244 + 0.018 + 0.044 = 0.836; \qquad \chi^{2}_{cal} = 0.836 \]
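The chi-square sum above is easy to mistype by hand, so a short check is useful. This sketch takes the observed and expected frequencies exactly as they appear in the calculation; the list names are illustrative.

```python
observed = [5, 18, 42, 27, 8]
expected = [4.13, 20.68, 38.92, 27.71, 7.43]

# One (O - e)^2 / e term per height class.
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(terms)

print([round(t, 3) for t in terms], round(chi_sq, 3))
```

The resulting statistic is then compared with the tabulated chi-square value, as described in step d of the hint.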
SELF-ASSESSMENT EXERCISE(S)
1.4 Summary
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Definition of Binomial distribution
2.3.2 The four requirements
2.3.3 Properties of a binomial experiment
2.3.4 Difference between binomial distribution and Poisson
distribution
2.3.5 Use of the binomial distribution requires three
assumptions
2.3.6 Computing the Probability of a Range of Outcomes
2.3.7 Mean and Standard Deviation of a Binomial Population
2.4 Summary
2.5 References/Further Reading/Web Resources
2.1 Introduction
https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability7.html
A multinomial probability model might be appropriate, but here we
focus on the situation in which the outcome is dichotomous.
Case Study 1
Adults with allergies might report relief with medication or not, children
with a bacterial infection might respond to antibiotic therapy or not,
adults who suffer a myocardial infarction might survive the heart attack
or not, a medical device such as a coronary stent might be successfully
implanted or not. These are just a few examples of applications or
processes in which the outcome of interest has two possible values (i.e.,
it is dichotomous). The two outcomes are often labelled "success" and
"failure" with success indicating the presence of the outcome of interest.
Note, however, that for many medical and public health questions the
outcome or event of interest is the occurrence of disease, which is
obviously not really a success. Nevertheless, this terminology is
typically used when discussing the binomial distribution model. As a
result, whenever using the binomial distribution, we must clearly specify
which outcome is the "success" and which is the "failure".
Suppose that 80% of adults with allergies report symptomatic relief with
a specific medication. If the medication is given to 10 new patients with
allergies, what is the probability that it is effective in exactly seven
patients?
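First, do we satisfy the three assumptions of the binomial distribution
model?
- The outcome is relief from symptoms (yes or no), and here we
  will call a reported relief from symptoms a 'success.'
- The probability of success for each person is 0.8.
- The final assumption is that the replications are independent, and
  it is reasonable to assume that this is true.
We know that:
- the number of observations is n = 10
- the number of successes or events of interest is x = 7
- the probability of success on a single observation is p = 0.80
The probability of exactly 7 successes is
\[ P(7 \text{ successes}) = \frac{10!}{7!\,(10-7)!}\,(0.80)^{7}(1-0.80)^{10-7} \]
But many of the terms in the numerator and denominator cancel each
other out, so this reduces to
\[ P(7 \text{ successes}) = \frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1}\,(0.80)^{7}(0.20)^{3} = 120(0.2097)(0.008) = 0.2013 \]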
Binomial probabilities like this can also be computed in an Excel
spreadsheet using the BINOMDIST function. Place the cursor into an
empty cell and enter the following formula:
=BINOMDIST(x, n, p, FALSE)
where x = number of 'successes', n = number of replications or
observations, and p = probability of success on a single observation.
What is the probability that none report relief? We can again use the
binomial distribution model with n=10, x=0 and p=0.80.
\[ P(0 \text{ successes}) = \frac{10!}{0!\,(10-0)!}\,(0.80)^{0}(1-0.80)^{10} \]
This is equivalent to (1)(1)(0.20)^10, which simplifies to 0.0000001024.
What is the most likely number of patients who will report relief out of
10? If 80% report relief and we consider 10 patients, we would expect
that 8 report relief. What is the probability that exactly 8 of 10 report
relief? We can use the same method that was used above to demonstrate
that there is a 30.20% probability that exactly 8 of 10 patients will report
relief from symptoms when the probability that any one reports relief is
80%. The probability that exactly 8 report relief will be the highest
probability of all possible outcomes (0 through 10).
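The allergy-relief probabilities can be reproduced with a direct implementation of the binomial formula. This is a minimal sketch using only `math.comb`; the helper name `binomial_pmf` is an assumption for illustration and plays the same role as the spreadsheet function mentioned earlier.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.80
p7 = binomial_pmf(7, n, p)    # exactly seven patients report relief
p0 = binomial_pmf(0, n, p)    # no patient reports relief
p8 = binomial_pmf(8, n, p)    # exactly eight patients report relief

# Outcome with the highest probability among 0, 1, ..., 10.
most_likely = max(range(n + 1), key=lambda x: binomial_pmf(x, n, p))

print(round(p7, 4), p0, round(p8, 4), most_likely)
```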
The likelihood that a patient with a heart attack dies of the attack is 0.04
(i.e., 4 of 100 die of the attack). Suppose we have 5 patients who suffer a
heart attack, what is the probability that all will survive? For this
example, we will call a success a fatal attack (p = 0.04). We have n=5
patients and want to know the probability that all survive or, in other
words, that none are fatal (0 successes).
There is an 81.54% probability that all patients will survive the attack
when the probability that any one dies is 4%. In this example, the
possible outcomes are 0, 1, 2, 3, 4 or 5 successes (fatalities). Because the
probability of fatality is so low, the most likely response is 0 (all patients
survive). The binomial formula generates the probability of observing
exactly x successes out of n.
What is the probability that 2 or more of 5 die from the attack? Here we
want to compute P(2 or more successes). The possible outcomes are 0,
1, 2, 3, 4 or 5, and the sum of the probabilities of each of these outcomes
is 1 (i.e., we are certain to observe either 0, 1, 2, 3, 4 or 5 successes). We
just computed P(0 or 1 successes) = 0.9852, so P(2, 3, 4 or 5 successes)
= 1 - P(0 or 1 successes) = 0.0148. There is a 1.48% probability that 2 or
more of 5 will die from the attack.
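A range of outcomes is handled by adding the individual binomial probabilities, or by subtracting from 1. The sketch below works through the heart-attack example with p = 0.04 and n = 5; the helper name is hypothetical, and exact arithmetic gives a complement close to 0.0148.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 5, 0.04                      # a "success" is a fatal attack

p_all_survive = binomial_pmf(0, n, p)
p_zero_or_one = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)
p_two_or_more = 1 - p_zero_or_one   # complement rule

print(round(p_all_survive, 4), round(p_zero_or_one, 4), round(p_two_or_more, 4))
```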
Case Study 4
https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability7.htm
SELF-ASSESSMENT EXERCISE(S)
2.4 Summary
https://sphweb.bumc.bu.edu/otlt/mph-modules/bs/bs704_probability/bs704_probability7.htm
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Definition of Poisson
3.3.1 Probability of events for a Poisson distribution
3.3.2 Characteristics of a Poisson distribution
3.3.3 The difference between Poisson and binomial distribution
3.3.4 Application of Poisson
3.3.5 How do you know if a distribution is Poisson
3.4 Geometric distribution
3.4.1 The criteria for a distribution to be geometric are
3.4.2 Difference between binomial and geometric distribution
3.4.3 Geometric probability formula
3.5 Hypergeometric distribution
3.5.1 Uses of hyper-geometric distribution
3.5.2 Conditions for use of hyper-geometric distribution
3.5.3 Fundamental difference between hyper-geometric and
Geometric distributions
3.6 Summary
3.7 References/Further Reading/Web Resources
3.1 Introduction
Case Study 1
Case Study 2
Case Study 3
Case Study 4
A certain basketball player has a 65% chance of making a free throw.
Assume all free throws are independent. What is the probability that he
makes his first free throw on the 3rd try?
Solution
a. \( P(X = 3) = (0.35)^{2}(0.65) = 0.0796 \), with p = 0.65 and q = 0.35
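The geometric probability used here, two misses followed by one make, can be written as a one-line function. This is a sketch under the independence assumption stated in the case study; `geometric_pmf` is an illustrative name.

```python
def geometric_pmf(x, p):
    """Probability that the first success occurs on trial x."""
    return (1 - p) ** (x - 1) * p

# Free-throw example: success probability 0.65, first make on the 3rd try.
p_third_try = geometric_pmf(3, 0.65)
print(round(p_third_try, 4))
```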
Case Study 5
How many tosses of a pair of fair dice are necessary to be 99% certain
that a double six will appear? Find c such that \( P(X \le c) \ge 0.99 \),
where x = 1, 2, 3, …
\[ f(x) = \frac{1}{36}\left(\frac{35}{36}\right)^{x-1} \]
\[ \sum_{x=1}^{c} \frac{1}{36}\left(\frac{35}{36}\right)^{x-1} \ge 0.99 \]
\[ 1 - \left(\frac{35}{36}\right)^{c} \ge 0.99 \]
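The inequality for c can be solved either by logarithms or by a simple search. The following sketch does both, assuming the probability of a double six on one toss is 1/36; the two approaches should agree on the smallest whole number of tosses.

```python
from math import ceil, log

p = 1 / 36        # probability of a double six on a single toss
target = 0.99     # required certainty

# Search: smallest c with P(X <= c) = 1 - (1 - p)^c >= target.
c = 1
while 1 - (1 - p) ** c < target:
    c += 1

# Closed form: c >= ln(1 - target) / ln(1 - p).
c_closed_form = ceil(log(1 - target) / log(1 - p))

print(c, c_closed_form)
```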
Case Study 6
A fast food chain puts a winning game piece on every fifth package of
French fries. Find the probability that you will win a prize
a. with your third purchase of French fries
b. with your third or fourth purchase of French fries.
Solutions:
a. x = 3: \( P(3) = (0.2)(0.8)^{3-1} = 0.128 \)
b. x = 3, 4: \( P(3 \text{ or } 4) = P(3) + P(4) = 0.128 + 0.102 = 0.230 \)
FORMULA
A random variable X follows the hypergeometric distribution if its
probability mass function (pmf) is given by
\[ P(X = k) = \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}} \qquad (3.6) \]
where
N is the population size,
K is the number of successes in the population,
n is the number of draws (i.e. quantity drawn in each trial),
k is the number of observed successes.
Case Study 7
Case Study 8
Case Study 9
A hat contains 5 green marbles and 9 blue marbles. Four marbles are drawn
randomly without replacement. Calculate the following probability:
a. The probability of getting 3 green marbles:
a = 5 green marbles, n = 14 marbles in total and r = 4 draws
\[ P(X = 3) = \frac{\binom{5}{3}\binom{9}{1}}{\binom{14}{4}} = \frac{(10)(9)}{1001} = 0.0899 \]
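The marble calculation is a direct application of the hypergeometric pmf. Here is a minimal sketch using `math.comb`; the function name and its argument order are assumptions made for this example.

```python
from math import comb

def hypergeom_pmf(k, population, successes, draws):
    """P(X = k) when drawing without replacement.

    population: total number of items
    successes:  number of 'success' items in the population
    draws:      number of items drawn
    """
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))

# 5 green and 9 blue marbles, 4 drawn, exactly 3 green.
p_three_green = hypergeom_pmf(3, 14, 5, 4)
print(round(p_three_green, 4))
```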
Self-Assessment Exercise(s)
3.4 Summary
https://en.wikipedia.org/wiki/Hypergeometric_distribution
https://en.wikipedia.org/wiki/Poisson_distribution
STT 205 MODULE 6
Unit 1 Estimation
Unit 2 Principle of Hypothesis Testing
Unit 3 Statistical Hypotheses’ Dimensions
UNIT 1 ESTIMATION
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Definition of Terms in Estimation:
1.3.2 Types of Estimation
1.3.2.1 Point Estimation
1.3.2.2 Interval Estimation:
1.3.2.3 Confidence Intervals
1.3.3 Method of Estimation
1.3.3.1 Least Square Estimation
1.3.3.2 Method of Moment
1.3.3.3 Maximum Likelihood Estimate
1.3.4 Criteria of Estimation
1.3.4.1 Consistency
1.3.4.2 Unbiasedness
1.3.4.3 Efficiency
1.3.4.4 Minimum Variance
1.3.4.5 Completeness
1.3.4.6 Sufficiency
1.4 Summary
1.5 References/Further Reading/Web Resources
1.1 Introduction
Case Study 1
If X̄ = 55 is the sample mean, then 55 is the estimate.
Estimation - Estimation involves using the sample data to get a value for
the unknown parameter. This implies that in estimation we use part of
the entire population to draw conclusions. It is a process of learning
about and determining the population parameter based on the model
fitted to the data. Estimation involves approximating the value of
an unknown parameter.
Point estimation, interval estimation and hypothesis testing are the three
main ways of learning about the population parameter from the sample
statistic.
Case Study 2
Interval Estimation
This is the process of obtaining an estimate of the population parameter
as lying within an interval bounded by L and U, which are functions of
the observed random variable. The probability that the parameter lies
between L and U, i.e. P[L ≤ θ ≤ U], where θ is the parameter, is
expressed in terms of a predetermined number, 1 – α, called the
confidence coefficient; α is the level of significance. L is called the
lower limit and U the upper limit. The interval (L, U) is called a
100(1 – α)% confidence interval.
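To make the limits L and U concrete, the sketch below builds a 95% confidence interval for a population mean when the population standard deviation is known. The sample mean of 55 echoes the point-estimation example; the standard deviation of 10 and the sample size of 100 are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 55       # sample mean (point estimate)
sigma = 10       # hypothetical known population standard deviation
n = 100          # hypothetical sample size
alpha = 0.05     # level of significance, so 1 - alpha = 0.95

# Two-sided interval: alpha / 2 in each tail.
z = NormalDist().inv_cdf(1 - alpha / 2)
margin = z * sigma / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(round(lower, 2), round(upper, 2))
```

A smaller α widens the interval, since a higher confidence coefficient requires a larger Z value.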
For instance,
n
x i
1.3.4.1 Consistency
1.3.4.2 Unbiasedness
Case Study 3:
E(X̅ ) = μ ⟹ X̅ is unbiased for µ, where X̅ is an estimator of µ.
1.3.4.3 Efficiency
1.3.4.5: Completeness
SELF-ASSESSMENT EXERCISE(S)
1.4 Summary
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Test of Significance
2.3.2 Statistical Hypothesis
2.3.3 Level of Significance
2.4 Interpretation of a Test
2.5 The Power of a Test
2.6 Critical and Acceptance Regions
2.7 Procedure for Testing Hypothesis
2.8 Summary
2.9 References/Further Reading/Web Resources
2.1 Introduction
Case Study 1
Case Study 4:
H1: π ≠ πo or H1: π > πo. In each case the parameter µ or π can assume
an infinite number of values different from µo and πo.
2.3.4 Interpretation of α
If α = 0.05 in a test of hypothesis, this means that in about 5 out of
every 100 cases in which the null hypothesis is true we shall reject it;
equivalently, we are 95 percent confident of not rejecting a true null
hypothesis.
SELF-ASSESSMENT EXERCISE(S)
2.4 Summary
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
Unit Structure
3.1 Introduction
3.2 Intended Learning Outcomes (ILOs)
3.3 Main Content
3.3.1 Branches of Hypotheses
3.3.1.1 Simple Test of Hypotheses
3.3.1.2 Composite Test of Hypotheses
3.3.2 Test of Hypotheses for Small and Large Samples
3.3.2.1 Large Sample Test for One Population Mean
3.3.2.2 Large Sample for Two Populations Mean
3.3.3 Population Proportion
3.3.3.1 Large Sample Test for One Proportion
3.3.3.2 Large Sample Test between Two Population
Proportions
3.4 Hypothesis Testing in Small Sample
3.5 Summary
3.6 References/Further Reading/Web Resources
3.1 Introduction
Based on the sample data, the test determines whether to reject the null
hypothesis. You use a p-value to make the determination. If the p-value
is less than or equal to the level of significance, which is a cut-off
point that you define, then you can reject the null hypothesis.
When a set contains more than one parameter value, then the hypothesis
is called a composite hypothesis, because it involves more than one
model.
Case Study 1:
The teachers' union would like to establish that the average salary of
high school teachers in a particular state is less than $32,500. A random
sample of 100 public school teachers in the state has a mean salary of
$31,578. It is known from past history that the standard deviation of
teachers' salaries in the state is $4,415. Test the union's claim at the
5% level of significance.
Solution
H0: µ = 32,500
H1: µ < 32,500
n = 100, x̄ = 31,578, σ = 4,415
α = 5% = 0.05
The test statistic is
\[ Z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{31578 - 32500}{4415/\sqrt{100}} = \frac{-922}{441.5} = -2.0883 \]
Decision rule: For a specified value of α = 0.05, reject H0 if the test value
Zcal < -Ztab, i.e. if -2.0883 < -1.645.
Conclusion: Since Zcal = -2.0883 < -Ztab = -1.645, we reject the null
hypothesis at α = 0.05. Therefore, there is sufficient evidence to support
the union's claim that the average salary is less than $32,500.
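The left-tailed Z test in this case study can be reproduced step by step. The sketch below uses the figures from the problem and obtains the tabulated value from `statistics.NormalDist` instead of the printed table; the p-value line is an addition that ties the result to the p-value rule described earlier in the unit.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 32500      # hypothesized mean salary
x_bar = 31578     # sample mean
sigma = 4415      # known population standard deviation
n = 100
alpha = 0.05

z_cal = (x_bar - mu_0) / (sigma / sqrt(n))
z_tab = NormalDist().inv_cdf(1 - alpha)     # about 1.645

# Left-tailed test: reject H0 when z_cal falls below -z_tab.
reject_h0 = z_cal < -z_tab
p_value = NormalDist().cdf(z_cal)           # area to the left of z_cal

print(round(z_cal, 4), round(z_tab, 3), reject_h0, round(p_value, 4))
```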
H0: µ1 = µ2
H1: µ1 > µ2
Test statistic,
\[ Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^{2}}{n_1} + \dfrac{\sigma_2^{2}}{n_2}}} \]
Decision rule: For a specified value of α, reject H0 if the computed test
value, Zcal > +Ztab
The test statistic:
\[ Z = \frac{x - np_0}{\sqrt{np_0(1-p_0)}} \]
Decision rule for left tail
For a specified α, reject the null hypothesis if the computed
statistic value Z is less than \( -Z_{\alpha} \) (Z tabulated), i.e. \( Z_{cal} < -Z_{tab} \)
c) Two tailed
\( H_0 : p = p_0 \)
\( H_1 : p \ne p_0 \)
The test statistic:
\[ Z = \frac{x - np_0}{\sqrt{np_0(1-p_0)}} \]
Decision rule
For a specified α, reject the null hypothesis if the computed
statistic value Z is less than \( -Z_{\alpha/2} \), i.e. \( Z_{cal} < -Z_{\alpha/2} \), or if it is
greater than \( Z_{\alpha/2} \), i.e. \( Z_{cal} > Z_{\alpha/2} \)
Case Study 2
Your teacher claims that 60% of Nigerian males are married. You feel
that the proportion is higher. In a random sample of 100 Nigerian males,
65 of them are married. Test your teacher's claim at the 5% level of
significance.
Solution
Here:
p0 = 60% = 0.6, x = 65, n = 100
H0: p = 0.6
H1: p > 0.6
α = 5% = 5/100 = 0.05
\[ Z = \frac{x - np_0}{\sqrt{np_0(1-p_0)}} = \frac{65 - 100(0.6)}{\sqrt{100(0.6)(1-0.6)}} = \frac{65-60}{\sqrt{60(0.4)}} = \frac{5}{\sqrt{24}} = \frac{5}{4.8990} = 1.0206 \]
\( Z_{tab} = Z_{0.5-0.05} = Z_{0.45} \)
From the table, \( Z_{0.45} = 1.65 \)
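The comparison of the calculated and tabulated values for this right-tailed proportion test can be sketched as follows. The tabulated value is taken from `statistics.NormalDist`, which gives about 1.645 where the printed table is read as 1.65; the decision is the same either way.

```python
from math import sqrt
from statistics import NormalDist

p_0 = 0.6        # proportion claimed by the teacher
x, n = 65, 100   # married males in the sample, sample size
alpha = 0.05

z_cal = (x - n * p_0) / sqrt(n * p_0 * (1 - p_0))
z_tab = NormalDist().inv_cdf(1 - alpha)

# Right-tailed test: reject H0 only when z_cal exceeds z_tab.
reject_h0 = z_cal > z_tab

print(round(z_cal, 4), round(z_tab, 3), reject_h0)
```

With these figures the calculated value falls short of the tabulated value, so the sample does not give sufficient evidence at the 5% level that the proportion exceeds 60%.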
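2) Two tailed
H0: P1 = P2
H1: P1 ≠ P2
Test statistic,
\[ Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{P(1-P)\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]
Decision rule: For a specified value of α, reject H0 if the computed test
value \( Z_{cal} > +Z_{tab} = Z_{\alpha/2} \) or \( Z_{cal} < -Z_{tab} = -Z_{\alpha/2} \),
where \( \hat{P}_1 = x_1/n_1 \) and \( \hat{P}_2 = x_2/n_2 \) are the sample proportions and
\[ P = \frac{x_1 + x_2}{n_1 + n_2} \]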
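\[ t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \]
Decision rule: For a specified value of α, reject H0 if the computed test
value \( t_{cal} > t_{\alpha,\,n-1} \), the tabulated t value with n − 1 degrees of freedom.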
SELF-ASSESSMENT EXERCISE(S)
3.5 Summary
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
Unit Structure
1.1 Introduction
1.2 Intended Learning Outcomes (ILOs)
1.3 Main Content
1.3.1 Introduction to Parametric Methods
1.3.2 A flow chart for Parametric Methods
1.3.3 Test Based on Runs
1.3.4 Distribution of Runs
1.3.5 Importance of Runs
1.4 Summary
1.5 References/Further Reading/Web Resources
1.1 Introduction
The Wilcoxon signed rank test and the sign test are used to test
hypothesis about population median and population median differences.
The Mann-Whitney test is used to test the hypothesis that two
populations are identical, and the Kruskal-Wallis test is used to test the
hypothesis that three or more populations are identical. When samples of
repeated measurements are obtained, the Friedman test can be used to
test the hypothesis that all possible rankings of the observations from
any subject are equally likely.
[Figure: flow chart for selecting a test. Branches: independent samples versus paired samples or repeated measurements; symmetric or approximately symmetric population versus extremely non-symmetric population.]
Case Study 1
Consider the local and international calls: TTTLL TTTT LLLL TT LLL
Solution 1
We have six runs.
Case Study 2
Given binary digits to denote the number of males and females
respectively;
00011111 0001111 000011001011
Solution 2
We have ten runs (10).
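Counting runs by eye becomes error-prone for long sequences, so a small helper is handy. The sketch below treats a run as a maximal block of identical symbols and ignores the spaces used to print the sequences; `count_runs` is a hypothetical name.

```python
from itertools import groupby

def count_runs(sequence):
    """Count maximal blocks of identical symbols, ignoring whitespace."""
    symbols = [s for s in sequence if not s.isspace()]
    # groupby yields one group per block of consecutive equal symbols.
    return sum(1 for _ in groupby(symbols))

calls = "TTTLL TTTT LLLL TT LLL"                 # Case Study 1
digits = "00011111 0001111 000011001011"         # Case Study 2

print(count_runs(calls), count_runs(digits))
```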
In a sequence of two types of observations, like the one we have above,
the total number of runs can be used as a measure of the randomness of
the sequence; too many runs may indicate that each observation tends to
follow and be followed by an observation of the other type.
The total number of runs may be used to test the null hypothesis Ho that
two independent random samples came from populations with identical
distribution functions.
Let n0 and n1 be the numbers of observations of the two types, and let
r0 and r1 be the numbers of runs of each type.
If r0 = r1, the joint pdf is
\[ f(r_0, r_1) = \frac{2\binom{n_0-1}{r_0-1}\binom{n_1-1}{r_1-1}}{\binom{n_0+n_1}{n_0}} \]
If the modulus of r0 − r1 is 1, that is, if |r0 − r1| = 1,
\[ f(r_0, r_1) = \frac{\binom{n_0-1}{r_0-1}\binom{n_1-1}{r_1-1}}{\binom{n_0+n_1}{n_0}} \]
It can be shown that the marginal pdf of r0 is
\[ f(r_0) = \frac{\binom{n_0-1}{r_0-1}\binom{n_1+1}{r_0}}{\binom{n_0+n_1}{n_0}}, \quad \text{where } r_0 = 1, 2, \ldots, n_0 \]
Recall that R is the number of runs in the sequence:
R = r0 + r1
If r0 = r1 = r, then R = 2r. So the probability of 2r runs is
\[ f(r, r) = P(R = 2r) = \frac{2\binom{n_0-1}{r-1}\binom{n_1-1}{r-1}}{\binom{n_0+n_1}{n_0}} \]
We want to find the probability of R = 2r + 1. R will be equal to 2r + 1 if
r0 = r and r1 = r + 1, or r0 = r + 1 and r1 = r. This is so because, if we
consider sequences of zeros and ones, we may observe in some sequences
that r0 = r1 and in others that r1 = r0 + 1.
Case Study 4;
r0 = 3, r1 = 3 gives r0 = r1, whereas r0 = 3, r1 = 4 gives r1 = r0 + 1.
\[ \therefore P[R = 2r+1] = \frac{\binom{n_0-1}{r-1}\binom{n_1-1}{r} + \binom{n_0-1}{r}\binom{n_1-1}{r-1}}{\binom{n_0+n_1}{n_0}} \]
The distribution of R can also be described through its mean μ and
variance σ². For large samples n0 and n1, the critical region is
approximated using the normal distribution with mean
\[ \mu = E(R) = \frac{2 n_0 n_1}{n_0 + n_1} + 1 \]
and variance
\[ \sigma^{2} = \frac{(\mu-1)(\mu-2)}{n_0 + n_1 - 1} \]
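The exact distribution of the total number of runs R can be tabulated from the joint probabilities of the run counts. The sketch below is one possible implementation, assuming the standard result that the joint probability is doubled when the two run counts are equal; the function name and the sample sizes 8 and 10 are chosen for illustration, and the checks confirm that the probabilities sum to 1 and that the mean matches 2n0n1/(n0 + n1) + 1.

```python
from math import comb

def runs_pmf(n0, n1):
    """Exact distribution of the total number of runs R = r0 + r1."""
    total = comb(n0 + n1, n0)          # number of distinct arrangements
    pmf = {}
    for r0 in range(1, n0 + 1):
        for r1 in (r0 - 1, r0, r0 + 1):
            if r1 < 1 or r1 > n1:
                continue
            ways = comb(n0 - 1, r0 - 1) * comb(n1 - 1, r1 - 1)
            if r0 == r1:
                ways *= 2              # the sequence may start with either type
            pmf[r0 + r1] = pmf.get(r0 + r1, 0) + ways / total
    return pmf

n0, n1 = 8, 10
pmf = runs_pmf(n0, n1)
total_prob = sum(pmf.values())
mean_runs = sum(r * p for r, p in pmf.items())
formula_mean = 2 * n0 * n1 / (n0 + n1) + 1

print(round(total_prob, 6), round(mean_runs, 4), round(formula_mean, 4))
```

Tail sums of this table, such as the probability that R is at most some value, give exact critical regions for small samples.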
Case Study 5:
The following is the pattern of trunk (T) and local (L) calls coming
into a switchboard: TTT LLLL TT LLLLL TTTL. Use the information to test
whether the pattern of incoming calls, with respect to the type of call
(trunk or local), is random.
Solution 5
We want to test the hypothesis.
Ho : The pattern of incoming calls is random
H1 : The pattern of incoming calls is not random
R = 6, nT = 8, nL = 10
Conclusion
Since 6 ≤ R ≤ 14, we accept Ho at 5% level of significance.
Case Study 6;
Given the sequence A B B A B B A A A, the number of runs is r = 5, with
nA (n1) = 5 and nB (n2) = 4.
The number of runs r in a sequence of n1 and n2 observations can be used
as a test statistic to test for randomness. The null and alternative
hypotheses for the test are:
H0: the sequence of As and Bs has been generated by a random process.
H1: the sequence of As and Bs has not been generated by a random
process.
When n1 and n2 are both large, i.e. n1 > 10 and n2 > 10, we conduct the
runs test by using the formula for the standard normal test statistic
\[ z = \frac{r - \mu_r}{\sigma_r}; \qquad \mu_r = \frac{2n_1n_2}{n_1+n_2} + 1; \qquad \sigma_r = \sqrt{\frac{2n_1n_2(2n_1n_2 - n_1 - n_2)}{(n_1+n_2)^{2}(n_1+n_2-1)}} \]
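The large-sample runs test can be coded directly from the formula for z. In the sketch below the counts n1 = 12, n2 = 14 and r = 8 are hypothetical values chosen only so that both samples exceed 10; the function name is also an assumption.

```python
from math import sqrt

def runs_z(r, n1, n2):
    """Standard normal test statistic for the number of runs r."""
    mu_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
             / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (r - mu_r) / sqrt(var_r), mu_r, sqrt(var_r)

# Hypothetical sequence summary: 12 As, 14 Bs and 8 runs.
z, mu_r, sigma_r = runs_z(8, 12, 14)
print(round(mu_r, 3), round(sigma_r, 3), round(z, 3))
```

A value of z well below zero points to too few runs (clustering), while a value well above zero points to too many runs (alternation).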
SELF-ASSESSMENT EXERCISE(S)
1.4 Summary
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003.
Unit Structure
2.1 Introduction
2.2 Intended Learning Outcomes (ILOs)
2.3 Main Content
2.3.1 Introduction and definition of index number
2.3.2 Types of Index
2.3.3 Construction of Index Numbers
2.3.4 Construction of Simple Price Indexes/ Price Relative
2.3.5 Construction of Simple Price Indexes/ Quantity Relative
2.3.6 Construction of Simple Price Indexes/ Value Relative
2.4 Summary
2.5 References/Further Reading/Web Resources
2.1 Introduction
Case Study1
Referring to Table 3.1, determine the simple price index for 1976 for
the commodity Milk, using 1970 as the base year.
Table 3.1 Prices and consumption of three commodities in a particular
area, 1970 and 1976.
Commodity   Unit of      Average price            Per capita consumption per month
            quotation    1970 (Po)   1976 (Pn)    1970 (qo)   1976 (qn)
Milk        Quart        0.30        0.38         30          35
Bread       1 lb loaf    0.25        0.35         3.8         3.7
Eggs        Dozen        0.60        0.90         1.5         1.0
Solution 1:
For Milk = (0.38 / 0.30) × 100 = 126.67
Case Study 2:
Referring to Table 3.1, determine the simple quantity indexes for the
three commodities for 1976, using 1970 as the base year.
Solution 2:
For Milk = (35 / 30) × 100 = 116.7
For Bread = (3.7 / 3.8) × 100 = 97.37
For Eggs = (1.0 / 1.5) × 100 = 66.67
Finally, the value of a commodity in a designated period is equal to the
price of the commodity multiplied by the quantity produced (or sold).
Case Study 3:
Compute the simple value relative for 1976 for the commodity Milk in
Table 3.1, using 1970 as the base year.
Solution 3:
\[ \text{For Milk: } I = \frac{P_n q_n}{P_o q_o} \times \frac{100}{1} = \frac{0.38 \times 35}{0.30 \times 30} \times \frac{100}{1} = 147.8 \]
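The price, quantity and value relatives all use the same "current over base times 100" pattern, which the sketch below applies to the data in Table 3.1. The dictionary layout and the helper name `relative` are assumptions made for this example.

```python
# Data from Table 3.1: (1970 price, 1976 price, 1970 quantity, 1976 quantity).
data = {
    "Milk":  (0.30, 0.38, 30, 35),
    "Bread": (0.25, 0.35, 3.8, 3.7),
    "Eggs":  (0.60, 0.90, 1.5, 1.0),
}

def relative(current, base):
    """Simple index: current-period figure as a percentage of the base."""
    return current / base * 100

price_rel = {c: relative(pn, po) for c, (po, pn, qo, qn) in data.items()}
quantity_rel = {c: relative(qn, qo) for c, (po, pn, qo, qn) in data.items()}
value_rel = {c: relative(pn * qn, po * qo) for c, (po, pn, qo, qn) in data.items()}

for commodity in data:
    print(commodity,
          round(price_rel[commodity], 2),
          round(quantity_rel[commodity], 2),
          round(value_rel[commodity], 2))
```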
Weighted Aggregate
Because of the difficulty described above, aggregate price indexes are
generally weighted according to the quantities q of the commodities.
The question of which period's quantities should be used as weights is
the basis for the different weighted aggregate price indexes.
Case Study 4
Compute the Laspeyres aggregate price index for 1976 for the three
commodities in Table 3.1, using 1970 as the base year.
Table 3.2 Worksheet for the calculation of the Laspeyres index for the
data in Table 3.1
Commodity   Pnqo            Poqo
Milk        11.40           9.00
Bread        1.33           0.95
Eggs         1.35           0.90
Total       ΣPnqo = 14.08   ΣPoqo = 10.85
Solution 4:
Referring to Table 3.2, the index is determined as follows:
\[ I(L) = \frac{\sum P_n q_o}{\sum P_o q_o} \times \frac{100}{1} = \frac{14.08}{10.85} \times \frac{100}{1} = 129.8 \]
Paasche’s Index
From Table 3.3, we can calculate the Paasche index as follows:
\[ I(P) = \frac{\sum P_n q_n}{\sum P_o q_n} \times \frac{100}{1} \qquad (3.5) \]
Case Study 5
Compute the Paasche aggregate price index for 1976 for the three
commodities in Table 3.1, using 1970 as the base year.
Table 3.3 Worksheet for the calculation of the Paasche index for the
data in Table 3.1
Commodity   Pnqn            Poqn
Milk        13.30           10.50
Bread        1.30            0.93
Eggs         0.90            0.60
Total       ΣPnqn = 15.50   ΣPoqn = 12.03
Solution 5:
\[ I(P) = \frac{15.50}{12.03} \times \frac{100}{1} = 128.8 \]
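The only difference between the Laspeyres and Paasche indexes is which period's quantities serve as weights. The sketch below computes both from the Table 3.1 data; because it works with unrounded products, the Paasche figure may differ from a hand calculation based on rounded worksheet entries in the first decimal place.

```python
# Data from Table 3.1: (1970 price, 1976 price, 1970 quantity, 1976 quantity).
data = {
    "Milk":  (0.30, 0.38, 30, 35),
    "Bread": (0.25, 0.35, 3.8, 3.7),
    "Eggs":  (0.60, 0.90, 1.5, 1.0),
}

# Laspeyres: base-year (1970) quantities as weights.
laspeyres = (sum(pn * qo for po, pn, qo, qn in data.values())
             / sum(po * qo for po, pn, qo, qn in data.values()) * 100)

# Paasche: given-year (1976) quantities as weights.
paasche = (sum(pn * qn for po, pn, qo, qn in data.values())
           / sum(po * qn for po, pn, qo, qn in data.values()) * 100)

print(round(laspeyres, 2), round(paasche, 2))
```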
Case Study 6
Compute the price index by the weighted average of price relatives
method for the three commodities in Table 3.1, using 1970 as the base
year.
\[ I_p = \frac{\sum (P_o q_o)\left(\dfrac{P_n}{P_o} \times 100\right)}{\sum P_o q_o} \qquad (3.6) \]
Solution.6:
Table 3.4: The weighted average of price relatives for the data in Table
3.1.
Commodity   Price relative   Value weight   Weighted relative
            Pn/Po × 100      Poqo           (Poqo)(Pn/Po × 100)
Milk        126.67            9.00          1140.03
Bread       140               0.95           133
Eggs        150               0.90           135
Total       416.67           10.85          1408.03
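The worksheet totals lead to the index by dividing the sum of weighted relatives by the sum of the value weights. The sketch below carries out that step from the Table 3.1 data, with variable names chosen only for illustration; with base-year value weights the result coincides with the Laspeyres index, at about 129.8.

```python
# Data from Table 3.1: (1970 price, 1976 price, 1970 quantity).
data = {
    "Milk":  (0.30, 0.38, 30),
    "Bread": (0.25, 0.35, 3.8),
    "Eggs":  (0.60, 0.90, 1.5),
}

weights = {c: po * qo for c, (po, pn, qo) in data.items()}          # Poqo
relatives = {c: pn / po * 100 for c, (po, pn, qo) in data.items()}  # Pn/Po x 100

weighted_sum = sum(weights[c] * relatives[c] for c in data)
index = weighted_sum / sum(weights.values())

print(round(sum(weights.values()), 2), round(weighted_sum, 2), round(index, 2))
```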
2.4 Summary
Moore, David S., "The Basic Practice of Statistics." Third edition. W.H.
Freeman and Company. New York. 2003