
Lecture1 2 3


Business Statistics

Dr. Yousaf Ali Khan

Introductory Lecture
Why study statistics?
1. Data are everywhere
2. Statistical techniques are used to make many
decisions that affect our lives
3. No matter what your career, you will make
professional decisions that involve data. An
understanding of statistical methods will help
you make these decisions effectively.
Applications in
Business and Economics
• Accounting
Public accounting firms use statistical
sampling procedures when conducting
audits for their clients.
• Economics
Economists use statistical information
in making forecasts about the future of
the economy or some aspect of it.
• Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of marketing
research applications.
• Production
A variety of statistical quality
control charts are used to monitor
the output of a production process.
• Finance
Financial advisors use price-earnings ratios and
dividend yields to guide their investment
recommendations.
Applications of statistical concepts in
the business world
• Finance – correlation and regression, index
numbers, time series analysis
• Marketing – hypothesis testing, chi-square
tests, nonparametric statistics
• Personnel – hypothesis testing, chi-square tests,
nonparametric tests
• Operating management – hypothesis testing,
estimation, analysis of variance, time series
Business Statistics-what and why?
• Definition of statistics
The science of collecting, organizing, presenting, analyzing, and
interpreting data to assist in making more effective decisions.
• Steps in Statistical Investigation
Five stages of statistical investigation:
1. Collection of data
2. Organization of data
3. Presentation of data
4. Analysis of data
5. Interpretation of results
Key Definitions
• A population (universe) is the collection of things under consideration
• A sample is a portion of the population selected for analysis
• A parameter is a summary measure computed to describe a
characteristic of the population
• A statistic is a summary measure computed to describe a
characteristic of the sample
• A variable is a characteristic or condition that can change or
take on different values
• The set of measurements collected for a particular element
is called an observation.
Types of statistics
• Descriptive statistics – the tabular, graphical, and
numerical methods used to organize, summarize, and
present data in an informative way.
• It covers the collection, presentation, and description of
sample data, e.g. surveys, tables, and graphs.
Example: Hudson Auto Repair
• The manager of Hudson Auto
would like to have a better
understanding of the cost
of parts used in the engine
tune-ups performed in the
shop. She examines 50
customer invoices for tune-ups
• Sample of Parts Cost ($) for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Inferential Statistics
Inferential Statistics involves analyzing a set
of data to draw conclusions. This branch of
statistics is more difficult than Descriptive
Statistics.
In the study of Inferential Statistics, two basic
concepts are important:
Population: Population refers to all possible
subjects for a given study.
Sample: Sample refers to a part (subset) of a
population.
Population and Sample
• Let’s take a few examples.
• Example 1
• We are interested in knowing the proportion
of GIKI students who are in favor of legalizing
the use of the PUBG game.

• Population consists of all GIKI students.

• Sample is 250 students selected at random.

Population and Sample
• Example 3
• We want to test if a new brand of tires
manufactured by Goodyear is better than
existing tires.

• Population consists of all tires of the new
brand manufactured by Goodyear.

• Sample is 100 tires of the new brand
chosen at random.
Statistical Inference
• Statistical inference is a statistical procedure to determine
the characteristics of a population by studying a sample.

• PEL Electric developed a new light bulb that increases its
useful life. In this case, all new light bulbs comprise the
population. To test if the new light bulb really has a longer
life, a sample of 200 bulbs was tested and the average life
of these bulbs was calculated. This average life will be
used to conclude if the new bulb has a longer useful life.
This is an example of statistical inference.
Statistical Inference
• Statistical inference allows us to make conclusions about a
population. This conclusion is made by studying a sample.

• In the PEL Electric case, the population was all new light
bulbs whose life expectancy we wanted to verify.

• Do all the new bulbs have a longer life?

• We answered this question by studying a sample and
calculating the average life of this sample of bulbs.
Why We Need Data
• To provide input to survey
• To provide input to study
• To measure performance of service or
production process
• To evaluate conformance to standards
• To assist in formulating alternative courses of action
• To satisfy curiosity
• Data are the facts and figures collected, summarized,
analyzed, and interpreted.
• The data collected in a particular study are referred to as the
data set.
• Statistical data are usually obtained by counting or measuring
items. Most data can be put into the following categories:
• Qualitative - data are measurements that each fall into one of
several categories. (hair color, ethnic groups and other
attributes of the population)
• Quantitative - data are observations that are measured on a
numerical scale (distance traveled to college, number of
children in a family, etc.)
Qualitative data
Qualitative data are generally described by words or
letters. They are not as widely used as quantitative data
because many numerical techniques do not apply to the
qualitative data. For example, it does not make sense to
find an average hair color or blood type.
Qualitative data can be separated into two subgroups:
• dichotomic (if it takes the form of a word with two
options, e.g. gender: male or female)
• polynomic (if it takes the form of a word with more
than two options, e.g. education: primary school, secondary
school and university)
Quantitative data
Quantitative data are always numbers and are the
result of counting or measuring attributes of a
population.
Quantitative data can be separated into two
subgroups:
• discrete (if it is the result of counting, e.g. the number
of students of a given ethnic group in a class, the
number of books on a shelf, ...)
• continuous (if it is the result of measuring, e.g.
distance traveled, weight of luggage, …)
Types of Variable
• Variables can be classified as discrete or continuous.
• Discrete variables (such as class size) consist of
indivisible categories, and continuous variables
(such as time or weight) are infinitely divisible
into whatever units a researcher may choose. For
example, time can be measured to the nearest
minute, second, half-second, etc.
• The process of measuring a variable requires a set
of categories called a scale of measurement and
a process that classifies each individual into one
category.
Types of Variable
• Qualitative
– Dichotomic: gender, marital status
– Polynomic: brand of PC, hair color
• Quantitative
– Discrete: children in family, strokes on a golf hole
– Continuous: amount of income tax paid, weight of a student
Scales of Measurement
Scales of measurement include:
• Nominal
• Ordinal
• Interval
• Ratio
The scale determines the amount of information
contained in the data.
The scale indicates the data summarization and
statistical analyses that are most appropriate.
Numerical scale of measurement:
• Nominal – consist of categories in each of which the number of respective
observations is recorded. The categories are in no logical order and have no
particular relationship. The categories are said to be mutually exclusive since an
individual, object, or measurement can be included in only one of them. Example:
Employment Classification 1 for Educator, 2 for Construction Worker and 3 for
Manufacturing Worker
• Ordinal – contain more information. Consists of distinct categories in which order
is implied. Values in one category are larger or smaller than values in other
categories (e.g. rating: excellent, good, fair, poor)
• Interval – is a set of numerical measurements in which the distance between
numbers is of a known, constant size. Interval measurements identify the direction and
magnitude of a difference
• Ratio – A ratio scale is an interval scale where a value of zero indicates that nothing
exists for the variable at the zero point
• Ratio measurements identify the direction and magnitude of differences and allow
ratio comparisons of measurements
• Variables such as distance, height, weight, and time use the ratio scale
Melissa’s college record shows 36 credit hours earned, while Kevin’s record shows
72 credit hours earned. Kevin has twice as many credit hours earned as Melissa.
Cross-Sectional & Time Series Data

• Cross-sectional data are collected at the same or approximately the
same point in time.
Example: data detailing the number of building permits issued in June 2007
in each of the counties of Ohio
• Time series data are collected over several time periods.
Example: data detailing the number of building permits issued in
Lucas County, Ohio in each of the last 36 months
2. Frequency Distributions
• A frequency distribution is a tabular summary of data showing the number
(frequency) of observations in each of several non-overlapping categories or
classes.
• The objective is to provide insights about the data that cannot be quickly
obtained by looking only at the original data.
• Numerical presentation of quantitative data
• Frequency distribution – shows the frequency, or number of occurrences, in
each of several categories. Frequency distributions are used to summarize large
volumes of data values.
• Steps for constructing a frequency distribution
1. Determine the number of classes
2. Determine the size of each class
3. Determine the starting point for the first class
4. Tally the number of values that occur in each class
5. Prepare a table of the distribution using actual counts and/ or percentages
(relative frequencies)
Guidelines for the Frequency
1. The set of classes must be mutually exclusive (i.e., a given data value can
fall into only one class)
2. The set of classes must be exhaustive (i.e., include all possible data
values)
3. If possible, the classes should have equal widths. Unequal class widths
make it difficult to interpret both frequency distributions and their
graphical presentations.
4. Whenever possible, class widths should be round numbers (e.g., 5, 10, 25,
50, 100).
5. If possible, avoid using open-end classes.
Frequency Distribution
Example: Marada Inn
• Guests staying at Marada Inn were asked to rate the quality
of their accommodations as being excellent, above average,
average, below average, or poor.
• The ratings provided by a sample of 20 guests are:
Below Average Average Above Average
Above Average Above Average Above Average
Above Average Below Average Below Average
Average Poor Poor
Above Average Excellent Above Average
Average Above Average Average
Above Average Average
Frequency Distribution
• Example: Marada Inn

Rating           Frequency
Poor             2
Below Average    3
Average          5
Above Average    9
Excellent        1
Total            20
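The tally above can be reproduced in a few lines. This is a minimal sketch in Python (the slides themselves use Excel); the variable names are illustrative only, and the ratings are typed in the order they appear on the slide.

```python
from collections import Counter

# The 20 guest ratings from the Marada Inn example, in the order listed.
ratings = [
    "Below Average", "Average", "Above Average",
    "Above Average", "Above Average", "Above Average",
    "Above Average", "Below Average", "Below Average",
    "Average", "Poor", "Poor",
    "Above Average", "Excellent", "Above Average",
    "Average", "Above Average", "Average",
    "Above Average", "Average",
]

# Fixed category order for the table (worst to best).
categories = ["Poor", "Below Average", "Average", "Above Average", "Excellent"]

# Count how often each rating occurs, then arrange the counts in table order.
counts = Counter(ratings)
frequency = {category: counts[category] for category in categories}

for category, f in frequency.items():
    print(f"{category:<15}{f:>3}")
print(f"{'Total':<15}{sum(frequency.values()):>3}")
```

Keeping the category order in a separate list matters here because the ratings are ordinal: a plain count would list them in order of first appearance.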
Relative Frequency Distribution
• The relative frequency of a class is the fraction or
proportion of the total number of data items belonging to
the class.
Relative frequency of a class = (Frequency of the class) / n
where n is the total number of data items.
• A relative frequency distribution is a tabular summary of
data showing the relative frequency for each class.
Percent Frequency Distribution
• The percent frequency of a class is the relative
frequency multiplied by 100.
• A percent frequency distribution is a tabular summary
of a set of data showing the percent frequency for each
class.
Relative Frequency and Percent Frequency Distributions

• Example: Marada Inn
Rating Relative Frequency Percent Frequency
Poor .10 10
Below Average .15 15
Average .25 25
Above Average .45 45
Excellent .05 5
Total 1.00 100
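As a rough check of the table above, the relative and percent frequencies can be computed from the class frequencies. A minimal Python sketch; the dictionary names are illustrative, not part of the lecture.

```python
# Frequencies from the Marada Inn frequency distribution.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

n = sum(frequency.values())  # total number of data items (20)

# Relative frequency = frequency of the class / n
relative = {rating: f / n for rating, f in frequency.items()}

# Percent frequency = relative frequency * 100
percent = {rating: 100 * r for rating, r in relative.items()}

for rating in frequency:
    print(f"{rating:<15}{relative[rating]:>6.2f}{percent[rating]:>6.0f}")
```

The relative frequencies always sum to 1 and the percent frequencies to 100, which is a quick way to catch a tallying mistake.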
Bar Chart
• A bar chart is a graphical display for depicting qualitative data.
• On one axis (usually the horizontal axis), we specify the
labels that are used for each of the classes.
• A frequency, relative frequency, or percent frequency scale
can be used for the other axis (usually the vertical axis).
• Using a bar of fixed width drawn above each class label, we
extend the height appropriately.
• The bars are separated to emphasize the fact that each
class is a separate category.
Bar Chart
[Figure: bar chart "Marada Inn Quality Ratings"; horizontal axis: Rating (Poor, Below Average, Average, Above Average, Excellent)]
Pareto Diagram
• In quality control, bar charts are used to identify the most
important causes of problems.
• When the bars are arranged in descending order of height
from left to right (with the most frequently occurring cause
appearing first) the bar chart is called a Pareto diagram.
• This diagram is named for its founder, Vilfredo Pareto, an
Italian economist.
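The ordering step of a Pareto diagram is just a descending sort by frequency. The sketch below is in Python and, since the slides give no quality-control data, reuses the Marada Inn rating counts purely as a stand-in for "causes"; that substitution is an assumption made for illustration.

```python
# Stand-in data: Marada Inn rating counts used in place of problem causes.
frequency = {
    "Poor": 2,
    "Below Average": 3,
    "Average": 5,
    "Above Average": 9,
    "Excellent": 1,
}

# Arrange categories in descending order of frequency, most frequent first.
pareto_order = sorted(frequency.items(), key=lambda item: item[1], reverse=True)

for category, f in pareto_order:
    print(f"{category:<15}{f:>3}")
```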
Pie Chart
• The pie chart is a commonly used graphical display for
presenting relative frequency and percent frequency
distributions for categorical data.
• First draw a circle; then use the relative frequencies to
subdivide the circle into sectors that correspond to the
relative frequency for each class.
• Since there are 360 degrees in a circle, a class with a relative
frequency of .25 would consume .25(360) = 90 degrees of
the circle.
Pie Chart
[Figure: pie chart "Marada Inn Quality Ratings"; sectors include Above Average 45%, Average 25%, and Below Average 15%]
Frequency Distribution
• Example: Hudson Auto Repair
The manager of Hudson Auto would like to gain a better
understanding of the cost of parts used in the engine tune-ups
performed in the shop. She examines 50 customer invoices for
tune-ups. The costs of parts, rounded to the nearest dollar, are
listed on the next slide.
Frequency Distribution
The three steps necessary to define the classes for a
frequency distribution with quantitative data are:

1. Determine the number of non-overlapping classes.

2. Determine the width of each class.
3. Determine the class limits.
Frequency Distribution
• Guidelines for Determining the Number of Classes
• Use between 5 and 20 classes.
• Data sets with a larger number of elements usually
require a larger number of classes.
• Smaller data sets usually require fewer classes.
• The goal is to use enough classes to show the variation
in the data, but not so many classes that some contain
only a few data items.
Frequency Distribution
• Guidelines for Determining the Width of Each Class
• Use classes of equal width.
• Approximate Class Width =
(Largest data value − Smallest data value) / Number of classes
• Making the classes the same width reduces the
chance of inappropriate interpretations.
Frequency Distribution
• Note on Number of Classes and Class Width
• In practice, the number of classes and the appropriate class width
are determined by trial and error.
• Once a possible number of classes is chosen, the appropriate
class width is found.
• The process can be repeated for a different number of
classes.
• Ultimately, the analyst uses judgment to determine the
combination of the number of classes and class width that
provides the best frequency distribution for summarizing the
data.
Frequency Distribution
• Guidelines for Determining the Class Limits
• Class limits must be chosen so that each data item
belongs to one and only one class.
• The lower class limit identifies the smallest possible
data value assigned to the class.
• The upper class limit identifies the largest possible
data value assigned to the class.

• The appropriate values for the class limits depend
on the level of accuracy of the data.
• An open-end class requires only a lower class limit
or an upper class limit.
Frequency Distribution
• Class Midpoint

• In some cases, we want to know the midpoints of the
classes in a frequency distribution for quantitative data.
• The class midpoint is the value halfway between the
lower and upper class limits.
Frequency Distribution
• Example: Hudson Auto Repair
If we choose six classes:
Approximate Class Width = (109 − 52)/6 = 9.5 ≈ 10

Parts Cost ($) Frequency

50-59 2
60-69 13
70-79 16
80-89 7
90-99 7
100-109 5
Total 50
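The class definition and tally for this example can be sketched as follows. This is an illustrative Python version, not the lecture's own method; the round width of 10 and the starting point of 50 are the choices shown in the table, and the variable names are assumptions.

```python
# Parts cost ($) for the 50 tune-ups, as listed in the Hudson Auto Repair data.
parts_cost = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

num_classes = 6

# Approximate class width = (largest value - smallest value) / number of classes
approx_width = (max(parts_cost) - min(parts_cost)) / num_classes

class_width = 10   # approximate width rounded up to a convenient round number
first_lower = 50   # starting point chosen so the first class holds the minimum

# Class limits: 50-59, 60-69, ..., 100-109 (whole-dollar data, so no gaps).
classes = [
    (first_lower + i * class_width, first_lower + (i + 1) * class_width - 1)
    for i in range(num_classes)
]

# Tally the number of values that fall within each pair of class limits.
frequency = {
    f"{lower}-{upper}": sum(lower <= x <= upper for x in parts_cost)
    for lower, upper in classes
}

for label, f in frequency.items():
    print(f"{label:<10}{f:>3}")
print(f"{'Total':<10}{sum(frequency.values()):>3}")
```

Because every value falls in exactly one class, the frequencies must add up to the sample size of 50.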
Relative Frequency and Percent Frequency Distributions

• Example: Hudson Auto Repair

Parts Relative Percent
Cost ($) Frequency Frequency
50-59 .04 = 2/50 4 = .04(100)
60-69 .26 26
70-79 .32 32
80-89 .14 14
90-99 .14 14
100-109 .10 10
Total 1.00 100
Relative Frequency and Percent Frequency Distributions
• Example: Hudson Auto Repair
Insights Gained from the Percent Frequency Distribution:
• Only 4% of the parts costs are in the $50-59 class.
• 30% of the parts costs are under $70.
• The greatest percentage (32% or almost one-third) of the
parts costs are in the $70-79 class.
• 10% of the parts costs are $100 or more.
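Each insight above is simple arithmetic on the percent frequency column. A short Python sketch, with illustrative names, that derives them from the table:

```python
# Percent frequencies from the Hudson Auto Repair table.
percent = {
    "50-59": 4,
    "60-69": 26,
    "70-79": 32,
    "80-89": 14,
    "90-99": 14,
    "100-109": 10,
}

# Costs under $70 fall in the first two classes.
under_70 = percent["50-59"] + percent["60-69"]

# The class holding the greatest percentage of the parts costs.
largest_class = max(percent, key=percent.get)

# Costs of $100 or more fall in the last class.
at_least_100 = percent["100-109"]

print(f"Under $70: {under_70}%")
print(f"Largest class: ${largest_class} ({percent[largest_class]}%)")
print(f"$100 or more: {at_least_100}%")
```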
Surprise Quiz
Consider the quantitative data in the table below. These data show the time in days required
to complete year-end audits for a sample of 20 clients of Sanderson and Clifford, a
small public accounting firm. The three steps necessary to define the classes for a
frequency distribution with quantitative data are:
1. Determine the number of nonoverlapping classes.
2. Determine the width of each class.
3. Determine the class limits.
Hint: develop a frequency distribution with five classes.
Graphical Presentation of Data

• An old saying “a picture is worth a thousand words”.

• Graph or Chart of a data set often provides the simplest and most
efficient display.

Common methods for graphically displaying qualitative data :

• Bar charts
• Pie charts
Common methods for graphically displaying quantitative data :
• Histogram
• Frequency Polygon
• Frequency Ogive
Bar Charts for Qualitative Data
Most common types of Bar Chart:
• Simple Bar Chart
• Multiple Bar Chart
• Component Bar Chart

Simple Bar Chart

• A simple bar chart consists of horizontal or vertical bars of
equal widths and lengths proportional to the values they
represent.
• It displays graphically the same information concerning
qualitative data that a frequency distribution shows in tabular
form.
Simple Bar Chart for Qualitative Data
Party Affiliation Example:

Frequency and Relative Frequency Distribution
Party    Frequency (f)    Relative Frequency
PTI      10               0.3333
N        9                0.30
Q        6                0.20
P        5                0.1667
Total    30               1

[Figure: vertical bar chart "Bar Chart: Party Affiliation"; vertical axis: Frequency (f)]
[Figure: horizontal bar chart "Bar Chart: Party Affiliation"; horizontal axis: frequency (f)]
[Figure: bar chart "Bar Chart: Party Affiliation" of relative frequencies; horizontal axis: PTI, N, Q, P]
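To see the idea of bar length proportional to frequency without any charting software, a horizontal bar chart can be drawn as plain text. This Python sketch is only an illustration; the helper name `bar_chart_lines` and the `#` symbol are assumptions, not part of the lecture.

```python
# Party affiliation frequencies from the example.
party_frequency = {"PTI": 10, "N": 9, "Q": 6, "P": 5}


def bar_chart_lines(frequency, symbol="#"):
    """Return one text line per category; bar length equals the frequency."""
    label_width = max(len(label) for label in frequency)
    return [
        f"{label:<{label_width}} | {symbol * f} {f}"
        for label, f in frequency.items()
    ]


for line in bar_chart_lines(party_frequency):
    print(line)
```

All bars share the same thickness (one text row), so only their lengths carry information, which is the defining property of a simple bar chart.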
Multiple Bar Chart
A multiple bar chart shows two or more characteristics
corresponding to values of a common variable in the form of
grouped bars, whose lengths are proportional to the values of the
characteristics.
Example: Draw multiple bar charts to show the area and production of cotton
in Punjab for the following data:
Area and Production of Cotton in Punjab
Year       Area (000 acres)   Production (000 bales)
1965-66    2866               1588
1970-71    3233               2229
1975-76    3420               1937
[Figure: multiple bar chart with grouped bars for Area (000 acres) and Production (000 bales); horizontal axis: 1965-66, 1970-71, 1975-76]
Component Bar Chart
Component Bar Chart (subdivided bars)
A bar is divided into two or more sections, proportional in size to
the component parts of a total displayed by each bar.
Example: Draw a component bar chart of the students’ enrollment
data:
Classes   Total   Male   Female
BBA       65      33     32
MBA       60      32     28
MS/PHD    40      21     19
[Figure: component bar chart "Component Bar Chart"; vertical axis: No of Students; bar segments: Male, Female]
Pie Charts For Qualitative Data
A pie chart (also called a sector diagram) is a graph
consisting of a circle divided into sectors whose areas
are proportional to the various parts into which the whole
quantity is divided.
Example: Represent the expenditures on various items
of a family by a pie chart.
Items      Expenditure (in 100 rupees)
Food       50
Clothing   30
Rent       20
Fuel       15
Misc.      35
Total      150
Steps for Constructing Pie-Chart:
Step 1: Draw a circle of any radius
Step 2: Find the angle of each sector corresponding to the share of each
component.
Angle of sector = (component part / whole quantity) * 360
Items      Expenditure (in 100 rupees)   Angle of sector (in degrees)
Food       50                            (50/150)*360 = 120°
Clothing   30                            (30/150)*360 = 72°
Rent       20                            (20/150)*360 = 48°
Fuel       15                            (15/150)*360 = 36°
Misc.      35                            (35/150)*360 = 84°
Total      150                           360°
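The sector angles in Step 2 follow directly from the formula. A minimal Python sketch with illustrative variable names:

```python
# Family expenditure (in 100 rupees) from the example.
expenditure = {"Food": 50, "Clothing": 30, "Rent": 20, "Fuel": 15, "Misc.": 35}

whole = sum(expenditure.values())  # whole quantity (150)

# Angle of sector = (component part / whole quantity) * 360
# Multiplying before dividing keeps the arithmetic exact for these values.
angles = {item: amount * 360 / whole for item, amount in expenditure.items()}

for item, angle in angles.items():
    print(f"{item:<10}{angle:>6.0f} degrees")
print(f"{'Total':<10}{sum(angles.values()):>6.0f} degrees")
```

The angles must add up to 360 degrees, the full circle.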
Step 3: Divide the circle into sectors by measuring the
corresponding angles with a protractor.
[Figure: pie chart "Expenditure (in 100 rupees)" with sectors Food 50, Clothing 30, Rent 20, Fuel 15, Misc. 35]
Graphs For Quantitative Data
Common methods for graphing quantitative data are:
• Histogram
• Frequency Polygon
• Frequency Ogive
• Histograms For Quantitative Data
A histogram is a graph that consists of a set of adjacent bars with heights proportional
to the frequencies (or relative frequencies or percentages) and bars are marked off by
class boundaries (NOT class limits).
It displays the classes on the horizontal axis and the frequencies (or relative frequencies
or percentages) of the classes on the vertical axis.
The frequency of each class is represented by a vertical bar whose height is equal to the
frequency of the class.
It is similar to a bar graph. However, a histogram utilizes classes or intervals and
frequencies while a bar graph utilizes categories and frequencies.
Histograms For Quantitative Data
Example: Construct a Histogram for ages of telephone operators.

Method: First construct Class Boundaries (CB).
Age (years)   Class Boundaries   No of Operators
11-15         10.5-15.5          10
16-20         15.5-20.5          5
21-25         20.5-25.5          7
26-30         25.5-30.5          12
31-35         30.5-35.5          6
Total                            40
Method: Construct Histogram by taking CB along X-axis and frequencies along Y-axis.
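The class boundaries in the table can be derived from the class limits. The slides do not spell out the rule, so the Python sketch below assumes the usual convention: split the gap between one class's upper limit and the next class's lower limit in half, and extend each limit by that amount.

```python
# Class limits (ages in years) from the telephone operators example.
class_limits = [(11, 15), (16, 20), (21, 25), (26, 30), (31, 35)]

# Gap between consecutive classes: next lower limit minus current upper limit.
gap = class_limits[1][0] - class_limits[0][1]  # 16 - 15 = 1
half_gap = gap / 2                             # 0.5

# Assumed convention: lower boundary = lower limit - half the gap,
#                     upper boundary = upper limit + half the gap.
boundaries = [(lower - half_gap, upper + half_gap) for lower, upper in class_limits]

for (lower, upper), (lb, ub) in zip(class_limits, boundaries):
    print(f"{lower}-{upper:<5}{lb}-{ub}")
```

With boundaries, each class ends exactly where the next begins, which is why histogram bars are adjacent.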
Histograms For Quantitative Data
[Figure: "Histogram of number of Telephone Operators"; X-axis: Class Boundaries (CB); Y-axis: frequency (f)]
Frequency Polygon For Quantitative Data

Graph of frequencies of each class against its mid point (also
called class marks, denoted by X).

Class Mark (X) or Mid point: It is calculated by taking average of lower and
upper class limits.
Example: (Ages of Telephone Operators)

Age (years) No of Operators Mid Point (X)

11-15 10 (11+15)/2=13
16-20 5 18
21-25 7 23
26-30 12 28
31-35 6 33
Total 40
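The midpoint column above is the average of each pair of class limits. A minimal Python sketch, with illustrative names:

```python
# Class limits (ages in years) and frequencies from the example.
class_limits = [(11, 15), (16, 20), (21, 25), (26, 30), (31, 35)]
frequencies = [10, 5, 7, 12, 6]

# Class mark (X) = (lower class limit + upper class limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in class_limits]

# Points to plot for the frequency polygon: (mid point, frequency).
polygon_points = list(zip(midpoints, frequencies))

for x, f in polygon_points:
    print(f"X = {x:g}, f = {f}")
```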
Frequency Polygon For Quantitative Data
Method:
1. Take Mid Points along X-axis and Frequency along Y-axis.
2. Construct Bars with height proportional to the corresponding freq.
3. Join Mid points to get Frequency Polygon.
[Figure: "Frequency Polygon" of No of Operators; X-axis: Mid Point (X) with ticks 8, 13, 18, 23, 28, 33; Y-axis: Frequency (f)]
Cumulative Frequency Polygon (called Ogive) For
Quantitative Data

Ogive is pronounced as O’Jive (rhymes with alive).

Cumulative Frequency Polygon is a graph obtained by plotting the
cumulative frequencies against the upper or lower class boundaries
depending upon whether the cumulative is of ‘less than’ or ‘more than’
type.
Less than Cumulative Frequency
Age (years) Class Boundaries No of Operators (f) Cumulative Frequency
11-15 Less than 15.5 10 10
16-20 Less than 20.5 5 15
21-25 Less than 25.5 7 22
26-30 Less than 30.5 12 34
31-35 Less than 35.5 6 40
Total 40
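The "less than" cumulative frequencies are running totals of the class frequencies. A short Python sketch using the standard library; the names are illustrative.

```python
from itertools import accumulate

# Upper class boundaries and class frequencies from the example.
upper_boundaries = [15.5, 20.5, 25.5, 30.5, 35.5]
frequencies = [10, 5, 7, 12, 6]

# Running total: number of operators with age less than each upper boundary.
cumulative = list(accumulate(frequencies))

for boundary, cf in zip(upper_boundaries, cumulative):
    print(f"Less than {boundary}: {cf}")
```

The last cumulative frequency always equals the total number of observations (40 here).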
Cumulative Frequency Polygon (Ogive) For
Quantitative Data
Method:
1. Take Upper Class Boundaries along X-axis and Cumulative Frequency along Y-axis.
2. Join less than Class Boundaries with corresponding Cumulative Frequencies.

Class Boundaries   Cumulative Frequency
Less than 15.5     10
Less than 20.5     15
Less than 25.5     22
Less than 30.5     34
Less than 35.5     40
[Figure: "Cumulative Frequency Polygon (Ogive)"; X-axis: Upper Class Boundaries; Y-axis: Cumulative Freq]
Distributional Shape

Distribution of a Data Set

• A table, a graph, or a formula that provides the values of the
data set and how often they occur.
• An important aspect of the distribution of a quantitative data is
its shape.
– The shape of a distribution frequently plays a role in determining
the appropriate method of statistical analysis.

• To identify the shape of a distribution, the best approach
usually is to use a smooth curve that approximates the overall
shape.
Distributional Shape

Figure displays a relative-frequency histogram for the heights of the 3000 female
It also includes a smooth curve that approximates the overall shape of the distribution.
Note: Both the histogram and the smooth curve show that this distribution of heights is
bell shaped, but the smooth curve makes seeing the shape a little easier.

Advantage of smooth curves:
They skip minor differences in shape
and concentrate on overall patterns.
Frequency Distributions in Practice

Common Type of Frequency Distribution:

• Symmetric Distribution
a. Normal Distribution (or Bell Shaped)
b. Triangular Distribution
c. Uniform Distribution (or Rectangular)
Frequency Distributions in Practice

Common Type of Frequency Distribution:

• Asymmetric or skewed Distribution
– Right Skewed Distribution
– Left Skewed Distribution
– Reverse J-Shaped (or Extremely Right Skewed)
– J-Shaped (or Extremely Left Skewed)
Frequency Distributions in Practice

Common Type of Frequency Distribution:

• Bi-Modal Distribution
• Multimodal Distribution
• U-Shaped Distribution
Identifying Distribution

Example: (Household Size): The relative-frequency histogram

for household size in the United States is shown in figure.
Identify the distribution shape for sizes of U.S. households.
Identifying Distribution

To identify the distributional shape, Draw a smooth curve

through the histogram.

Using Excel for Tabular & Graphical Presentation
Bar Chart
Frequency Distribution and Histogram
for Quantitative Data
