
Quantitative Methods in Management


Day-3
Recap..
• Introduction
• Definition
• Terms and terminologies
• Types of statistics
• Types of data
• Levels of measurements
• Application of statistics in
business
• Sources of data
Organizing and visualizing variables
• Tables
– Frequency distribution
– Relative frequency distribution
– Relative percent frequency distribution
– Cumulative frequency distribution
– Univariate
– Bivariate / cross tabulation
• Diagrams
– Bar charts
– Pie charts
• Graphs
– Histogram
– Frequency polygon
– Frequency curve
– Cumulative frequency curve (Ogive)
• EDA
– Stem and leaf plot
– Scatter diagram
– Dot plots
– Pareto chart
Numerical descriptive statistics

Day 3
Pg. 99-148
Objectives

In this chapter, you learn to:


• Describe the properties of central tendency,
variation, and shape in numerical data
• Construct and interpret a boxplot
• Compute descriptive summary measures for a
population
• Calculate the covariance and the coefficient of
correlation
Summary Definitions
DCOVA
• The central tendency is the extent to which the
values of a numerical variable group around a
typical or central value.
• The variation is the amount of dispersion or
scattering away from a central value that the
values of a numerical variable show.
• The shape is the pattern of the distribution of
values from the lowest value to the highest
value.
Summarization of data
• Measures of central tendencies
– AM, WM, GM
– Positional averages – median, percentiles, quartiles
– Mode
– Empirical formula
• Measures of dispersion
– Range
– Quartile deviation
– Mean deviation
– Standard deviation
– Variance
– Coefficient of variation
(for raw data)
Arithmetic Mean
• Commonly called ‘the mean’
• The average of a group of numbers
• Applicable for interval and ratio data
• Not applicable for nominal or ordinal data
• Affected by each value in the data set, including
extreme values
• Computed by summing all values in the data set and
dividing the sum by the number of values in the data
set
• The mean can be found from the aggregate (total) and
the number of items alone; the individual values need
not be known
Measures of Central Tendency:
The Mean DCOVA

• The arithmetic mean (often just called the “mean”) is the most common measure of
central tendency
– For a sample of size n:

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

where $\bar{X}$ (pronounced x-bar) is the sample mean, $n$ is the sample size, and $X_i$ is the ith observed value
Population Mean

$$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + X_3 + \cdots + X_N}{N} = \frac{24 + 13 + 19 + 26 + 11}{5} = \frac{93}{5} = 18.6$$
Measures of Central Tendency:
The Mean (con’t) DCOVA

• The most common measure of central tendency


• Mean = sum of values divided by the number of values
• Affected by extreme values (outliers)

[Figure: two dot plots on a scale from 11 to 20, one with Mean = 13 and one with Mean = 14]

$$\frac{11 + 12 + 13 + 14 + 15}{5} = \frac{65}{5} = 13 \qquad \frac{11 + 12 + 13 + 14 + 20}{5} = \frac{70}{5} = 14$$
Properties of AM
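• The sum of the deviations from the AM is zero
• The sum of squared deviations is smallest when the
deviations are taken from the AM
• Combined mean: the mean of two or more groups taken
together can be computed from the group means and group sizes
• It is affected by a change of scale and a change of
origin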
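
The first two properties can be checked numerically. The sketch below is only an illustration: it reuses the five values from the population mean example (24, 13, 19, 26, 11), and the helper name `sum_sq_dev` and the list of candidate centers are assumptions made for this check.

```python
# Numerical check of two properties of the arithmetic mean (AM).
data = [24, 13, 19, 26, 11]          # values from the population mean example
mean = sum(data) / len(data)         # 93 / 5 = 18.6

# Property 1: the deviations from the AM sum to zero (up to rounding error).
sum_dev = sum(x - mean for x in data)

def sum_sq_dev(center):
    """Sum of squared deviations of the data from an arbitrary center."""
    return sum((x - center) ** 2 for x in data)

# Property 2: among these candidate centers, the AM gives the smallest sum.
candidates = [mean - 2, mean - 1, mean, mean + 1, mean + 2]
best = min(candidates, key=sum_sq_dev)

print(abs(sum_dev) < 1e-9, best == mean)
```

Moving the center away from the mean by d raises the sum of squares by N·d², which is why the minimum sits exactly at the AM.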
Weighted Mean
• When the mean is computed by giving each data
value a weight that reflects its importance, it is
referred to as a weighted mean.
• In the computation of a grade point average (GPA),
the weights are the number of credit hours earned for
each grade.
• When data values vary in importance, the analyst
must choose the weight that best reflects the
importance of each value.
Weighted Mean

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

where:
$x_i$ = value of observation i
$w_i$ = weight for observation i
Weighted mean
Purchase   Cost per Pound ($)   Number of Pounds
1          3.00                 1200
2          3.40                  500
3          2.80                 2750
4          2.90                 1000
5          3.25                  800

• WM = $2.96, AM = $3.07
(mean cost per pound for the raw material is
$2.96)
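
As a quick check of the two figures above, the following sketch recomputes both means from the purchase table. The variable names are illustrative assumptions.

```python
# Weighted mean of cost per pound, weighted by the pounds purchased.
costs = [3.00, 3.40, 2.80, 2.90, 3.25]      # cost per pound ($)
pounds = [1200, 500, 2750, 1000, 800]       # weights: number of pounds

weighted_mean = sum(w * x for x, w in zip(costs, pounds)) / sum(pounds)
simple_mean = sum(costs) / len(costs)       # ignores the quantities bought

print(f"WM = {weighted_mean:.2f}, AM = {simple_mean:.2f}")
```

The weighted mean is lower than the simple mean because the largest purchase (2750 pounds) was made at the lowest price.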
Geometric mean
• Used in analyzing growth rates in financial
data.
• nth root of the product of n values.
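
A minimal sketch of the geometric mean applied to growth rates. The three yearly returns are hypothetical values chosen only for illustration; they do not come from the slides.

```python
import math

# Hypothetical yearly returns (illustrative only, not from the slides).
returns = [0.10, -0.05, 0.20]
factors = [1 + r for r in returns]          # growth factors 1.10, 0.95, 1.20

n = len(factors)
geo_mean = math.prod(factors) ** (1 / n)    # nth root of the product of n values
avg_growth_rate = geo_mean - 1              # average compound rate per period

print(f"GM = {geo_mean:.4f}, average growth = {avg_growth_rate:.2%}")
```

Compounding at `avg_growth_rate` for n periods reproduces the same overall growth as the individual returns, which the arithmetic mean of the returns would not.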
Median
• Middle value in an ordered array of numbers.
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
• Unaffected by extremely large and extremely
small values.
Median: Computational Procedure
• First Procedure
– Arrange the observations in an ordered array.
– If there is an odd number of terms, the median is
the middle term of the ordered array.
– If there is an even number of terms, the median is
the average of the middle two terms.
• Second Procedure
– The median’s position in an ordered array is given
by (n+1)/2.
Measures of Central Tendency:
The Median DCOVA

• In an ordered array, the median is the “middle” number (50% above, 50% below)

[Figure: two dot plots on a scale from 11 to 20; Median = 13 in both]

• Less sensitive than the mean to extreme values


Measures of Central Tendency:
Locating the Median
DCOVA
• The location of the median when the values are in numerical order
(smallest to largest):

$$\text{Median position} = \frac{n + 1}{2} \text{ position in the ordered data}$$

• If the number of values is odd, the median is the middle number

• If the number of values is even, the median is the average of the two
middle numbers

Note that $\frac{n + 1}{2}$ is not the value of the median, only the position of
the median in the ranked data
Percentiles
• Measures of central tendency that divide a group
of data into 100 parts
• At least n% of the data lie below the nth
percentile, and at most (100 - n)% of the data lie
above the nth percentile

• Example: 90th percentile indicates that at least


90% of the data lie below it, and at most 10% of
the data lie above it
• The median and the 50th percentile have the same
value.
• Applicable for ordinal, interval, and ratio data
• Not applicable for nominal data
Percentiles: Computational Procedure
• Organize the data into an ascending ordered
array.
• Calculate the percentile location: $i = \frac{P}{100}(n)$
• Determine the percentile’s location and its value.
• If i is a whole number, the percentile is the
average of the values at the i and (i+1) positions.
• If i is not a whole number, the percentile is at
the whole-number part of the (i+1) position in the ordered array.
Percentiles: Example
• Raw Data: 14, 12, 19, 23, 5, 13, 28, 17
• Ordered Array: 5, 12, 13, 14, 17, 19, 23, 28
• Location of the 30th percentile: $i = \frac{30}{100}(8) = 2.4$
• The location index, i, is not a whole number; i+1 =
2.4+1=3.4; the whole number portion is 3; the
30th percentile is at the 3rd location of the array;
the 30th percentile is 13.
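
The computational procedure can be sketched as a small function. This is one possible reading of the rule on the slides, assuming an integer percentile p with 0 < p < 100; the function name `percentile` is an illustrative choice. Other software uses different interpolation rules and may return different values.

```python
def percentile(values, p):
    """p-th percentile by the location rule i = (p / 100) * n.

    Assumes p is an integer with 0 < p < 100.
    """
    ordered = sorted(values)
    n = len(ordered)
    whole, remainder = divmod(p * n, 100)    # i = whole + remainder / 100
    if remainder == 0:
        # i is a whole number: average the values at positions i and i + 1.
        return (ordered[whole - 1] + ordered[whole]) / 2
    # i is not whole: take the whole-number part of i + 1 as the position.
    # Position whole + 1 (1-based) is index `whole` in a 0-based list.
    return ordered[whole]

raw = [14, 12, 19, 23, 5, 13, 28, 17]
print(percentile(raw, 30))
```

For the raw data above, i = 2.4, so the function returns the value at the 3rd position of the ordered array, which is 13.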
Quartiles
• Measures of central tendency that divide a group of
data into four subgroups

• Q1: 25% of the data set is below the first quartile


• Q2: 50% of the data set is below the second quartile
• Q3: 75% of the data set is below the third quartile

• Q1 is equal to the 25th percentile


• Q2 is located at 50th percentile and equals the
median
• Q3 is equal to the 75th percentile
• Quartile values are not necessarily members of the
data set
Quartiles

[Figure: ordered data divided by Q1, Q2, and Q3 into four parts, each containing 25% of the values]


Quartiles: Example
• Ordered array: 106, 109, 114, 116, 121, 122,
125, 129
• Q1: $i = \frac{25}{100}(8) = 2$, so $Q_1 = \frac{109 + 114}{2} = 111.5$
• Q2: $i = \frac{50}{100}(8) = 4$, so $Q_2 = \frac{116 + 121}{2} = 118.5$
• Q3: $i = \frac{75}{100}(8) = 6$, so $Q_3 = \frac{122 + 125}{2} = 123.5$
Measures of Central Tendency:
The Mode
DCOVA
• Value that occurs most often
• Not affected by extreme values
• Used for either numerical or categorical
data
• There may be no mode
• There may be several modes

[Figure: two dot plots, one on a scale from 0 to 14 with Mode = 9 and one on a scale from 0 to 6 with No Mode]
Mode
• The most frequently occurring value in a data
set
• Applicable to all levels of data measurement
(nominal, ordinal, interval, and ratio)

• Bimodal -- Data sets that have two modes


• Multimodal -- Data sets that contain more
than two modes
Mode -- Example
• The mode is 44.
• There are more 44s than any other value.

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48
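
A short sketch that counts the frequencies for this data set. It uses `collections.Counter` and `statistics.multimode` from the Python standard library; `multimode` returns every value tied for the highest count, so a bimodal or multimodal data set would return more than one value.

```python
from collections import Counter
import statistics

data = [35, 41, 44, 45, 37, 41, 44, 46, 37, 43, 44, 46,
        39, 43, 44, 46, 40, 43, 44, 46, 40, 43, 45, 48]

counts = Counter(data)                        # value -> number of occurrences
value, frequency = counts.most_common(1)[0]   # most frequent value and its count
modes = statistics.multimode(data)            # all values tied for the top count

print(value, frequency, modes)
```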
Measures of Central Tendency:
Review Example
DCOVA
House Prices:
$2,000,000
$  500,000
$  300,000
$  100,000
$  100,000
Sum: $3,000,000

• Mean: $3,000,000 / 5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
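
The three measures for the house prices can be reproduced with Python's standard `statistics` module. This is a minimal sketch, and the variable names are illustrative.

```python
import statistics

prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean = statistics.mean(prices)       # sum of values / number of values
median = statistics.median(prices)   # middle value of the ranked data
mode = statistics.mode(prices)       # most frequent value

print(mean, median, mode)
```

The single $2,000,000 house pulls the mean up to twice the median, which is why the median is often preferred for price data with outliers.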
Measures of Central Tendency:
Which Measure to Choose?
DCOVA
• The mean is generally used, unless extreme
values (outliers) exist.
• The median is often used when outliers are present,
since it is not sensitive to extreme values. For example,
median home prices may be reported for a region.
• In some situations it makes sense to report
both the mean and the median.
Measures of Central Tendency:
Summary
DCOVA
Central Tendency
• Arithmetic Mean: $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
• Median: middle value in the ordered array
• Mode: most frequently observed value
Empirical formula

MODE = 3 MEDIAN – 2 MEAN


Problem
• The costs of consumer purchases such as single-family
housing, gasoline, internet services, tax preparation,
and hospitalization were provided in The Wall Street
Journal. Sample data typical of the cost of tax return
preparation by services such as H&R Block are shown
below
120 230 110 115 160 130 150
105 195 155 105 360 120 120
140 100 115 180 235 255
- Compute the mean, median and mode
- Compute the first and third quartiles
- Compute and interpret the 90th percentile
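
One way to work the problem is sketched below. It assumes the percentile location rule i = (P/100)(n) described earlier in the slides; the helper name `percentile` is an illustrative choice, and other quartile conventions would give slightly different values.

```python
import statistics

costs = [120, 230, 110, 115, 160, 130, 150,
         105, 195, 155, 105, 360, 120, 120,
         140, 100, 115, 180, 235, 255]

def percentile(values, p):
    """p-th percentile by the location rule i = (p / 100) * n (integer p, 0 < p < 100)."""
    ordered = sorted(values)
    whole, remainder = divmod(p * len(ordered), 100)
    if remainder == 0:
        # Whole-number location: average positions i and i + 1 (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    return ordered[whole]                    # position i + 1, whole-number part

mean = statistics.mean(costs)
median = statistics.median(costs)
mode = statistics.mode(costs)
q1, q3 = percentile(costs, 25), percentile(costs, 75)
p90 = percentile(costs, 90)

print(mean, median, mode, q1, q3, p90)
```

Under this rule the 90th percentile is $245: at least 90% of the sampled preparation costs are at or below $245, and at most 10% are above it.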
Measures of Variability
• It is often desirable to consider measures of variability
(dispersion), as well as measures of location.
• For example, in choosing supplier A or supplier B we
might consider not only the average delivery time for
each, but also the variability in delivery time for each.
Variability
[Figure: two cash-flow series plotted around the same mean, one labeled No Variability in Cash Flow and one labeled Variability in Cash Flow]
Measures of Variability:
Ungrouped Data
• Measures of variability describe the spread or the
dispersion of a set of data.
• Common Measures of Variability
– Range
– Interquartile Range
– Mean Absolute Deviation
– Variance
– Standard Deviation
– Z scores
– Coefficient of Variation
Measures of Variation
DCOVA
Variation: Range, Variance, Standard Deviation, Coefficient of Variation
• Measures of variation give information
on the spread or variability or
dispersion of the data values.

[Figure: two distributions with the same center but different variation]
Measures of Variation:
The Range
DCOVA
• Simplest measure of variation
• Difference between the largest and the smallest values:

$$\text{Range} = X_{\text{largest}} - X_{\text{smallest}}$$

Example:

[Figure: dot plot on a scale from 0 to 14]

Range = 13 - 1 = 12
Measures of Variation:
Why The Range Can Be Misleading
DCOVA
• Does not account for how the data are
distributed

[Figure: two dot plots on a scale from 7 to 12 with different distributions; Range = 12 - 7 = 5 in both]

• Sensitive to outliers

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4

1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119


Range
• The difference between the largest and the
smallest values in a set of data
• Simple to compute
• Ignores all data points except the two extremes
• Example: Range = Largest - Smallest = 48 - 35 = 13

35  41  44  45
37  41  44  46
37  43  44  46
39  43  44  46
40  43  44  46
40  43  45  48
Interquartile Range

• Range of values between the first and third quartiles
• Range of the “middle half”
• Less influenced by extremes

$$\text{Interquartile Range} = Q_3 - Q_1$$
Deviation from the Mean
• Data set: 5, 9, 16, 17, 18
• Mean: $\mu = \frac{\sum X}{N} = \frac{65}{5} = 13$
• Deviations from the mean: -8, -4, +3, +4, +5

[Figure: the five deviations drawn against a number line from 0 to 20]


Mean Absolute Deviation
• Average of the absolute deviations from the
mean

X        X − μ    |X − μ|
5        -8       8
9        -4       4
16       +3       3
17       +4       4
18       +5       5
Total     0       24

$$\text{M.A.D.} = \frac{\sum |X - \mu|}{N} = \frac{24}{5} = 4.8$$
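
The table can be reproduced in a few lines. This is a minimal sketch with illustrative variable names.

```python
data = [5, 9, 16, 17, 18]
N = len(data)
mu = sum(data) / N                         # 65 / 5 = 13.0

deviations = [x - mu for x in data]        # -8, -4, +3, +4, +5
abs_dev = [abs(d) for d in deviations]     # 8, 4, 3, 4, 5
mad = sum(abs_dev) / N                     # 24 / 5

print(sum(deviations), sum(abs_dev), mad)
```

Taking absolute values is what keeps the positive and negative deviations from cancelling to zero.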
Population Variance
• Average of the squared deviations from the
arithmetic mean

X        X − μ    (X − μ)²
5        -8       64
9        -4       16
16       +3       9
17       +4       16
18       +5       25
Total     0       130

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$
Population Standard Deviation
• Square root of the variance

X        X − μ    (X − μ)²
5        -8       64
9        -4       16
16       +3       9
17       +4       16
18       +5       25
Total     0       130

$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{130}{5} = 26.0$$

$$\sigma = \sqrt{\sigma^2} = \sqrt{26.0} \approx 5.1$$
Measures of Variation:
The Sample Variance
DCOVA
• Average (approximately) of squared
deviations of values from the mean
– Sample variance:

$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

where $\bar{X}$ = arithmetic mean
$n$ = sample size
$X_i$ = ith value of the variable X
Measures of Variation:
The Sample Standard Deviation
DCOVA
• Most commonly used measure of variation
• Shows variation about the mean
• Is the square root of the variance
• Has the same units as the original data
– Sample standard deviation:

$$S = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}}$$
Measures of Variation:
The Standard Deviation
DCOVA
Steps for Computing Standard Deviation

1. Compute the difference between each value
and the mean.
2. Square each difference.
3. Add the squared differences.
4. Divide this total by n-1 to get the sample
variance.
5. Take the square root of the sample variance to
get the sample standard deviation.
Measures of Variation:
Sample Standard Deviation:
Calculation Example
DCOVA
Sample Data ($X_i$): 10 12 14 15 17 18 18 24

n = 8, Mean = $\bar{X}$ = 16

$$S = \sqrt{\frac{(10 - \bar{X})^2 + (12 - \bar{X})^2 + (14 - \bar{X})^2 + \cdots + (24 - \bar{X})^2}{n - 1}}$$

$$= \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + (14 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}}$$

$$= \sqrt{\frac{130}{7}} = 4.3095$$

A measure of the “average” scatter around the mean
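
The five steps for computing the sample standard deviation can be followed one at a time in code. This is a minimal sketch on the eight sample values, with illustrative variable names.

```python
import math

sample = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(sample)
mean = sum(sample) / n                       # 128 / 8 = 16.0

diffs = [x - mean for x in sample]           # step 1: value minus mean
squares = [d ** 2 for d in diffs]            # step 2: square each difference
total = sum(squares)                         # step 3: add the squared differences
sample_variance = total / (n - 1)            # step 4: divide by n - 1
s = math.sqrt(sample_variance)               # step 5: take the square root

print(total, round(s, 4))
```

Dividing by n - 1 instead of n is what distinguishes the sample formulas from the population formulas.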
Measures of Variation:
Comparing Standard Deviations
DCOVA
[Figure: three dot plots on a common scale from 11 to 21]
Data A: Mean = 15.5, S = 3.338
Data B: Mean = 15.5, S = 0.926
Data C: Mean = 15.5, S = 4.567
Measures of Variation:
Comparing Standard Deviations
DCOVA

[Figure: two distributions, one labeled Smaller standard deviation and one labeled Larger standard deviation]


Uses of Standard Deviation
• Indicator of financial risk
• Quality Control
– construction of quality control charts
– process capability studies
• Comparing populations
– household incomes in two cities
– employee absenteeism at two plants
Measures of Variation:
Summary Characteristics
DCOVA
• The more the data are spread out, the greater the
range, variance, and standard deviation.
• The more the data are concentrated, the smaller
the range, variance, and standard deviation.
• If the values are all the same (no variation), all
these measures will be zero.
• None of these measures is ever negative.


Standard Deviation as an
Indicator of Financial Risk
Annualized Rate of Return

Financial Security   μ      σ
A                    15%    3%
B                    15%    7%
Measures of Variation:
The Coefficient of Variation
DCOVA
• Measures relative variation
• Always in percentage (%)
• Shows variation relative to mean
• Can be used to compare the variability of two or
more sets of data measured in different units

$$CV = \left(\frac{S}{\bar{X}}\right) \cdot 100\%$$
Measures of Variation:
Comparing Coefficients of Variation
DCOVA
• Stock A:
– Average price last year = $50
– Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock B:
– Average price last year = $100
– Standard deviation = $5

$$CV_B = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$100} \cdot 100\% = 5\%$$

Both stocks have the same standard deviation, but
stock B is less variable relative to its price
Measures of Variation:
Comparing Coefficients of Variation (con’t)

DCOVA
• Stock A:
– Average price last year = $50
– Standard deviation = $5

$$CV_A = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$5}{\$50} \cdot 100\% = 10\%$$

• Stock C:
– Average price last year = $8
– Standard deviation = $2

$$CV_C = \left(\frac{S}{\bar{X}}\right) \cdot 100\% = \frac{\$2}{\$8} \cdot 100\% = 25\%$$

Stock C has a much smaller standard deviation but a
much higher coefficient of variation
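
The comparison of stocks A, B, and C can be sketched as follows. The function name and the dictionary layout are illustrative assumptions.

```python
def coefficient_of_variation(std_dev, mean):
    """CV as a percentage: (S / X-bar) * 100."""
    return std_dev / mean * 100

# (average price, standard deviation) for the three stocks on the slides
stocks = {"A": (50, 5), "B": (100, 5), "C": (8, 2)}

cv = {name: coefficient_of_variation(s, m) for name, (m, s) in stocks.items()}

for name, value in cv.items():
    print(f"Stock {name}: CV = {value:.0f}%")
```

Ranking by CV (C, then A, then B) differs from ranking by standard deviation, which is the point of using a relative measure.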
Coefficient of Variation

$$\mu_1 = 29, \quad \sigma_1 = 4.6 \qquad\qquad \mu_2 = 84, \quad \sigma_2 = 10$$

$$C.V._1 = \frac{\sigma_1}{\mu_1}(100) = \frac{4.6}{29}(100) = 15.86$$

$$C.V._2 = \frac{\sigma_2}{\mu_2}(100) = \frac{10}{84}(100) = 11.90$$
Measures of shape

Skewness
Shape of a Distribution
DCOVA

• Describes how data are distributed


• Two useful shape related statistics are:
– Skewness
• Measures the extent to which data values are not
symmetrical
– Kurtosis
• Measures the peakedness of the curve of the
distribution, that is, how sharply the curve rises
approaching the center of the distribution
Shape of a Distribution (Skewness)
DCOVA
• Measures the extent to which data is not
symmetrical

                     Left-Skewed      Symmetric        Right-Skewed
Mean vs. Median      Mean < Median    Mean = Median    Median < Mean
Skewness Statistic   < 0              0                > 0
Skewness

[Figure: three curves labeled Negatively Skewed, Symmetric (Not Skewed), and Positively Skewed]
Skewness

[Figure: positions of the mean, median, and mode on each of the three curves]
• Negatively Skewed: Mean < Median < Mode
• Symmetric (Not Skewed): Mean = Median = Mode
• Positively Skewed: Mode < Median < Mean
Coefficient of Skewness
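• Summary measure for skewness

$$S = \frac{3(\mu - M_d)}{\sigma}$$

where $\mu$ is the mean, $M_d$ is the median, and $\sigma$ is the standard deviation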

• If S < 0, the distribution is negatively skewed
(skewed to the left).
• If S = 0, the distribution is symmetric (not skewed).
• If S > 0, the distribution is positively skewed
(skewed to the right).
• S > 1 or S < -1: high degree of skewness
• S between 0.5 and 1, or between -1 and -0.5: moderate skewness
• S between -0.5 and 0.5: relative symmetry
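
A sketch of the coefficient of skewness, S = 3(μ − Md)/σ, applied to the five values 5, 9, 16, 17, 18 used on the deviation slides. Choosing this data set and using the population standard deviation (`statistics.pstdev`) are assumptions made for illustration.

```python
import statistics

def pearson_skewness(values):
    """Coefficient of skewness S = 3 * (mean - median) / standard deviation."""
    mu = statistics.mean(values)
    md = statistics.median(values)
    sigma = statistics.pstdev(values)    # population standard deviation
    return 3 * (mu - md) / sigma

data = [5, 9, 16, 17, 18]
s = pearson_skewness(data)

print(round(s, 2))
```

Here the mean (13) lies below the median (16), so S is negative, and its magnitude is greater than 1, which the guidelines above classify as a high degree of negative skewness.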
Distribution Shape: Skewness (FOR PRACTICE)
• Example: Apartment Rents

Seventy efficiency apartments were randomly sampled
in a college town. The monthly rent prices for the
apartments are listed below in ascending order.

425 430 430 435 435 435 435 435 440 440
440 440 440 445 445 445 445 445 450 450
450 450 450 450 450 460 460 460 465 465
465 470 470 472 475 475 475 480 480 480
480 485 490 490 490 500 500 500 500 510
510 515 525 525 525 535 549 550 570 570
575 575 580 590 600 600 600 600 615 615
Distribution Shape: Skewness
• Example: Apartment Rents

[Figure: relative frequency histogram of the apartment rents; vertical axis Relative Frequency from 0 to .35; Skewness = .92]
Kurtosis
• Peakedness of a distribution
– Leptokurtic: high and thin
– Mesokurtic: normal in shape
– Platykurtic: flat and spread out

[Figure: three curves labeled Leptokurtic, Mesokurtic, and Platykurtic]
Shape of a Distribution -- Kurtosis measures how sharply the
curve rises approaching the center of the distribution

DCOVA
• Sharper peak than bell-shaped: Kurtosis > 0
• Bell-shaped: Kurtosis = 0
• Flatter than bell-shaped: Kurtosis < 0
