Data Management
SOLUTION:
$\bar{x} = \frac{\sum x}{n}$
$\bar{x} = \frac{110 + 76 + 29 + 38 + 105 + 31 + 35 + 123 + 76 + 89}{10}$
$\bar{x} = \frac{712}{10}$
$\bar{x} = 71.2$
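The arithmetic in this solution can be checked with a short Python sketch. The variable names are illustrative only; the ten values are the ones summed above.

```python
# The ten data values from the worked solution.
data = [110, 76, 29, 38, 105, 31, 35, 123, 76, 89]

total = sum(data)   # sum of all values (the numerator)
n = len(data)       # number of values (the denominator)
mean = total / n    # x-bar = (sum of x) / n

print(total, n, mean)
```

The printed total and count should match the 712 and 10 that appear in the fraction above.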
MEDIAN
The halfway point in a data set. Before you can
find this point, the data must be arranged in
order. Ordered data is called a data array.
The median is either a specific value in the data set
(when the number of values is odd) or falls between
two values (when the number of values is even).
Denoted by $\tilde{x}$.
Examples:
$N$ is odd: $x = 2, 2, 3, 4, 4, 5, 6$
median $= 4$
$N$ is even: $x = 1, 1, 3, 4, 5, 6, 6, 8$
median $= \frac{4 + 5}{2} = \frac{9}{2} = 4.5$
EXAMPLE:
The numbers of rooms in the seven hotels in downtown Pittsburgh are 713, 300, 618, 595,
311, 401, and 292. Find the median.
SOLUTION:
Arrange the data in order: 292, 300, 311, 401, 595, 618, 713
The middle (4th) value is the median: $\tilde{x} = 401$
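The two cases (odd and even number of values) can be sketched in Python as follows. The function name `median` is chosen for illustration; Python's standard library offers `statistics.median` for the same purpose.

```python
def median(values):
    """Return the middle value of the ordered data (the data array)."""
    ordered = sorted(values)   # the data must be arranged in order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the median is the single middle value.
        return ordered[mid]
    # Even count: the median falls between the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

rooms = [713, 300, 618, 595, 311, 401, 292]
print(median(rooms))                      # hotel example
print(median([1, 1, 3, 4, 5, 6, 6, 8]))   # even-count example
```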
MODE
The value that occurs most often in a data set.
Unimodal means that the data set has exactly one
mode. Bimodal means that the data set has exactly
two modes. Multimodal means that the data set
has more than two modes. Denoted by $\hat{x}$.
EXAMPLE:
Find the mode of the signing bonuses of eight NFL players for a specific year. The
bonuses in millions of dollars are: 18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10.
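SOLUTION:
Arrange the data in order: 10, 10, 10, 11.3, 12.4, 14.0, 18.0, 34.5
The value 10 occurs three times, more often than any other value, so $\hat{x} = 10$.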
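A minimal Python sketch of the counting step, assuming a hypothetical helper named `modes` that returns every value tied for the highest frequency. Note that when no value repeats, this helper returns all values, whereas such a data set is usually said to have no mode.

```python
from collections import Counter

def modes(values):
    """Return all values that occur most often, in ascending order."""
    counts = Counter(values)       # value -> number of occurrences
    top = max(counts.values())     # the highest frequency
    return sorted(v for v, c in counts.items() if c == top)

bonuses = [18.0, 14.0, 34.5, 10, 11.3, 10, 12.4, 10]
result = modes(bonuses)
print(result)   # one entry means unimodal, two bimodal, more multimodal
```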
WEIGHTED MEAN
It is often used when some data values are more
important than others. Like the mean, it is
denoted by $\bar{x}$ for a sample and $\mu$ for a
population.
Formula: $\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
Where: $\bar{x}$ is the weighted mean
$x$ is the data entry
$w$ is the corresponding weight of the entry
WILL KARLO GET AN ALLOWANCE INCREASE FOR NEXT SEMESTER?
$\bar{x} = \frac{\sum (x \cdot w)}{\sum w}$
$\bar{x} = \frac{1.5 \cdot 5.0 + 1.0 \cdot 3.0 + 1.0 \cdot 3.0 + 2.0 \cdot 3.0 + 1.5 \cdot 3.0 + 1.25 \cdot 5.0}{5.0 + 3.0 + 3.0 + 3.0 + 3.0 + 5.0}$
$\bar{x} = \frac{30.25}{22}$
$\bar{x} = 1.375$
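The same computation as a Python sketch. The pairing of each data entry with its weight is read off the numerator above; the function name `weighted_mean` is illustrative.

```python
def weighted_mean(entries, weights):
    """Sum of (entry * weight) divided by the sum of the weights."""
    if len(entries) != len(weights):
        raise ValueError("each entry needs exactly one weight")
    numerator = sum(x * w for x, w in zip(entries, weights))
    return numerator / sum(weights)

# Data entries (x) and their weights (w), as paired in the solution.
x = [1.5, 1.0, 1.0, 2.0, 1.5, 1.25]
w = [5.0, 3.0, 3.0, 3.0, 3.0, 5.0]

result = weighted_mean(x, w)
print(result)
```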
VARIANCE
The average of the squared distances of
each value from the mean.
The variance of a given set of data is
the square of the standard deviation of
the data.
STANDARD DEVIATION
A measure of spread based on the amount
by which each individual data value deviates
from the mean. The standard deviation
is the square root of the variance.
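The following chart shows the formulas for population
variance, population standard deviation, sample variance,
and sample standard deviation. Here $N$ is the number of data values.
Population Variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$
Population Standard Deviation: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample Standard Deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$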
Calculate the variance and standard deviation: There are 45 students in a class. 5
students were randomly selected from this class and their heights (in cm) were
recorded as follows: 131, 148, 139, 142, 152.
Solution:
Sample mean $= \frac{131 + 148 + 139 + 142 + 152}{5} = \frac{712}{5} = 142.4$
$s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$
$s^2 = \frac{(131 - 142.4)^2 + (148 - 142.4)^2 + (139 - 142.4)^2 + (142 - 142.4)^2 + (152 - 142.4)^2}{5 - 1}$
$s^2 = \frac{265.2}{4}$
$s^2 = 66.3$
$s = \sqrt{66.3} \approx 8.14$
Find the variance and standard deviation for the following set of data representing
tree heights in feet: 3, 21, 98, 203, 17, 9
Solution:
Sample mean $= \frac{3 + 21 + 98 + 203 + 17 + 9}{6} = \frac{351}{6} = 58.5$
$s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$
$s^2 = \frac{(3 - 58.5)^2 + (21 - 58.5)^2 + (98 - 58.5)^2 + (203 - 58.5)^2 + (17 - 58.5)^2 + (9 - 58.5)^2}{6 - 1}$
$s^2 = \frac{31099.5}{5}$
$s^2 = 6219.9$
$s = \sqrt{6219.9} \approx 78.87$
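The variance formulas can be sketched in Python and checked against the tree-height example. The function names are illustrative; the standard library's `statistics.variance` and `statistics.pvariance` compute the same quantities.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

tree_heights = [3, 21, 98, 203, 17, 9]
s2 = sample_variance(tree_heights)
s = math.sqrt(s2)   # the standard deviation is the square root of the variance
print(round(s2, 2), round(s, 2))   # prints 6219.9 78.87
```

Dividing by N instead of N - 1 for the same six values gives a smaller result, which is why it matters whether the data are treated as a sample or as a whole population.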
TRY THIS!
What are the sample variance and
standard deviation of the set of scores 99,
100, 105, 107, 110, 112, 116?
TRY THIS!
What are the mean, median, mode,
range, sample variance, and standard deviation
of the set of scores 33, 33, 35, 36, 37, 38, 39,
40, 39, 31, 33?
ASSIGNMENT
Z-SCORE
A z-score measures the distance between an observation and the
mean, measured in units of standard deviation.
FORMULA:
Population: $z = \frac{x - \mu}{\sigma}$
Sample: $z = \frac{x - \bar{x}}{s}$
If the z-score is positive, the score is above the mean. If the z-score is
0, the score is the same as the mean. If the z-score is negative, the score is
below the mean.
EXAMPLE:
An IQ test has a mean of 105 and a standard deviation of 20. Find the
corresponding z-score for each IQ.
a. 88 b. 122 c. 110
a. $z = \frac{x - \bar{x}}{s} = \frac{88 - 105}{20} = -0.85$
b. $z = \frac{x - \bar{x}}{s} = \frac{122 - 105}{20} = 0.85$
c. $z = \frac{x - \bar{x}}{s} = \frac{110 - 105}{20} = 0.25$
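The three z-scores can be reproduced with a small Python sketch. The function name `z_score` and its parameter names are chosen for illustration.

```python
def z_score(x, mean, sd):
    """Distance of x from the mean, in units of standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# IQ test with mean 105 and standard deviation 20.
for iq in (88, 122, 110):
    z = z_score(iq, mean=105, sd=20)
    position = "above" if z > 0 else "below" if z < 0 else "at"
    print(iq, z, position, "the mean")
```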
EXAMPLE:
Raul has taken two tests in his chemistry class. He scored
72 on the first test, for which the mean of all scores was
65 and the standard deviation was 8. He received a 60 on
a second test, for which the mean of all scores was 45
and the standard deviation was 12. In comparison to the
other students, did Raul do better on the first test or the
second test?
SOLUTION:
Find the z-score for each test.
1st test: $z = \frac{72 - 65}{8} = 0.875$
2nd test: $z = \frac{60 - 45}{12} = 1.25$
Raul scored 0.875 standard deviations above the mean on the 1st test and
1.25 standard deviations above the mean on the 2nd test.
These z-scores indicate that, in comparison to his classmates,
Raul scored better on the second test than he did on the first.
EXAMPLE:
A consumer group tested a sample of 100 light
bulbs. It found that the mean life expectancy of the
bulbs was 842 h, with a standard deviation of 90 h.
One particular light bulb from the DuraBright
Company has a z-score of 1.2. What was the life
span of this light bulb?
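One way to approach this example is to rearrange the sample z-score formula for the data value: since z = (x - mean) / s, it follows that x = mean + z * s. The Python sketch below applies that rearrangement. The function name `value_from_z` is hypothetical.

```python
def value_from_z(z, mean, sd):
    """Recover the data value x from its z-score: x = mean + z * sd."""
    return mean + z * sd

# Mean life 842 h, standard deviation 90 h, z-score 1.2.
life_span = value_from_z(z=1.2, mean=842, sd=90)
print(round(life_span, 1))
```

Under this rearrangement the bulb lasted about 950 h, that is, 1.2 standard deviations (108 h) above the mean.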
TRY THIS!
Roland received a score of 70 on a test for
which the mean score was 65.5. Roland has
learned that the z-score for his test is 0.6.
What is the standard deviation for this set of
test scores?
PERCENTILES
Percentiles are values that separate a set of data
into 100 equal parts. They are denoted $P_1, P_2, P_3,
P_4, \ldots, P_{99}$. For ordered data with $n$ values:
$P_k = \frac{k}{100}(n + 1)$th value
DECILE
- one of the 9 values of a variable that divide the
distribution into 10 equal parts.
$D_k = \frac{k}{10}(n + 1)$th value
QUARTILE AND INTERQUARTILE RANGE
- A quartile is one of the 3 values of a variable that divide
the distribution into 4 equal parts.
$Q_k = \frac{k}{4}(n + 1)$th value
- The interquartile range is the difference between the 3rd quartile
and the 1st quartile.
- interquartile range $= Q_3 - Q_1$
EXAMPLE:
Find the 30th percentile of the following set of data.
45, 38, 35, 29, 54, 54, 43, 42, 38
Solution:
Arrange the data in order: 29, 35, 38, 38, 42, 43, 45, 54, 54, so $k = 30$, $n = 9$
$P_{30} = \frac{30}{100}(9 + 1)$th
$P_{30} = \frac{3}{10}(10)$th
$P_{30} =$ 3rd value
$P_{30} = 38$
EXAMPLE:
Find the 6th and the 9th decile for the data
below: 23, 24, 27, 30, 32, 32, 32, 33, 36, 36, 42, 45, 51, 54, 55, 55, 56,
57, 59, 62, 63, 72, 80
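Solution:
For $D_6$: $k = 6$, $n = 23$
$D_6 = \frac{6}{10}(23 + 1)$th
$D_6 = \frac{3}{5}(24)$th
$D_6 = 14.4$th
The 14.4th position falls between the 14th value (54) and the 15th value (55), so take their average:
$D_6 = \frac{54 + 55}{2} = 54.5$
For $D_9$: $k = 9$, $n = 23$
$D_9 = \frac{9}{10}(23 + 1)$th
$D_9 = 0.9(24)$th
$D_9 = 21.6$th
The 21.6th position falls between the 21st value (63) and the 22nd value (72), so take their average:
$D_9 = \frac{63 + 72}{2} = 67.5$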
EXAMPLE:
Find the 1st quartile (𝑄1 ), 2nd quartile (𝑄2 ) and 3rd quartile (𝑄3 ) of
the following set of data.
2, 4, 4, 5, 6, 7, 8
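Solution:
For $Q_1$: $k = 1$, $n = 7$
$Q_1 = \frac{1}{4}(7 + 1)$th $= \frac{1}{4}(8)$th $=$ 2nd value, so $Q_1 = 4$
For $Q_2$: $k = 2$, $n = 7$
$Q_2 = \frac{2}{4}(7 + 1)$th $= \frac{2}{4}(8)$th $=$ 4th value, so $Q_2 = 5$
For $Q_3$: $k = 3$, $n = 7$
$Q_3 = \frac{3}{4}(7 + 1)$th $= \frac{3}{4}(8)$th $=$ 6th value, so $Q_3 = 7$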
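The percentile, decile, and quartile examples all use the same position rule, (k / parts)(n + 1), with parts equal to 100, 10, or 4. The Python sketch below is one possible way to code it, under the assumption taken from the decile example that a fractional position is resolved by averaging the two neighbouring values. The function name `quantile` and its parameters are hypothetical.

```python
def quantile(values, k, parts):
    """k-th of the values that divide the ordered data into `parts` parts."""
    ordered = sorted(values)
    n = len(ordered)
    # Position (k / parts) * (n + 1), split into a whole part and a
    # remainder so that no floating-point rounding is involved.
    whole, remainder = divmod(k * (n + 1), parts)
    if whole < 1 or whole > n or (remainder and whole == n):
        raise ValueError("position falls outside the data")
    if remainder == 0:
        return ordered[whole - 1]   # exact position (1-based)
    # Fractional position: average the two neighbouring values.
    return (ordered[whole - 1] + ordered[whole]) / 2

percentile_data = [45, 38, 35, 29, 54, 54, 43, 42, 38]
decile_data = [23, 24, 27, 30, 32, 32, 32, 33, 36, 36, 42, 45, 51, 54,
               55, 55, 56, 57, 59, 62, 63, 72, 80]
quartile_data = [2, 4, 4, 5, 6, 7, 8]

print(quantile(percentile_data, 30, 100))                        # P30
print(quantile(decile_data, 6, 10), quantile(decile_data, 9, 10))  # D6, D9
q1, q3 = quantile(quartile_data, 1, 4), quantile(quartile_data, 3, 4)
print(q1, q3, q3 - q1)   # Q1, Q3, and the interquartile range
```

Other textbooks and software packages interpolate between neighbouring values instead of averaging them, so results from library functions may differ slightly for fractional positions.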
GROUP REPORT: FIND THE MEAN, MEDIAN, MODE, RANGE,
VARIANCE, AND THE STANDARD DEVIATION FOR THE GIVEN
SAMPLES.
1. 1, 2, 7, 5, 7, 19, 8, 22
2. 16, 12, 15, 12, 11, 7, 4, 4, 3
3. 2.1, 3.0, 1.9, 1.5, 4.8
4. 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4
5. 93, 67, 49, 55, 92, 87, 77, 66, 73, 96, 55, 54
6. −23, −17, −19, −5, −11, −11, −11, −4, −31
7. 8, 6, 8, 8, 6, 8, 6, 8, 6, 8, 6, 8, 6, 8, 6, 8, 6