Data Management
A Review
Libeeth B. Guevarra
Introduction
Measures of Dispersion
Frequency Distribution
2 Experimental method
4 Registration method
5 Survey method
Sampling Technique
2 Systematic Sampling
3 Stratified Sampling
4 Cluster Sampling
5 Multi-stage Sampling
Non-probability sampling
1 Haphazard or Accidental Sampling
2 Purposive Sampling
3 Quota Sampling
4 Convenience Sampling
Organizing Data
1 Textual Method
2 Tabular Method
Parts of a Statistical Table
1 Table Heading includes the table number and
the title of the table.
2 Body is the main part of the table; it contains
the information or figures.
3 Stubs or Classes are the classifications or
categories describing the data, usually found
at the leftmost side of the table.
4 Caption is a designation or identification of the
information contained in a column, usually found
at the top of the column.
3 Graphical Method
Categorical Distribution
A B B AB O
B AB B B B
O A O O O
AB AB A O B
O O O B A
Table 1: Blood type of the 25 inductees
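The raw observations in Table 1 can be condensed into a categorical distribution by counting how often each blood type appears. The following is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter

# The 25 observations from Table 1, read row by row.
blood_types = [
    "A", "B", "B", "AB", "O",
    "B", "AB", "B", "B", "B",
    "O", "A", "O", "O", "O",
    "AB", "AB", "A", "O", "B",
    "O", "O", "O", "B", "A",
]

counts = Counter(blood_types)   # frequency of each category
n = len(blood_types)            # total number of inductees

# Print a small frequency table with relative frequencies.
for blood_type in ("A", "B", "AB", "O"):
    f = counts[blood_type]
    print(f"{blood_type:>2}  {f:>2}  {100 * f / n:.0f}%")
```

The frequencies should add up to the 25 inductees, which is a quick check that no observation was miscounted.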
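$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad (1)$$
$$\bar{x} = \frac{\sum_{i=1}^{n} w_i \cdot x_i}{\sum_{i=1}^{n} w_i} \qquad (2)$$
$$\bar{x} = \frac{\sum_{i=1}^{n} f_i \cdot x_i}{n} \qquad (3)$$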
Example
The heights (in meters) of the sampled mountains in the
Philippines are given in the table below.
What is the mean height of these mountains?
(http://www.pinoymountaineer.com)
Example
Out of 100 numbers, 20 were 5’s, 40 were 4’s, 35 were 7’s,
and 5 were 3’s. What is the mean of the data set?
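The example above can be worked with the frequency form of the mean, equation (3): multiply each distinct value by its frequency, add the products, and divide by the total number of observations. A minimal sketch follows; the dictionary layout is an illustrative choice, not something prescribed by the slides.

```python
# Distinct values mapped to their frequencies, taken from the example.
values_and_frequencies = {5: 20, 4: 40, 7: 35, 3: 5}

# Total number of observations is the sum of the frequencies.
n = sum(values_and_frequencies.values())

# Sum of f_i * x_i over all distinct values.
total = sum(x * f for x, f in values_and_frequencies.items())

mean = total / n
print(n, total, mean)
```

Here the frequencies sum to 100 and the weighted total is 520, so the mean is 5.2.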
The median of a data set is the middle
observation when the data are arranged in
either increasing or decreasing order.
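For an odd number of observations n:
$$\tilde{x} = x_{\frac{n+1}{2}} \qquad (4)$$
For an even number of observations n:
$$\tilde{x} = \frac{x_{\frac{n}{2}} + x_{\frac{n+2}{2}}}{2} \qquad (5)$$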
Example
Find the median of: 9, 3, 44, 17, 15
Example
Find the median of: 8, 3, 44, 17, 12, 6
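Both median examples can be checked with a short sketch that sorts the data and then picks the middle position (odd n) or averages the two middle positions (even n). The function name `median_by_position` is hypothetical; the result is compared against `statistics.median` from the Python standard library.

```python
import statistics


def median_by_position(data):
    """Median via the positional rule: sort, then take the middle value(s)."""
    xs = sorted(data)
    n = len(xs)
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation (1-based position).
        return xs[(n + 1) // 2 - 1]
    # Even n: average of the (n / 2)-th and ((n + 2) / 2)-th observations.
    return (xs[n // 2 - 1] + xs[(n + 2) // 2 - 1]) / 2


odd_example = [9, 3, 44, 17, 15]
even_example = [8, 3, 44, 17, 12, 6]

print(median_by_position(odd_example), statistics.median(odd_example))
print(median_by_position(even_example), statistics.median(even_example))
```

Sorting first is essential: applying the positional rule to the unsorted list would return 44 for the first example instead of 15.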
The mode of a data set is the value that occurs
most frequently. It is a more helpful measure
for discrete and qualitative types of data, and the
only measure of central location that is helpful for
qualitative data. The mode does not always exist,
and when it does, it may not be unique. The mode
is not very useful for continuous data, since
measurements recorded to many significant
digits mostly occur only once.
Example
Find the Mode of the following set of data:
A : 9, 3, 4, 17, 15, 3
B : 9, 3, 4, 17, 15, 3, 9
C : A+, AB, A, O, B, B+, A
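The three data sets above illustrate a unique mode, two modes, and a mode for qualitative data. A minimal sketch for finding them is shown below. The helper `modes` is hypothetical, and it assumes the convention that no mode exists when every value occurs exactly once.

```python
from collections import Counter


def modes(data):
    """Return all values with the highest frequency (empty if none repeats)."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # Assumed convention: no value repeats, so the mode does not exist.
        return []
    return [value for value, f in counts.items() if f == highest]


set_a = [9, 3, 4, 17, 15, 3]
set_b = [9, 3, 4, 17, 15, 3, 9]
set_c = ["A+", "AB", "A", "O", "B", "B+", "A"]

print(modes(set_a))
print(modes(set_b))
print(modes(set_c))
```

Set A has the single mode 3, set B is bimodal with 3 and 9, and set C has the mode A. Because the function only counts occurrences, it works equally well for numeric and categorical values.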
Give what is being asked
1 The grades of a student on seven examinations were
85, 96, 72, 89, 95, 82, and 85. Find the student’s mean
grade.
2 Find the median of the set of numbers: 15, 18, 50, 12,
16, and 20.
3 The numbers of incorrect answers on a true-false test
for 15 students were recorded as follows: 2, 1, 3, 0, 1,
3, 6, 0, 3, 3, 5, 2, 1, 5, 3. Find the median and mode.
4 Marcelo B. Fernan’s bridge is designed to carry a
maximum load of 150,000 tons. Is the bridge
overloaded if it carries 18 vehicles having a mean
weight of 5,000 tons?
5 The average IQ of 10 students in Stat 012 is 115. If
2 students have an IQ of 101, 3 have an IQ of 125, 1
has an IQ of 130, and 3 have an IQ of 98, what must
be the IQ of the remaining student?
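Items 4 and 5 both rest on the same rearrangement of the mean formula: the total of the observations equals the number of observations times the mean. The sketch below applies that idea to item 5 as one possible approach; the variable names are illustrative.

```python
# Given in item 5: ten students with a class mean IQ of 115.
n_students = 10
class_mean = 115

# Known IQ values mapped to how many students have them.
known_iqs = {101: 2, 125: 3, 130: 1, 98: 3}

# total = n * mean, rearranged from mean = total / n.
required_total = n_students * class_mean

known_total = sum(iq * count for iq, count in known_iqs.items())
n_known = sum(known_iqs.values())

# Whatever is left of the required total belongs to the remaining student.
missing_iq = required_total - known_total
print(n_known, required_total, known_total, missing_iq)
```

The same relation answers item 4: multiply the number of vehicles by their mean weight and compare the product with the design load.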
Common Measures
1 Quartiles Qm divide the set of data into 4
equal parts.
2 Deciles Dm divide the set of data into 10
equal parts.
3 Percentiles Pm divide the set of data into
100 equal parts.
Example
The mean time to download a PDF file is 12 min, with
a standard deviation of 4 min. Belle's download
time is 20 min. John's download time is 6 min.
How does Belle's download time compare with
John's?
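One common way to compare two observations from the same distribution is the standard score (z-score), which expresses each value as a number of standard deviations from the mean. The sketch below assumes that this is the intended tool for the example; the function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


mean_time = 12   # minutes, from the example
sd_time = 4      # minutes, from the example

belle = z_score(20, mean_time, sd_time)
john = z_score(6, mean_time, sd_time)

print(belle, john)
```

Belle's time is 2 standard deviations above the mean and John's is 1.5 standard deviations below it, so Belle's download was unusually slow while John's was faster than average.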
25 29 30 32 36 36 39 40 40 44
45 48 49 50 50 51 54 55 55 55
55 56 57 57 59 60 60 60 61 61
61 63 65 65 65 67 68 70 71 74
74 76 77 77 80 81 81 83 84 90
12 12 13 14 14
16 17 19 19 25
Chebyshev’s Inequality
$$SK = \frac{3(\mu - \text{median})}{\sigma} \qquad (10)$$
$$SK = \frac{3(\bar{x} - \text{median})}{s} \qquad (11)$$
Example
The scores of the students in the Prelim Exam have
a median of 18 and a mean of 16. What does this
indicate about the shape of the distribution of the
scores?
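Because the standard deviation is always positive, the sign of the skewness coefficient SK in equations (10) and (11) depends only on whether the mean lies above or below the median. The example does not give a standard deviation, so the sketch below reports only the direction of skewness; the function name and the wording of the labels are illustrative.

```python
def skew_direction(mean, median):
    """Direction of skewness implied by the sign of (mean - median)."""
    diff = mean - median
    if diff > 0:
        return "positively skewed (tail to the right)"
    if diff < 0:
        return "negatively skewed (tail to the left)"
    return "approximately symmetric"


# Values from the example: mean 16, median 18.
print(skew_direction(16, 18))
```

With a mean of 16 below a median of 18, SK is negative: a few low scores pull the mean down, so the distribution is skewed to the left.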
equal to 1 or 100%.
5 The normal curve area may be subdivided
y − ȳ = m(x − x̄)
where:
x̄ = mean of variable x
ȳ = mean of variable y
m = slope of the line
$$m = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n(\bar{x})^2}$$
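The slope formula and the point-slope form y − ȳ = m(x − x̄) can be sketched in a few lines of Python. The data below are made up purely for illustration and do not come from the slides; the function name `regression_line` is also hypothetical.

```python
def regression_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = m*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # m = (sum(xy) - n * x_bar * y_bar) / (sum(x^2) - n * x_bar^2)
    m = (sum_xy - n * x_bar * y_bar) / (sum_x2 - n * x_bar ** 2)
    # Rearranging y - y_bar = m * (x - x_bar) gives the intercept.
    b = y_bar - m * x_bar
    return m, b


# Illustrative data only (not from the original slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

m, b = regression_line(xs, ys)
print(f"y = {m:.2f}x + {b:.2f}")
```

For this illustrative data set, x̄ = 3 and ȳ = 4, the slope works out to 0.6, and the line passes through the point (x̄, ȳ), as the point-slope form requires.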
Example
Find the equation of the regression line for the
data