Lecture2-slides
2. Organizing Data.
Frequency distributions
Example 1: Twenty-five army inductees were given a blood test to determine their blood
type. The data set is
A B B AB O
O O B AB B
B B O A O
A O O O AB
AB A O B A
Construct a frequency distribution for the data.
Solution
Since the data are categorical, discrete classes can be used. There are four blood types: A, B, O, and AB. These types will be used as the
classes for the distribution.
The frequency distribution for the data is:
Class   Tally         Frequency   Relative frequency (percent)
A       /////         5           20
B       ///// //      7           28
O       ///// ////    9           36
AB      ////          4           16
Total                 25          100
For the sample, more people have type O blood than any other type.
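The tally above can be checked by machine. The following is a minimal Python sketch, not part of the original slides; the function name `frequency_distribution` and the variable names are illustrative assumptions.

```python
from collections import Counter

# Blood types of the 25 inductees, copied row by row from Example 1.
blood_types = (
    "A B B AB O "
    "O O B AB B "
    "B B O A O "
    "A O O O AB "
    "AB A O B A"
).split()


def frequency_distribution(values, classes):
    """Return {class: (frequency, percent)} for categorical data."""
    counts = Counter(values)
    n = len(values)
    return {c: (counts[c], 100 * counts[c] / n) for c in classes}


table = frequency_distribution(blood_types, ["A", "B", "O", "AB"])
for cls, (freq, pct) in table.items():
    print(f"{cls:<5} {freq:>3} {pct:>5.0f}")
print(f"Total {sum(f for f, _ in table.values()):>3}")
```

Passing the class list explicitly keeps the rows in the same order as the table (A, B, O, AB) rather than in order of first appearance.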
Example 2: Construct a frequency distribution table for the number of pets owned by each of
20 families. The data set is: 3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3.
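For discrete numeric data like this, each distinct value can serve as its own class. A minimal Python sketch of the tally follows; the names `pets` and `pet_table` are assumptions made for illustration.

```python
from collections import Counter

# Number of pets owned by each of the 20 families in Example 2.
pets = [3, 0, 1, 4, 4, 1, 2, 0, 2, 2, 0, 2, 0, 1, 3, 1, 2, 1, 1, 3]

counts = Counter(pets)
# One class per distinct value, listed in increasing order.
pet_table = [(value, counts[value]) for value in sorted(counts)]

print("Pets  Frequency")
for value, freq in pet_table:
    print(f"{value:<5} {freq}")
# The frequencies must add up to the number of families surveyed.
print("Total", sum(freq for _, freq in pet_table))
```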
Solution
n = Σf = 50
If the range divided by the number of classes gives an integer value (or, more generally, leaves
no remainder at the precision of the data), add one unit to the class width.
e.g. Let the largest value = 31.5 and the smallest value = 7.5, with the data recorded to one decimal place (1D),
⇒ range = 31.5 – 7.5 = 24
Say we need 5 classes,
The class width = 24 / 5 = 4.8 (already 1D, i.e. no remainder → add one unit, 0.1)
∴ The class width = 4.8 + 0.1 = 4.9
Class limits
7.5–12.3
12.4–17.2
17.3–22.1
22.2–27.0
27.1–31.9
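The class limits above can be reproduced with a short Python sketch. It uses `decimal.Decimal` so that 1D values such as 4.8 are handled exactly. The function name `class_limits` is hypothetical, and rounding the width up when the division does leave a remainder is the usual textbook convention, assumed here because the slides only state the no-remainder case.

```python
from decimal import Decimal, ROUND_CEILING


def class_limits(smallest, largest, num_classes, unit):
    """Return (width, [(lower, upper), ...]) for the requested number of classes.

    `unit` is the precision of the data, e.g. "0.1" for data recorded to 1D.
    """
    smallest, largest, unit = Decimal(smallest), Decimal(largest), Decimal(unit)
    raw_width = (largest - smallest) / num_classes
    # Assumption: round the width up to a whole number of units.
    steps = (raw_width / unit).to_integral_value(rounding=ROUND_CEILING)
    # Rule from the slides: with no remainder, add one unit to the width.
    if steps * unit == raw_width:
        steps += 1
    width = steps * unit

    limits = []
    lower = smallest
    for _ in range(num_classes):
        # The upper limit sits one unit below the next class's lower limit.
        limits.append((lower, lower + width - unit))
        lower += width
    return width, limits


width, limits = class_limits("7.5", "31.9", 5, "0.1") if False else class_limits("7.5", "31.5", 5, "0.1")
print("width =", width)
for lower, upper in limits:
    print(f"{lower}-{upper}")
```

Without the extra unit, the last class would end at 31.4 and the largest value, 31.5, would fall outside every class.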
5. The boundaries (if needed) are half-way between the upper limit of one class and the lower limit of the next class.
If a data point falls exactly on a class boundary, in which class should it be put?
This cannot happen, because the boundaries always carry one more decimal place than the data: if the data are integers, the class boundaries are 1D; if the data are 1D, the class boundaries are 2D; and so on.
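A small Python sketch can illustrate both points, using the class limits 7.5–12.3, 12.4–17.2, ... from the earlier example. The function name `class_boundaries` is hypothetical, and extending the first lower and last upper boundary by the same half-gap is a common convention assumed here, since the slides define only the boundaries between adjacent classes.

```python
from decimal import Decimal


def class_boundaries(limits):
    """Convert consecutive (lower, upper) class limits into class boundaries."""
    limits = [(Decimal(lo), Decimal(hi)) for lo, hi in limits]
    # Gap between one class's upper limit and the next class's lower limit.
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    return [(lo - half, hi + half) for lo, hi in limits]


limits = [("7.5", "12.3"), ("12.4", "17.2"), ("17.3", "22.1"),
          ("22.2", "27.0"), ("27.1", "31.9")]
boundaries = class_boundaries(limits)
for lower, upper in boundaries:
    print(f"{lower}-{upper}")

# Every boundary ends in a 5 in the second decimal place, so no value
# recorded to one decimal place can coincide with a boundary.
unit = Decimal("0.1")
never_on_boundary = all(b % unit != 0 for pair in boundaries for b in pair)
print(never_on_boundary)
```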
Stem & Leaf
Example 4: At an outpatient testing center, the number of cardiograms performed each day
for 20 days is shown. Construct a stem and leaf plot for the data.
25 31 20 32 13
14 43 02 57 23
36 32 33 32 44
32 52 44 51 45
Solution
1- Arrange the data in order:
02, 13, 14, 20, 23, 25, 31, 32, 32, 32, 32, 33, 36, 43, 44, 44, 45, 51, 52, 57
2- Separate the data according to the first digit, as shown.
02
13, 14
20, 23, 25
31, 32, 32, 32, 32, 33, 36
43, 44, 44, 45
51, 52, 57
3- A display can be made by using the leading digit as the stem and the trailing digit as the leaf. For example, for the value 32, the
leading digit, 3, is the stem and the trailing digit, 2, is the leaf. For the value 14, the 1 is the stem and the 4 is the leaf. Now a plot
can be constructed as shown:
Leading digit (stem) Trailing digit (leaf)
0 2
1 34
2 035
3 1222236
4 3445
5 127
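The three steps (sort, group by leading digit, list trailing digits) can be sketched in Python as follows. The function name `stem_and_leaf` is an assumption, and the sketch only handles non-negative integers with the tens digit as the stem.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: "leaves"} using the tens digit as stem, units digit as leaf."""
    rows = defaultdict(str)
    for v in sorted(values):            # step 1: arrange the data in order
        stem, leaf = divmod(v, 10)      # step 2: split off the leading digit
        rows[stem] += str(leaf)         # step 3: append the trailing digit
    # List every stem from smallest to largest; a stem with no data
    # gets an empty leaf row rather than a zero.
    return {s: rows.get(s, "") for s in range(min(rows), max(rows) + 1)}


cardiograms = [25, 31, 20, 32, 13,
               14, 43, 2, 57, 23,
               36, 32, 33, 32, 44,
               32, 52, 44, 51, 45]
plot = stem_and_leaf(cardiograms)
for stem, leaves in plot.items():
    print(stem, leaves)
```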
If there are no data values in a class, write the stem number and leave the leaf row blank. Do not put a zero in the leaf row.
If the arranged data are: 50, 51, 51, 52, 53, 53, 55, 55, 56, 57, 57, 58, 59,
62, 63, 65, 65, 66, 66, 67, 68, 69, 69,
72, 73, 75, 75, 77, 78, 79,
each stem can be split into two rows, the first holding leaves 0–4 and the second holding leaves 5–9.
Plot the data as shown:
Leading digit (stem) Trailing digit (leaf)
5 011233
5 5567789
6 23
6 55667899
7 23
7 55789
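The split-stem plot above can be reproduced with a variation on the basic procedure: each value is keyed by its stem and by whether its leaf is below 5. This is a minimal Python sketch with a hypothetical function name, `split_stem_and_leaf`; unlike a hand-drawn plot, it simply omits any half-row that has no data.

```python
def split_stem_and_leaf(values):
    """Return [(stem, "leaves"), ...] with leaves 0-4 and 5-9 on separate rows."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        key = (stem, 0 if leaf < 5 else 1)   # 0 = lower half, 1 = upper half
        rows[key] = rows.get(key, "") + str(leaf)
    return [(stem, leaves) for (stem, _), leaves in sorted(rows.items())]


data = [50, 51, 51, 52, 53, 53, 55, 55, 56, 57, 57, 58, 59,
        62, 63, 65, 65, 66, 66, 67, 68, 69, 69,
        72, 73, 75, 75, 77, 78, 79]
split_plot = split_stem_and_leaf(data)
for stem, leaves in split_plot:
    print(stem, leaves)
```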
2- Frequency Polygon
3- Ogive
Part I: Introduction to Statistical Methods.