Lesson 3. Frequency Distribution

The document provides an overview of frequency distributions, explaining their purpose in summarizing how often values occur in datasets. It defines key terms such as categories, raw data, class intervals, and class frequencies, and outlines the steps for constructing a frequency table. Additionally, it includes an example of creating a frequency distribution table based on a dataset of business firm ages.

Uploaded by

1earljoshuamolde
Copyright
© All Rights Reserved
Lesson 3. THE FREQUENCY DISTRIBUTION

A frequency distribution is a summary of how often each value or range of
values occurs in a dataset. It organizes data into classes or intervals and shows the
number of observations (frequency) that fall into each category. Frequency
distributions are commonly used in statistics to summarize large datasets and can
be represented in different forms, such as tables, histograms, or graphs.


Definition of Terms

1. Category: A classification or grouping that divides data into specific,
non-overlapping groups. Categories are used to distinguish different types or sets
of data based on certain characteristics.
2. Raw Data: The unorganized, original data collected directly from a source
before it has been processed or analyzed. Raw data has not undergone any
modifications, sorting, or analysis, and typically appears in the form it was
gathered, such as survey responses, experiment results, or measurements.
3. Array: An ordered set of data values, often arranged in a sequence or pattern.
In statistics, an array can refer to a list of data points sorted in either ascending
or descending order. Arrays are useful for organizing and analyzing data,
making it easier to identify trends, compute statistics, or find specific values.
4. Range: The difference between the largest and smallest values in a dataset. It
is a measure of variability that shows the spread of the data. The range is
calculated by subtracting the smallest value from the largest value. For
example, if the highest value in a dataset is 95 and the lowest is 50, the range
would be 95 - 50 = 45.
5. Class Frequency: The number of occurrences or data points that fall within a
particular class interval in a frequency distribution. For example, in a dataset
of students' test scores grouped into intervals like 50-59, 60-69, and 70-79,
the class frequency would be the number of students whose scores fall within
each range.
6. Class Interval: A range of values within which data points are grouped in a
frequency distribution. Class intervals are used when the data is continuous
and is divided into equal-sized segments to create a frequency distribution. For
example, if you're grouping test scores, you might use class intervals such as
50-59, 60-69, and 70-79.
7. Open Class Interval: A class interval where the lower or upper boundary is
not specified, often used to include extreme values. For instance, in a table of
ages, an open class interval might be "70 and above" for older individuals,
meaning any value 70 or greater falls into that interval.
8. Frequency Distribution: A summary that shows the number of times each
value or group of values appears in a dataset. It organizes the data into
categories or class intervals and counts the frequency of each category.
Frequency distributions are useful for identifying patterns and visualizing how
data is spread across different values or intervals.
9. Frequency Distribution Table: A table that presents the frequency
distribution of data, showing the classes or intervals, their corresponding
frequencies, and sometimes cumulative frequencies or relative frequencies. It
helps organize data for analysis and can be used to create graphs like
histograms. For example, a table might show age ranges (class intervals) and
the number of people (frequencies) that fall into each age group.
10. Class Centers/Class Marks: The midpoint of a class interval, which
represents a central value for the interval. It is calculated by averaging the
lower and upper boundaries of the class interval. For example, if a class
interval is 50-59, the class center would be (50 + 59) / 2 = 54.5.
11. Precise Class Boundaries: The exact limits that separate one class interval
from the next in a frequency distribution. These boundaries ensure that no
data point falls into two intervals. Precise boundaries are often calculated by
adjusting the lower and upper limits of the class intervals slightly (e.g., by
adding or subtracting 0.5 for continuous data) to ensure clarity in
classification.
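
The array, range, and class mark defined above can be sketched in a few lines of Python. This is only an illustration: the `raw_data` scores are hypothetical (chosen so the range matches the 95 - 50 = 45 example), and `class_mark` is a helper name introduced here, not part of the lesson.

```python
# Hypothetical test scores, used only for illustration.
raw_data = [72, 50, 95, 61, 88, 79]

# Array: the raw data arranged in ascending order.
array = sorted(raw_data)

# Range: the largest value minus the smallest value.
data_range = array[-1] - array[0]


def class_mark(lower, upper):
    """Midpoint of a class interval (average of its two limits)."""
    return (lower + upper) / 2


print(array)               # [50, 61, 72, 79, 88, 95]
print(data_range)          # 45
print(class_mark(50, 59))  # 54.5
```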
Terms to remember:

1. Frequency (f) is the number of original scores that fall into that class.
2. Classes or intervals (k) refer to the groupings of a frequency table.
As a guide in choosing the number of classes, Sturges' Formula can be used:
$k = 1 + 3.322 \log_{10}(n)$
where k is the number of classes, n is the number of data points, and
$\log_{10}(n)$ is the base-10 logarithm of n.

Sturges' Formula is commonly used in practical statistics and generally works
well for smaller datasets.

*Note: The result is rounded up to the next higher integer to accommodate
all the observations.

3. Range is the difference between the highest value and the lowest value.
4. Class width (c) is the difference between two consecutive lower class limits
or class boundaries.
5. Class limits are the smallest or the largest numbers that can actually
belong to different classes.
Lower class limits ($L_j$) are the smallest numbers that can actually belong to the
different classes.
Upper class limits ($u_j$) are the largest numbers that can actually belong to the
different classes.
6. Class boundaries are obtained by increasing the upper class limits and
decreasing the lower class limits by the same amount so that there are no
gaps between consecutive classes. The amount to be added or
subtracted is one half (0.5) of the difference between the upper limit of one
class and the lower limit of the following class.
7. Class marks or midpoints are the midpoints of the classes:
$x_j = \frac{L_j + u_j}{2}$
where $L_j$ is the lower class limit and $u_j$ is the upper class limit of the
jth class interval.
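
As a rough sketch of two of the terms above, the snippet below computes the tentative number of classes from Sturges' Formula and derives class boundaries from class limits. The function names `sturges_classes` and `class_boundaries` are made up for this illustration, and the boundary helper assumes at least two classes listed in ascending order with the same gap between each pair.

```python
import math


def sturges_classes(n):
    """Tentative number of classes, k = 1 + 3.322 log10(n), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))


def class_boundaries(limits):
    """Convert ascending (lower, upper) class limits into class boundaries.

    Assumes at least two classes and an equal gap between consecutive classes.
    """
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


print(sturges_classes(50))  # 7
print(class_boundaries([(50, 59), (60, 69), (70, 79)]))
# [(49.5, 59.5), (59.5, 69.5), (69.5, 79.5)]
```

Because each upper boundary equals the next lower boundary, no value can fall into a gap between classes.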

Process of Constructing a Frequency Table


When forming a frequency distribution, follow these general procedures to
ensure clarity, consistency, and meaningful data representation:

STEP 1: Determine the range.

R = Highest Value – Lowest Value


STEP 2: Determine the tentative number of classes (k).
$k = 1 + 3.322 \log_{10}(n)$
❖ Round up to the next whole number
Note: The number of classes should be between 5 and 20. The actual number of
classes may be affected by convenience or other subjective factors.

STEP 3: Find the class width by dividing the range by the number of classes.

$\text{Class Width} = \frac{\text{Range}}{\text{Number of Classes}}$ or $c = \frac{R}{k}$

❖ Round up to the next whole number


STEP 4. Write the classes or categories starting with the lowest score. Stop
when the class already includes the highest score.

STEP 5. Determine the frequency for each class by referring to the tally
columns and present the results in a table.

STEP 6. Compute the class marks and class boundaries.

When constructing frequency tables, the following guidelines should be
followed.
1. The classes must be mutually exclusive. That is, each score must
belong to exactly one class.
2. Include all classes, even if the frequency might be zero.

By following these steps, you can create an effective and interpretable
frequency distribution that organizes and summarizes data meaningfully.
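
Steps 1 to 4 can be sketched as a small Python function. This is a minimal illustration under stated assumptions: the data are whole numbers (so each class runs from its lower limit to lower + c - 1), `build_classes` is a name invented here, and `hypothetical_scores` is made-up data rather than anything from the lesson.

```python
import math


def build_classes(data):
    """Return (lower, upper) class limits for whole-number data (Steps 1 to 4)."""
    low, high = min(data), max(data)
    r = high - low                                     # Step 1: range
    k = math.ceil(1 + 3.322 * math.log10(len(data)))   # Step 2: tentative classes
    c = max(math.ceil(r / k), 1)                       # Step 3: width, at least 1
    classes = []
    lower = low
    while True:                                        # Step 4: start at the lowest score
        upper = lower + c - 1
        classes.append((lower, upper))
        if upper >= high:                              # stop once the highest score is covered
            break
        lower += c
    return classes


# Made-up data for illustration only.
hypothetical_scores = [12, 15, 21, 22, 25, 30, 31, 38, 40, 46]
print(build_classes(hypothetical_scores))
# [(12, 18), (19, 25), (26, 32), (33, 39), (40, 46)]
```

When the range happens to be an exact multiple of k, this loop produces one class more than k, which is one reason to treat the value from Step 2 as tentative.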

Example:

A survey collected information on all 464 business firms in Region X during one
week. Here are the ages (in years) of 50 business firms randomly selected from that
population. Construct a frequency distribution table for the ages of these firms.

Table 2 Ages of 50 business firms in Region X

19 18 30 40 41 33 73 25
23 25 21 33 65 17 20 76
47 69 20 31 18 24 35 24
17 36 65 70 22 25 65 16
24 29 42 37 26 46 27 63
21 27 23 25 71 37 75 25
27 23

Step 1: Determine the range.

R = Highest Value – Lowest Value = 76 – 16 = 60

Step 2: Determine the tentative number of classes (k).


For a dataset of 50 observations, as in this example, Sturges' Formula
gives a reasonable estimate of the number of classes.
$k = 1 + 3.322 \log_{10}(50) \approx 6.64$, rounded up to 7

Step 3: Find the class width (c).


$c = \frac{R}{k} = \frac{60}{7} \approx 8.57$, rounded up to 9

Steps 4-5: Write the classes starting with the lowest score and tally the frequencies.

Classes    Tally                   Frequency
70 – 78    /////                   5
61 – 69    /////                   5
52 – 60                            0
43 – 51    //                      2
34 – 42    /////-//                7
25 – 33    /////-/////-////        14
16 – 24    /////-/////-/////-//    17
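
The tally can be cross-checked with a short script. The ages and class limits below are copied from the example; the variable names are only illustrative.

```python
# Ages of the 50 business firms from Table 2.
ages = [
    19, 18, 30, 40, 41, 33, 73, 25,
    23, 25, 21, 33, 65, 17, 20, 76,
    47, 69, 20, 31, 18, 24, 35, 24,
    17, 36, 65, 70, 22, 25, 65, 16,
    24, 29, 42, 37, 26, 46, 27, 63,
    21, 27, 23, 25, 71, 37, 75, 25,
    27, 23,
]

# Class limits from Step 4, lowest class first.
classes = [(16, 24), (25, 33), (34, 42), (43, 51), (52, 60), (61, 69), (70, 78)]

# Count how many ages fall within each class (limits inclusive).
frequencies = {
    (lower, upper): sum(lower <= age <= upper for age in ages)
    for lower, upper in classes
}

for (lower, upper), f in frequencies.items():
    print(f"{lower} - {upper}: {f}")
```

The counts come out as 17, 14, 7, 2, 0, 5, and 5, which sum to the 50 observations.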

Step 6: Compute the class marks and class boundaries.

Classes    Class Boundaries    Frequency (f)    Class Marks (x)
70 – 78    69.5 – 78.5         5                74
61 – 69    60.5 – 69.5         5                65
52 – 60    51.5 – 60.5         0                56
43 – 51    42.5 – 51.5         2                47
34 – 42    33.5 – 42.5         7                38
25 – 33    24.5 – 33.5         14               29
16 – 24    15.5 – 24.5         17               20

CUMULATIVE FREQUENCY DISTRIBUTION

1. The less than cumulative frequency distribution (<cf) is constructed by
adding the frequencies from the lowest class interval to the highest class
interval.
2. The more than cumulative frequency distribution (>cf) is constructed by
adding the frequencies from the highest class interval to the lowest class
interval.

Classes    Class Boundaries    Frequency (f)    Class Marks (x)    <cf    >cf
70 – 78    69.5 – 78.5         5                74                 50     5
61 – 69    60.5 – 69.5         5                65                 45     10
52 – 60    51.5 – 60.5         0                56                 40     10
43 – 51    42.5 – 51.5         2                47                 40     12
34 – 42    33.5 – 42.5         7                38                 38     19
25 – 33    24.5 – 33.5         14               29                 31     33
16 – 24    15.5 – 24.5         17               20                 17     50
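
Both cumulative columns are running totals, so they can be sketched with `itertools.accumulate` from the Python standard library. The frequencies are those of the example, listed here from the lowest class (16 – 24) to the highest (70 – 78); the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies from the example, lowest class first.
frequencies = [17, 14, 7, 2, 0, 5, 5]

# <cf: running total starting from the lowest class.
less_than_cf = list(accumulate(frequencies))

# >cf: running total starting from the highest class,
# then flipped back so it lines up with the lowest-first ordering.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)  # [17, 31, 38, 40, 40, 45, 50]
print(more_than_cf)  # [50, 33, 19, 12, 10, 10, 5]
```

The last <cf value and the first >cf value both equal the total number of observations, 50, which is a quick consistency check on the table.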

When data is summarized in a frequency distribution table, the individual
values of the original dataset are not retrievable from the table alone. For small
datasets, grouping data into intervals does not result in a significant loss of
information, as the intervals can still effectively represent the original data.
However, for larger datasets, important details may be obscured in the process of
summarizing data into broader intervals.

Worksheet 3

Name: Class Schedule: Date:

Construct a frequency distribution given the scores of 40 students in a 20-item
examination:

19 16 15 18 18 17 15 10 12 9

14 13 16 15 10 11 10 14 12 15

16 18 15 13 11 13 14 16 12 13

18 14 15 11 10 16 17 14 14 17

Solution:

R=

k=

class width(c) =

Classes    Tally    f    Class marks    Class Boundaries    <cf    >cf
