MATH& 146 Lesson 10: Graphing Numerical Data
Lesson 10
Section 1.6
Graphing Numerical Data
Graphs of Numerical Data
One major reason for constructing a graph of
numerical data is to display its distribution, that is,
the pattern of variability in the values of a
variable.
Three popular methods for displaying distributions
of numerical data are the dotplot, the histogram,
and the box plot.
Dotplots
The dotplot displays the data of a sample by
representing each data value with a dot positioned
along a scale, either horizontally or vertically.
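As a rough illustration of the idea, the sketch below prints a horizontal text dotplot, stacking one "o" above the scale for each data value. The function name dotplot and the scores are hypothetical (they are not the exam scores from the slides), and the sketch assumes non-negative whole-number data.

```python
from collections import Counter

def dotplot(values):
    """Return a text dotplot: one 'o' per data value, stacked above a horizontal scale."""
    counts = Counter(values)                      # how many times each value occurs
    scale = range(min(values), max(values) + 1)   # every integer from min to max
    width = max(len(str(v)) for v in scale) + 1   # column width for each scale position
    rows = []
    # Build rows from the tallest stack down to the first dot above the scale.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(("o" if counts[v] >= level else "").rjust(width) for v in scale)
        rows.append(row.rstrip())
    rows.append("".join(str(v).rjust(width) for v in scale))  # the scale itself
    return "\n".join(rows)

scores = [7, 8, 8, 9, 9, 9, 10, 10, 12]  # hypothetical data for illustration only
print(dotplot(scores))
```

Each column of dots shows how often that value occurs, so gaps and clusters in the data are visible at a glance.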
Example 1
Create a dotplot of the following exam scores.
Histograms
For much of the work you do in this course, you
will use a histogram to display the data. One
advantage of a histogram is that it can readily
display large data sets.
Unlike dotplots, histograms use ranges of values
instead of individual values. These ranges of values
(called classes) are represented by bars, with the
height of each bar equal to the frequency of its class.
Constructing histograms
The basic steps to construct a histogram are as
follows:
1) Find the minimum and maximum values of the
data.
2) Create classes by slicing data into intervals of
equal width (choose "nice" numbers).
3) Make a table (called a frequency table) to count
the number of values in each class.
4) Make a bar for each class, using the frequencies
to determine the height of each bar.
Example 2
The following are the scores on a measure of
sensitivity to smell taken by 13 chefs attending a
national conference:
96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76
Make a histogram of the data.
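One possible way to work through the four histogram steps for these scores is sketched below. The class width of 10 is an assumption (a "nice" number), not a choice stated in the slides, and the class labels such as 50-59 assume whole-number scores. The bars are drawn sideways as rows of "#" characters.

```python
scores = [96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76]
CLASS_WIDTH = 10  # assumed "nice" class width; other widths are equally valid

# Step 1: find the minimum and maximum values.
lowest, highest = min(scores), max(scores)

# Step 2: create equal-width classes, starting at the multiple of
# CLASS_WIDTH at or below the minimum.
first_start = (lowest // CLASS_WIDTH) * CLASS_WIDTH
starts = range(first_start, highest + 1, CLASS_WIDTH)

# Step 3: build the frequency table (class start -> number of values).
frequency = {start: 0 for start in starts}
for x in scores:
    frequency[(x // CLASS_WIDTH) * CLASS_WIDTH] += 1

# Step 4: draw one bar per class, with length equal to the frequency.
for start, count in frequency.items():
    print(f"{start}-{start + CLASS_WIDTH - 1}: {'#' * count}")
```

With this class width the frequencies are 1, 4, 3, 3, and 2 for the classes 50-59 through 90-99, which accounts for all 13 scores.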
Shape of a Distribution
When describing the shape of a distribution (the
outline of a histogram), you should answer the
following three questions:
Peaks
1) Does the distribution have a single, central
peak or several separated peaks, or none at
all?
A distribution with two peaks is bimodal.
Bimodal Distributions
Bimodal distributions usually occur when the data
of two separate groups are combined.
[Figure: histogram of diastolic blood pressure]
Uniform Distributions
A distribution that doesn't appear to have any
mode and in which all the bars are approximately
the same height (in the "real world," the bars will
never be exactly the same) is called uniform:
[Figure: histogram of proportion of wins]
Symmetry
2) Is the distribution symmetric?
Essentially, a distribution is symmetric if you can fold
the distribution along a vertical line through the middle
and have the edges match pretty closely.
Skewness
The (usually) thinner ends of a distribution are called
the tails. If one tail stretches out farther than the other,
the histogram is said to be skewed to the side of the
longer tail.
Symmetric graphs are ideal for inferential statistics,
though skewed graphs can also work, provided the
sample size is large enough. Generally, the more
skewed the graph, the larger the sample size
needs to be.
[Figures: one histogram skewed left, one histogram skewed right]
Outliers
3) Do any unusual features stick out? Values that lie
far from the rest of the data are called outliers.
Often, but not always, outliers are due to mistakes
(such as writing 5,000 instead of 50). Other
outliers may indicate that something unusual is
happening. If you see an outlier, proceed carefully.
Example 3
What can be said about the following histogram?
Example 4
What can be said about the following histogram?
Example 5
What can be said about the following histogram?
Example 6
What can be said about the following histogram?
Box Plots
Box plots, or box-and-whisker plots, give a
graphical image of the concentration of the data.
The box plot is constructed from five values, called
the five-number summary:
The Five-Number Summary
The five-number summary includes:
The minimum
The lower quartile, Q1
The median
The upper quartile, Q3
The maximum
These numbers divide the data into four more or less
equal pieces.
[Figure: box plot with each of its four sections labeled 25%, and with the IQR and the range marked]
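The sketch below computes a five-number summary, along with the range (maximum minus minimum) and the interquartile range (Q3 minus Q1), for the chef scores from Example 2. Quartile conventions differ between textbooks and software. This sketch assumes the "median of each half" convention, leaving the median out of both halves when the number of values is odd, so other tools may report slightly different quartiles. The function name five_number_summary is hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves convention.

    Assumes at least two data values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

scores = [96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76]
minimum, q1, med, q3, maximum = five_number_summary(scores)

data_range = maximum - minimum  # range = max - min
iqr = q3 - q1                   # interquartile range = Q3 - Q1

print(minimum, q1, med, q3, maximum)
print("Range:", data_range, "IQR:", iqr)
```

Under this convention the summary for the 13 scores is 59, 65.5, 74, 85, 96, giving a range of 37 and an IQR of 19.5.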
Construct the Box Plot
To construct a box plot, use a number line and mark
each of the five numbers: minimum, first quartile,
median, third quartile, and maximum (use a dotted tick
mark for the median). Draw a top and bottom around
the middle three numbers to make a box, and then
draw lines connecting the box to the minimum and
maximum.
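The construction above can be imitated in plain text. The sketch below is only an approximation of a drawn box plot: it places the five numbers along a line of fixed width, uses "[" and "]" for the quartiles, ":" for the median tick, and "-" for the lines out to the minimum and maximum. The function name text_boxplot, the width of 40 characters, and the five input values are all illustrative assumptions. The sketch also assumes the maximum is larger than the minimum.

```python
def text_boxplot(minimum, q1, med, q3, maximum, width=40):
    """Return a one-line text box plot for a five-number summary."""
    span = maximum - minimum  # assumed to be greater than zero

    def col(x):
        # Map a data value to a column position between 0 and width - 1.
        return round((x - minimum) / span * (width - 1))

    line = [" "] * width
    for c in range(col(minimum), col(maximum) + 1):
        line[c] = "-"            # lines from the box to the minimum and maximum
    for c in range(col(q1), col(q3) + 1):
        line[c] = "="            # top and bottom of the box
    line[col(q1)] = "["          # first quartile
    line[col(q3)] = "]"          # third quartile
    line[col(med)] = ":"         # dotted tick mark for the median
    line[col(minimum)] = "|"     # minimum
    line[col(maximum)] = "|"     # maximum
    return "".join(line)

# Illustrative five-number summary: min, Q1, median, Q3, max.
print(text_boxplot(59, 65.5, 74, 85, 96))
```

The box spans the middle half of the data, so a short box relative to the whole line suggests the central values are tightly concentrated.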
Example 7
Construct a box plot and find the range and
interquartile range.
Comparing Groups
Box plots are ideal for comparing two or more
groups or categories.
Outliers
Box plots can be used to show extreme values by
using dots or asterisks (• or *) to represent potential
outliers. Any potential outlier should be examined
carefully in your data analysis.
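The slides do not say how a value is judged to be a potential outlier. One common convention, assumed here and not taken from the slides, flags any value more than 1.5 × IQR below Q1 or above Q3. The sketch below applies that rule to hypothetical data, with quartiles again found by the median-of-halves convention. The function name potential_outliers and the parameter k are assumptions.

```python
from statistics import median

def potential_outliers(values, k=1.5):
    """Return values beyond k * IQR from the quartiles (assumed 1.5 * IQR rule).

    Assumes at least two data values.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])     # median of the lower half
    q3 = median(data[-half:])    # median of the upper half
    iqr = q3 - q1
    low_fence = q1 - k * iqr     # anything below this is flagged
    high_fence = q3 + k * iqr    # anything above this is flagged
    return [x for x in data if x < low_fence or x > high_fence]

# Hypothetical data with one entry mistyped as 5000.
values = [48, 50, 51, 52, 53, 55, 57, 5000]
print(potential_outliers(values))
```

A flagged value is only a candidate for closer inspection. Whether it is a mistake or a genuine unusual observation still has to be decided from the context of the data.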
Example 8
The box plots below show the number of millionaires per
1000 households, by state, as reported by
Netscape.com in 2006.
Example 8 continued
a) List the regions from lowest to highest in terms of
the median rate of millionaires in that region.
b) Which region has the smallest interquartile range?
c) Which region has potential outliers?
Example 9
The following box plot shows the U.S. population for
1990.
Example 10
Match each histogram with its box plot.