Unit 2
Visualization
Data visualization is the display of information in a graphic or tabular
format. Successful visualization requires that the data (information) be
converted into a visual format so that the characteristics of the data and
the relationships among data items or attributes can be analyzed or
reported.
OR
Visualization is the conversion of data into a visual or tabular format so that
the characteristics of the data and the relationships among data items or
attributes can be analyzed or reported.
Visualization of data is one of the most powerful and appealing techniques for
data exploration.
Humans have a well-developed ability to analyze large amounts of information
that is presented visually.
Goals:
to detect general patterns and trends
to detect outliers and unusual patterns
Ex:
Graphs and tables
Data from: satellite photos, sonar measurements, surveys, or
computer simulations
Consider Figure 3.2, which shows the Sea Surface Temperature (SST) in
degrees Celsius for July, 1982. This picture summarizes the information from
approximately 250,000 numbers and is readily interpreted in a few seconds. For
example, it is easy to see that the ocean temperature is highest at the equator and
lowest at the poles.
Important points:
Representation
Selection
Arrangement
Techniques
Visualization techniques are often specialized to the type of data being analyzed.
Indeed, new visualization techniques and approaches, as well as specialized
variations of existing approaches, are being continuously created, typically in
response to new kinds of data and visualization tasks.
One dimensional
Two dimensional
Three dimensional
Multi-dimensional
Hierarchical
Graph
Surface Plots
Vector Field Plots
Lower-Dimensional Slices
Animation
corresponding to the second stem. This approach is shown in the stem and
leaf plot of Figure 3.6. Other variations are also possible.
Histograms
Stem and leaf plots are a type of histogram, a plot that displays
the distribution of values for attributes by dividing the possible values
into bins and showing the number of objects that fall into each bin.
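The link between a stem and leaf plot and a histogram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are made up, the values are non-negative integers, the stem is taken to be the tens part and the leaf the units digit, and the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative integers by stem (tens part) and leaf (units digit)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 57 -> stem 5, leaf 7
        groups[stem].append(leaf)
    return dict(groups)


# Made-up illustrative data, not taken from any figure in these notes.
data = [43, 44, 44, 50, 51, 57, 62, 65, 79]
plot = stem_and_leaf(data)

# Each printed row is one stem; the number of leaves in the row is the
# count of objects in that bin, which is what a histogram bar would show.
for stem in sorted(plot):
    print(stem, "|", "".join(str(leaf) for leaf in plot[stem]))
```

Each stem acts as a bin of width 10, so the length of a row plays the role of the bar height.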
Histogram
Usually shows the distribution of values of a single variable.
Divide the values into bins and show a bar plot of the
number of objects in each bin.
The height of each bar indicates the number of objects if
all bins have the same width.
If bins are of different width, then often it is the *area* of
the bar that indicates the number of objects in that bin.
Shape of histogram depends on the number of bins.
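The points above about bar height and bar area can be illustrated with a minimal sketch. It assumes made-up data and hypothetical helper names (`histogram_counts`, `bar_heights`). Bins are treated as half-open intervals, with the last bin also including its right edge.

```python
def histogram_counts(values, edges):
    """Count the values falling into each bin given by consecutive edges.

    Bins are [edges[i], edges[i+1]), except the last bin, which also
    includes its right edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts


def bar_heights(counts, edges):
    """Height = count / width, so that bar area (height * width) equals the count."""
    return [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]


values = [1, 2, 2, 3, 4, 5, 7, 9]   # made-up illustrative data

equal_edges = [0, 5, 10]            # two bins of the same width
unequal_edges = [0, 2, 4, 10]       # bins of width 2, 2 and 6

equal_counts = histogram_counts(values, equal_edges)
unequal_counts = histogram_counts(values, unequal_edges)
unequal_heights = bar_heights(unequal_counts, unequal_edges)

print("equal-width counts (usable directly as heights):", equal_counts)
print("unequal-width counts:", unequal_counts)
print("unequal-width heights (area = count):", unequal_heights)
```

With equal widths the counts can be drawn directly as bar heights. With unequal widths, dividing by the bin width keeps the wide last bin from looking taller than its share of the data.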
Example 3.8. Figure 3.7 shows histograms (with 10 bins) for sepal length,
sepal width, petal length, and petal width. Since the shape of a histogram
can depend on the number of bins, histograms for the same data, but with 20
bins, are shown in Figure 3.8.
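The effect of the number of bins can be checked on a small scale with the following sketch. It does not use the Iris measurements from Figures 3.7 and 3.8: the data are made up, and the function name `equal_width_histogram` is hypothetical. Bins are assumed to be equal-width and to span the range from the minimum to the maximum value.

```python
def equal_width_histogram(values, n_bins):
    """Split [min, max] into n_bins equal-width bins and count the values in each."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if width == 0:
            index = 0  # all values identical: put everything in the first bin
        else:
            # The maximum value would land at index n_bins, so clamp it
            # into the last bin.
            index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


values = [1, 2, 2, 3, 3, 3, 8, 9, 9, 10]   # made-up illustrative data

coarse = equal_width_histogram(values, 3)
fine = equal_width_histogram(values, 9)

# Print a simple text histogram for each choice of bin count.
for label, counts in (("3 bins", coarse), ("9 bins", fine)):
    print(label)
    for c in counts:
        print("  " + "#" * c)
```

With 3 bins the data appear as two blocks separated by an empty bin. With 9 bins the rising counts within the lower group become visible, so the same data give a differently shaped histogram.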