Written Report - Chapter 3 - Visualizing Data
Pie Charts
For many types of data, we are interested in
understanding the relative proportion of each data source
to the total. A pie chart displays this by partitioning a circle
into pie-shaped areas showing the relative proportion.
Tools and Software for Data Visualization
Data visualization ranges from simple Excel charts to
more advanced interactive tools and software that allow
users to easily view and manipulate data with a few clicks,
not only on computers, but on iPads and other devices as
well.
Line Charts
Line charts provide a useful means for displaying data
over time. You may plot multiple data series in line charts;
however, they can be difficult to interpret if the magnitude
of the data values differs greatly. In that case, it would be
advisable to create separate charts for each data series.
Bubble Charts
A bubble chart is a type of scatter chart in which the size
of the data marker corresponds to the value of a third
variable; consequently, it is a way to plot three variables
in two dimensions.
Data Queries: Tables, Sorting and Filtering
Descriptive statistics refers to methods of describing
and summarizing data using tabular, visual, and
quantitative techniques.
Sorting Data in Excel
Excel provides many ways to sort lists by rows or column
or in ascending or descending order and using custom
sorting schemes. The sort buttons in Excel can be found
under the Data tab in the Sort & Filter group. Select a
single cell in the column you want to sort on and click the
“AZ down arrow” button to sort from smallest to largest or
the “AZ up arrow” button to sort from largest to smallest.
You may also click the Sort button to specify criteria for
more advanced sorting capabilities.
Frequency Distribution for Categorical Data
A frequency distribution is a table that shows the
number of observations in each of several nonoverlapping
groups. Categorical variables naturally define the groups
in a frequency distribution.
To construct a frequency distribution, we need only count
the number of observations that appear in each category.
This can be done using the Excel COUNTIF function.
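The counting step behind a frequency distribution for categorical data can also be sketched outside Excel. The following is a minimal Python illustration, not part of the chapter: the function name frequency_distribution and the purchases list are hypothetical, and collections.Counter stands in for the role COUNTIF plays in a worksheet.

```python
from collections import Counter


def frequency_distribution(values):
    """Count how many observations fall in each category."""
    return dict(Counter(values))


# Hypothetical categorical data (illustrative only, not from the chapter).
purchases = ["Laptop", "Tablet", "Laptop", "Phone", "Tablet", "Laptop"]

freq = frequency_distribution(purchases)
for category, count in freq.items():
    print(category, count)
```

Because each observation belongs to exactly one category, the counts add up to the number of observations.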
Relative Frequency Distributions
We may express the frequencies as a fraction, or
proportion, of the total; this is called the relative
frequency. If a data set has n observations, the relative
frequency of category i is computed as
relative frequency of category i = (frequency of category i) / n
Pareto Analysis
Pareto analysis is a term named after an Italian
economist, Vilfredo Pareto, who, in 1906, observed that
a large proportion of the wealth in Italy was owned by a
relatively small proportion of the people. The Pareto
principle is seen in many business situations. A
Pareto analysis relies on sorting data and calculating the
cumulative percentage of the characteristic of interest.
Example: Applying the Pareto Principle
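The two steps of a Pareto analysis, sorting the data and calculating the cumulative percentage of the characteristic of interest, can be sketched in Python. This is a minimal illustration under stated assumptions: the function name pareto_table and the inventory_cost figures are hypothetical and do not come from the chapter.

```python
def pareto_table(values_by_item):
    """Sort items from largest to smallest and add cumulative percentages."""
    total = sum(values_by_item.values())
    ranked = sorted(values_by_item.items(), key=lambda kv: kv[1], reverse=True)
    rows = []
    running = 0
    for item, value in ranked:
        running += value
        # Cumulative share of the total accounted for so far, in percent.
        rows.append((item, value, round(100 * running / total, 1)))
    return rows


# Hypothetical cost per item (illustrative only).
inventory_cost = {"A": 500, "B": 60, "C": 300, "D": 100, "E": 40}

rows = pareto_table(inventory_cost)
for item, value, cumulative_pct in rows:
    print(item, value, cumulative_pct)
```

In this made-up data, the two largest of the five items account for 80 percent of the total cost, which is the kind of pattern a Pareto analysis is meant to expose.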
Filtering Data
For large data files, finding a particular subset of records
that meet certain characteristics by sorting can be
tedious. Excel provides two filtering tools: AutoFilter for
simple criteria and Advanced Filter for more complex
criteria.
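The idea of filtering, keeping only the records that meet stated criteria, can be sketched in Python. This is a loose analogy rather than a description of how Excel implements its tools: the function filter_records and the orders data are hypothetical. A single-condition test resembles the simple criteria AutoFilter handles, and a compound condition resembles what Advanced Filter is used for.

```python
def filter_records(records, predicate):
    """Return only the records for which predicate(record) is true."""
    return [r for r in records if predicate(r)]


# Hypothetical purchase records (illustrative only).
orders = [
    {"supplier": "Acme", "item": "Bolt", "cost": 120.0},
    {"supplier": "Baker", "item": "Gasket", "cost": 450.0},
    {"supplier": "Acme", "item": "Valve", "cost": 900.0},
    {"supplier": "Cole", "item": "Bolt", "cost": 80.0},
]

# Simple criterion: one column equals one value.
simple = filter_records(orders, lambda r: r["supplier"] == "Acme")

# More complex criterion: conditions on two columns combined with OR.
compound = filter_records(
    orders, lambda r: r["item"] == "Bolt" or r["cost"] > 500
)

print(len(simple), len(compound))
```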
Slicers
Excel 2010 introduced slicers, a tool for drilling down to
“slice” a PivotTable and display a subset of data. To
create a slicer for any of the columns in the database, click
on the PivotTable and choose Insert Slicer from the
Analyze tab in the PivotTable Tools ribbon.
Cross-Tabulations
One of the most basic statistical tools used to summarize
categorical data and examine the relationship between
two categorical variables is cross-tabulation. A
cross-tabulation is a tabular method that displays the number of
observations in a data set for different subcategories of
two categorical variables. A cross-tabulation table is often
called a contingency table. The subcategories of the
variables must be mutually exclusive and exhaustive,
meaning that each observation can be classified into only
one subcategory, and, taken together over all
subcategories, they must constitute the complete data
set.
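A cross-tabulation of two categorical variables can be sketched in Python as a count of observations for each pair of subcategories. This is a minimal illustration, not the chapter's method: the function cross_tabulate and the sales records are hypothetical.

```python
from collections import Counter


def cross_tabulate(records, row_key, col_key):
    """Count observations for each (row category, column category) pair."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    row_labels = sorted({r[row_key] for r in records})
    col_labels = sorted({r[col_key] for r in records})
    # Counter returns 0 for pairs that never occur, so every cell is filled.
    return {
        row: {col: counts[(row, col)] for col in col_labels}
        for row in row_labels
    }


# Hypothetical sales records (illustrative only).
sales = [
    {"region": "East", "product": "Book"},
    {"region": "East", "product": "DVD"},
    {"region": "West", "product": "Book"},
    {"region": "East", "product": "Book"},
    {"region": "West", "product": "DVD"},
    {"region": "West", "product": "DVD"},
]

table = cross_tabulate(sales, "region", "product")
for region, cells in table.items():
    print(region, cells)
```

Because the subcategories are mutually exclusive and exhaustive, each record falls in exactly one cell, so the cell counts sum to the number of observations.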