The Data Analysis Workflow
The Grammar of Graphics
• The Grammar of Graphics (Leland Wilkinson, 2005) is a general scheme for data visualization that breaks graphs up into semantic components:
– A plot is made up of layers
– A layer consists of data and a set of mappings between variables and aesthetics of geometric objects. Some mappings require statistical transformation of data
– Scales control the details of the mapping
– All components are independent and reusable
• R implementation: package ggplot2 (Hadley Wickham, 2007)
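The components listed above can be sketched in ggplot2 as follows. This is only an illustration, assuming ggplot2 is installed; it uses the mtcars data set bundled with R rather than any data from these slides, and the choice of variables, the linear fit, and the "Dark2" palette are arbitrary assumptions.

```r
library(ggplot2)

# Base plot: the data plus the mappings shared by every layer.
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  # Layer 1: points, with one extra mapping (cylinder count -> colour).
  geom_point(aes(colour = factor(cyl))) +
  # Layer 2: a fitted line; this layer needs a statistical
  # transformation of the data (a linear model fit).
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # Scale: controls the details of the colour mapping
  # (which palette is used and how the legend is titled).
  scale_colour_brewer(palette = "Dark2", name = "Cylinders")

print(p)
```

Each `+` adds an independent component, so either layer or the scale could be dropped or swapped without touching the rest of the plot.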
Mappings (points layer) - SPSS
Mappings (points layer) – R/ggplot2
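library(ggplot2)
# Points layer: map five variables to position, colour, shape and size
ggplot(data = catsales) +
  geom_point(aes(x = Sales, y = Profit, colour = `Product Category`,
                 shape = Region, size = Quantity))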
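The catsales data set is not included here, so the call above cannot be run as it stands. A minimal sketch for trying it out, assuming ggplot2 is installed: the data frame below is a hypothetical stand-in with the same column names, and every value in it is invented purely for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for catsales; all values are invented.
catsales <- data.frame(
  Sales = c(120, 340, 560, 210, 480, 90),
  Profit = c(15, 60, 110, -5, 75, 10),
  `Product Category` = c("Furniture", "Technology", "Technology",
                         "Office Supplies", "Furniture", "Office Supplies"),
  Region = c("East", "West", "East", "South", "West", "South"),
  Quantity = c(2, 5, 7, 3, 6, 1),
  check.names = FALSE  # keep the space in "Product Category"
)

# Same points layer as on the slide.
p <- ggplot(data = catsales) +
  geom_point(aes(x = Sales, y = Profit, colour = `Product Category`,
                 shape = Region, size = Quantity))

# Show which variable each aesthetic is mapped to, then draw the plot.
print(p$layers[[1]]$mapping)
print(p)
```

The backticks are needed because the column name `Product Category` contains a space.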
Aesthetic mapping exercise
Geoms exercise