Correspondence analysis
Correspondence analysis provides visualizations of associations in a two-way contingency
table in a small number of dimensions. Multiple correspondence analysis extends
this technique to n-way tables. Other graphical methods, including mosaic matrices and
biplots, provide complementary views of loglinear models for two-way and n-way contingency
tables.
6.1 Introduction
Correspondence analysis (CA) is an exploratory technique which displays the row and col-
umn categories in a two-way contingency table as points in a graph, so that the positions of the
points represent the associations in the table. Mathematically, correspondence analysis is related
to the biplot, to canonical correlation, and to principal component analysis.
This technique finds scores for the row and column categories on a small number of dimensions
which account for the greatest proportion of the χ2 for association between the row and
column categories, just as principal components account for maximum variance of quantitative
variables. But CA does more: the scores provide a quantification of the categories, and have the
property that they maximize the correlation between the row and column variables. For graphical
display, two or three dimensions are typically used to give a reduced rank approximation to the
data.
Correspondence analysis has a very large, multi-national literature and was rediscovered sev-
eral times in different fields and different countries. The method, in slightly different forms, is
also discussed under the names dual scaling, optimal scaling, reciprocal averaging, homogene-
ity analysis, and canonical analysis of categorical data.
See Greenacre (1984) and Greenacre (2007) for an accessible introduction to CA methodology,
or Gifi (1981), Lebart et al. (1984) for a detailed treatment of the method and its applications
from the Dutch and French perspectives. Greenacre and Hastie (1987) provide an excellent
discussion of the geometric interpretation, while van der Heijden and de Leeuw (1985) and van der
Heijden et al. (1989) develop some of the relations between correspondence analysis and
log-linear methods for three-way and larger tables. Correspondence analysis is usually carried out
in an exploratory, graphical way. Goodman (1981, 1985, 1986) has developed related inferential
models, the RC model and the canonical correlation model, with close links to CA.
218 [11-26-2014] 6 Correspondence analysis
One simple development of CA is as follows: For a two-way table the scores for the row
categories, namely $X = \{x_{im}\}$, and column categories, $Y = \{y_{jm}\}$, on dimension
$m = 1, \dots, M$ are derived from a (generalized) singular value decomposition of (Pearson)
residuals from independence, expressed as $d_{ij} / \sqrt{n}$, to account for the largest
proportion of the $\chi^2$ in a small number of dimensions. This decomposition may be expressed as
\[
\frac{d_{ij}}{\sqrt{n}} = \frac{n_{ij} - m_{ij}}{\sqrt{n \, m_{ij}}}
  = X D_\lambda Y^T = \sum_{m=1}^{M} \lambda_m x_{im} y_{jm} , \tag{6.1}
\]
where mij is the expected frequency and where Dλ is a diagonal matrix with elements λ1 ≥
λ2 ≥ · · · ≥ λM , and M = min(I − 1, J − 1). In M dimensions, the decomposition Eqn. (6.1)
is exact. For example, an I × 3 table can be depicted exactly in two dimensions when I ≥ 3. The
useful result for visualization purposes is that a rank-d approximation in d dimensions is obtained
from the first d terms on the right side of Eqn. (6.1). The proportion of the Pearson χ2 accounted
for by this approximation is
\[
n \sum_{m=1}^{d} \lambda_m^2 \, / \, \chi^2 .
\]
The quantity $\chi^2 / n = \sum_i \sum_j d_{ij}^2 / n$ is called the total inertia and is
identical to the measure of association known as Pearson’s mean-square contingency, the square
of the $\phi$ coefficient.
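As a concrete check of Eqn. (6.1) and of the total inertia, the following base R sketch decomposes the hair color by eye color margin of the built-in HairEyeColor data with svd(). The object names (N, m, S, lambda, prop2) are illustrative choices made here, not part of any package.

```r
# Hair x Eye margin of the built-in HairEyeColor data (summed over Sex)
N <- unclass(margin.table(HairEyeColor, c(1, 2)))
n <- sum(N)

# Expected frequencies under independence and the scaled residuals d_ij / sqrt(n)
m <- outer(rowSums(N), colSums(N)) / n
S <- (N - m) / sqrt(n * m)

# Singular values; only M = min(I - 1, J - 1) of them are non-zero
M <- min(dim(N)) - 1
lambda <- svd(S)$d[seq_len(M)]

chisq <- sum((N - m)^2 / m)        # Pearson chi-square
total_inertia <- sum(lambda^2)     # equals chisq / n

# Proportion of the chi-square accounted for by the first d = 2 dimensions
prop2 <- n * sum(lambda[1:2]^2) / chisq
print(round(c(chisq = chisq, inertia = total_inertia, prop2 = prop2), 4))
```

The squared singular values computed this way are the quantities that ca() reports as principal inertias.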
Thus, correspondence analysis is designed to show how the data deviate from expectation
when the row and column variables are independent, as in the sieve diagram, association plot and
mosaic display. However, the sieve, association and mosaic plots depict every cell in the table,
and for large tables it may be difficult to see patterns. Correspondence analysis shows only row
and column categories as points in the two (or three) dimensions which account for the greatest
proportion of deviation from independence. The pattern of the associations can then be inferred
from the positions of the row and column points.
• $N = \{n_{ij}\}$ is the I × J contingency table with row and column totals $n_{i+}$ and
$n_{+j}$, respectively. The grand total $n_{++}$ is also denoted by n for simplicity.
• $P = \{p_{ij}\} = N / n$ is the matrix of joint cell probabilities, called the correspondence
matrix.
• $r = \sum_j p_{ij} = P 1$ is the row margin of P; $c = \sum_i p_{ij} = P^T 1$ is the column
margin. r and c are called the row masses and column masses.
• $D_r$ and $D_c$ are diagonal matrices with r and c on their diagonals, used as weights.
• $R = D_r^{-1} P = \{n_{ij} / n_{i+}\}$ is the matrix of row conditional probabilities, called
row profiles. Similarly, $C = D_c^{-1} P^T = \{n_{ij} / n_{+j}\}$ is the matrix of column
conditional probabilities or column profiles.
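The notation above maps directly onto a few lines of base R. This is a minimal sketch using the hair color by eye color margin of the built-in HairEyeColor data; the name c_mass is used for the column masses only to avoid masking the base function c().

```r
N <- unclass(margin.table(HairEyeColor, c(1, 2)))   # I x J table of counts
n <- sum(N)

P <- N / n                  # correspondence matrix
r <- rowSums(P)             # row masses, P 1
c_mass <- colSums(P)        # column masses, P' 1
Dr <- diag(r)
Dc <- diag(c_mass)

R <- solve(Dr) %*% P        # row profiles: n_ij / n_i+
C <- solve(Dc) %*% t(P)     # column profiles: n_ij / n_+j
dimnames(R) <- dimnames(N)
dimnames(C) <- rev(dimnames(N))

round(R, 3)                 # each row sums to 1
round(C, 3)                 # each row (one eye color) sums to 1
```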
Two types of coordinates, X, Y for the row and column categories are defined, based on the
generalized singular value decomposition of P ,
$P = A D_\lambda B^T$
principal coordinates: The coordinates of the row (F) and column (G) profiles with respect to
their own principal axes are defined so that the inertia along each axis is the square of the
corresponding singular value, $\lambda_i^2$.
The joint plot in principal coordinates, F and G, is called the symmetric map because both
row and column profiles are overlaid in the same coordinate system.
standard coordinates: The standard coordinates (Φ, Γ) are a rescaling of the principal
coordinates to unit inertia along each axis.
These differ from the principal coordinates in Eqn. (6.2) and Eqn. (6.3) simply by the
absence of the scaling factors, Dλ . An asymmetric map shows one set of points (say, the
rows) in principal coordinates and the other set in standard coordinates.
Thus, the weighted average of the squared principal coordinates for the rows or columns on a
principal axis equals the squared singular value, λ², for that axis, whereas the weighted average
of the squared standard coordinates equals 1. The relative positions of the row or column points
along any axis are the same under either scaling, but the distances between points differ, because
the axes are weighted differentially in the two scalings.
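The two scalings can be checked numerically. The sketch below is one common way to obtain the coordinates in base R: it takes the ordinary SVD of the standardized residuals (an assumed, equivalent route rather than the generalized SVD of P written above) and then verifies the weighted-average properties just described. The names Phi, Gamma, Fr and Gc are illustrative.

```r
N <- unclass(margin.table(HairEyeColor, c(1, 2)))
P <- N / sum(N)
r <- rowSums(P)
cm <- colSums(P)

# SVD of the standardized residuals D_r^{-1/2} (P - r c') D_c^{-1/2}
S <- diag(1 / sqrt(r)) %*% (P - outer(r, cm)) %*% diag(1 / sqrt(cm))
dec <- svd(S)
M <- min(dim(N)) - 1
lambda <- dec$d[1:M]

# Standard coordinates: weighted mean square equal to 1 on every axis
Phi   <- diag(1 / sqrt(r))  %*% dec$u[, 1:M]
Gamma <- diag(1 / sqrt(cm)) %*% dec$v[, 1:M]

# Principal coordinates: standard coordinates scaled by the singular values
Fr <- Phi   %*% diag(lambda)
Gc <- Gamma %*% diag(lambda)
rownames(Phi) <- rownames(Fr) <- rownames(N)
rownames(Gamma) <- rownames(Gc) <- colnames(N)

zapsmall(t(Phi) %*% diag(r) %*% Phi)   # identity matrix
zapsmall(t(Fr) %*% diag(r) %*% Fr)     # diagonal matrix of lambda^2
```

The signs of the columns returned by svd() are arbitrary, so a whole axis may appear reflected relative to output from another program.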
nested solutions: Because they use successive terms of the SVD Eqn. (6.1), correspondence
analysis solutions are nested, meaning that the first two dimensions of a three-dimensional
solution will be identical to the two-dimensional solution.
centroids at the origin: In both principal coordinates and standard coordinates the points repre-
senting the row and column profiles have their centroids (weighted averages) at the origin.
Thus, in CA plots, the origin represents the (weighted) average row profile and column
profile.
reciprocal averages: CA assigns scores to the row and column categories such that the column
scores are proportional to the weighted averages of the row scores, and vice-versa.
chi-square distances: In principal coordinates, the row coordinates may be shown equal to the
row profiles $D_r^{-1} P$, rescaled inversely by the square-root of the column masses,
$D_c^{-1/2}$. Distances between two row profiles, $R_i$ and $R_{i'}$, are most sensibly defined
as $\chi^2$ distances, where the squared difference $[R_{ij} - R_{i'j}]^2$ is inversely weighted
by the column frequency, to account for the different relative frequency of the column
categories. The rescaling by $D_c^{-1/2}$ transforms this weighted $\chi^2$ metric into ordinary
Euclidean distance. The same is true of the column principal coordinates.
interpretation of distances: In principal coordinates, the distance between two row points may
be interpreted as described above, and so may the distance between two column points.
The distance between a row and column point, however, does not have a clear distance
interpretation.
residuals from independence: The distance between a row and column point does have a rough
interpretation in terms of residuals or the difference between observed and expected
frequencies, nij − mij . Two row (or column) points deviate from the origin (the average
profile) in a similar direction when their profile frequencies have similar values. A row point
appears in a similar direction away from the origin as a column point when nij − mij > 0, and
in an opposite direction from that column point when the residual is negative.
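Two of the properties listed above, centroids at the origin and χ2 distances, are easy to verify numerically. The following base R sketch uses the hair color by eye color table; chi_dist() is a hypothetical helper defined here for illustration, and the coordinates are obtained from the SVD of the standardized residuals.

```r
N <- unclass(margin.table(HairEyeColor, c(1, 2)))
P <- N / sum(N)
r <- rowSums(P)
cm <- colSums(P)
R <- P / r                                 # row profiles (each row sums to 1)

# Chi-square distance between two profiles: squared differences weighted by 1 / c_j
chi_dist <- function(a, b) sqrt(sum((a - b)^2 / cm))

# Row principal coordinates on all axes
S <- diag(1 / sqrt(r)) %*% (P - outer(r, cm)) %*% diag(1 / sqrt(cm))
dec <- svd(S)
Fr <- diag(1 / sqrt(r)) %*% dec$u %*% diag(dec$d)
rownames(Fr) <- rownames(N)

# The chi-square distance between two row profiles ...
d_profiles <- chi_dist(R["Black", ], R["Blond", ])
# ... equals the ordinary Euclidean distance between their principal coordinates
d_coords <- sqrt(sum((Fr["Black", ] - Fr["Blond", ])^2))

# Distance of each row profile from the average profile (the origin of the map)
chidist_origin <- apply(R, 1, chi_dist, b = cm)

# Weighted centroid of the row points: zero on every axis
centroid <- colSums(r * Fr)
round(c(d_profiles = d_profiles, d_coords = d_coords), 5)
round(chidist_origin, 5)
```

The values in chidist_origin correspond to the ChiDist row of the printed output of ca() for these data.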
MASS: corresp(); the plot method calls biplot() for a 2 factor solution, using a symmetric
biplot factorization that scales the row and column points by the square roots of the
singular values. There is also a mca() function for multiple correspondence analysis.
6.2 Simple correspondence analysis [supp-pdf.mkii ] 221
ca: ca(); provides 2D plots via the plot.ca() method and interactive (rgl) 3D plots via
plot3d.ca(). This package is the most comprehensive in terms of plotting options
for various coordinate types, plotting supplementary points. It also provides mjca() for
multiple and joint correspondence analysis of higher-way tables.
FactoMineR: CA(); provides a wide variety of measures for the quality of the CA representation
and many options for graphical display.
These methods also differ in terms of the types of input they accept. For example, MASS::corresp
handles matrices, data frames and "xtabs" objects, but not "table" objects. ca::ca handles
two-way tables and matrices, but requires other formats to be converted to these forms. In the
following, we largely use the ca package.
##
## Principal inertias (eigenvalues):
## 1 2 3
## Value 0.208773 0.022227 0.002598
## Percentage 89.37% 9.52% 1.11%
##
##
## Rows:
## Black Brown Red Blond
## Mass 0.18243 0.48311 0.1199 0.2145
## ChiDist 0.55119 0.15946 0.3548 0.8384
## Inertia 0.05543 0.01228 0.0151 0.1508
## Dim. 1 -1.10428 -0.32446 -0.2835 1.8282
## Dim. 2 1.44092 -0.21911 -2.1440 0.4667
##
##
## Columns:
## Brown Blue Hazel Green
## Mass 0.37162 0.3632 0.15710 0.10811
## ChiDist 0.50049 0.5537 0.28865 0.38573
## Inertia 0.09309 0.1113 0.01309 0.01608
## Dim. 1 -1.07713 1.1981 -0.46529 0.35401
## Dim. 2 0.59242 0.5564 -1.12278 -2.27412
In the printed output, the table labeled “Principal inertias (eigenvalues)” indicates that nearly
99% of the Pearson χ2 for association is accounted for by two dimensions, with most of that
attributed to the first dimension.
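The percentages in the printed table follow directly from the eigenvalues. A small sketch of the arithmetic, with the three principal inertias copied from the output above:

```r
# Principal inertias (eigenvalues) as printed for the hair color by eye color table
ev <- c(0.208773, 0.022227, 0.002598)

pct <- 100 * ev / sum(ev)       # share of the total inertia on each dimension
cum <- cumsum(pct)              # cumulative share

data.frame(dim = seq_along(ev), value = ev,
           pct = round(pct, 2), cum = round(cum, 2))
```

The second entry of cum is the "nearly 99%" accounted for by two dimensions.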
The summary method for "ca" objects gives a more nicely formatted display, showing a
scree plot of the eigenvalues, a portion of which is shown below.
summary(haireye.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.208773 89.4 89.4 *************************
## 2 0.022227 9.5 98.9 **
## 3 0.002598 1.1 100.0
## -------- -----
## Total: 0.233598 100.0
...
The result returned by ca() can be plotted using the plot.ca() method. However, it
is useful to understand that ca() returns the CA solution in terms of standard coordinates, Φ
(rowcoord) and Γ (colcoord). We illustrate Eqn. (6.4) and Eqn. (6.5) using the components
of the "ca" object haireye.ca.
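Gamma <- haireye.ca$colcoord            # standard coordinates of the columns
Dc <- diag(haireye.ca$colmass)          # column masses as weights
zapsmall(t(Gamma) %*% Dc %*% Gamma)     # identity matrix: unit inertia on each axis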
These standard coordinates are transformed internally within the plot function according to
the map argument, which defaults to map="symmetric", giving principal coordinates. The
following call to plot.ca() produces Figure 6.1.
[Figure 6.1 plot: hair color and eye color category points; axes: Dimension 1 (89.37%) and
Dimension 2 (9.51%)]
Figure 6.1: Correspondence analysis solution for the Hair color and Eye color data
For use in further customizing such plots (as we will see in the next example), the function
plot.ca() returns (invisibly; see footnote 1) the coordinates for the row and column points
actually plotted, which we saved above as res:
res
## $rows
## Dim1 Dim2
## Black -0.50456 0.214820
## Brown -0.14825 -0.032666
## Red -0.12952 -0.319642
## Blond 0.83535 0.069579
##
## $cols
## Dim1 Dim2
## Brown -0.49216 0.088322
## Blue 0.54741 0.082954
## Hazel -0.21260 -0.167391
## Green 0.16175 -0.339040
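The relation between the standard coordinates printed by ca() and the principal coordinates in res can be checked by hand. In this sketch the standard coordinates of the hair colors and the first two principal inertias are copied from the printed output above, so the result agrees with res only up to the rounding of those printed values.

```r
# Standard coordinates of the hair colors, copied from the printed ca() output
Phi <- rbind(Black = c(-1.10428,  1.44092),
             Brown = c(-0.32446, -0.21911),
             Red   = c(-0.28350, -2.14400),
             Blond = c( 1.82820,  0.46670))
colnames(Phi) <- c("Dim1", "Dim2")

# Singular values are the square roots of the principal inertias
sv <- sqrt(c(0.208773, 0.022227))

# Principal coordinates: scale each column (axis) by its singular value
Fr <- sweep(Phi, 2, sv, "*")
round(Fr, 4)
```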
Footnote 1: This uses features incorporated in the ca package, version 0.54+.
It is important to understand that in CA plots (and related biplots, Section 6.6), the
interpretation of distances between points (and angles between vectors) is meaningful. In order to
achieve this, the axes in such plots must be equated, meaning that the two axes are scaled so that
the number of data units per inch is the same for both the horizontal and vertical axes, that is,
an aspect ratio of 1 (see footnote 2).
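To see what equated axes mean in practice, here is a minimal base R graphics sketch that redraws the points from res with asp = 1 and then queries the device to confirm that one data unit covers the same length on both axes. The coordinates are copied from the output above; the colors are arbitrary choices for this sketch.

```r
# Principal coordinates as returned by plot.ca() for the hair and eye colors
rows <- rbind(Black = c(-0.50456,  0.214820), Brown = c(-0.14825, -0.032666),
              Red   = c(-0.12952, -0.319642), Blond = c( 0.83535,  0.069579))
cols <- rbind(Brown = c(-0.49216,  0.088322), Blue  = c( 0.54741,  0.082954),
              Hazel = c(-0.21260, -0.167391), Green = c( 0.16175, -0.339040))
pts <- rbind(rows, cols)

# asp = 1 makes one data unit equally long in the x and y directions
plot(pts, asp = 1, type = "n", xlab = "Dimension 1", ylab = "Dimension 2")
abline(h = 0, v = 0, col = "gray")
text(rows, labels = rownames(rows), col = "blue")
text(cols, labels = rownames(cols), col = "red")

# Data units per inch on each axis; the two values agree when the axes are equated
usr <- par("usr")
pin <- par("pin")
units_per_inch <- c(x = (usr[2] - usr[1]) / pin[1],
                    y = (usr[4] - usr[3]) / pin[2])
units_per_inch
```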
The interpretation of the CA plot in Figure 6.1 is then as follows:
• Dimension 1, accounting for nearly 90% of the association between hair and eye color
corresponds to dark (left) vs. light (right) on both variables.
• Dimension 2 largely contrasts red hair and green eyes with the remaining categories, ac-
counting for an additional 9.5% of the Pearson χ2 .
• With equated axes, and a symmetric map, the distances between row points and column
points are meaningful. Along Dimension 1, the eye colors could be considered roughly
equally spaced, but for the hair colors, Blond is quite different in terms of its frequency
profile.
data("Mental", package="vcdExtra")
mental.tab <- xtabs(Freq ~ ses + mental, data=Mental)
mental.ca <- ca(mental.tab)
summary(mental.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.026025 93.9 93.9 *************************
## 2 0.001379 5.0 98.9 *
## 3 0.000298 1.1 100.0
## -------- -----
## Total: 0.027702 100.0
...
The scree plot produced by summary(mental.ca) shows that the association between
mental health and parents’ SES is almost entirely 1-dimensional, with 94% of the χ2 (45.98,
with 15 df) accounted for by Dimension 1.
We then plot the solution as shown below, giving Figure 6.2. For this example, it is useful to
connect the row points and the column points by lines, to emphasize the pattern of these ordered
variables.
Footnote 2: In base R graphics, this is achieved with the plot() option asp=1.
[Figure 6.2 plot: SES categories (1 to 6) and mental health categories (Well, Mild, Moderate,
Impaired); horizontal axis: Dimension 1 (93.9%)]
Figure 6.2: Correspondence analysis solution for the Mental health data
The plot of the CA scores in Figure 6.2 shows that diagnostic mental health categories are
well-aligned with Dimension 1. The mental health scores are approximately equally spaced,
except that the two intermediate categories are a bit closer on this dimension than the extremes.
The SES categories are also aligned with Dimension 1, and approximately equally spaced, with
the exception of the highest two SES categories, whose profiles are extremely similar, suggesting
that these two categories could be collapsed.
Because both row and column categories have the same pattern on Dimension 1, we may
interpret the plot as showing that the profiles of both variables are ordered, and their relation can
be explained as a positive association between high parents’ SES and higher mental health status
of children. A mosaic display of these data (Exercise 6.5) would show a characteristic opposite
corner pattern of association.
From a modeling perspective, we might ask how strong is the evidence for the spacing of
categories noted above. For example, we might ask whether assigning integer scores to the levels
of SES and mental impairment provides a simpler, but satisfactory account of their association.
Questions of this type can be explored in connection with loglinear models in Chapter 8.
The data set RepVict in the vcd package gives an 8 × 8 table (from Fienberg (1980, Table
2-8)) on repeat victimization for various crimes among respondents to a U.S. National Crime
Survey. A special feature of this data set is that row and column categories reflect the same
crimes, so substantial association is expected. Here we examine correspondence analysis results in
a bit more detail and also illustrate how to customize the displays created by plot(ca(...)).
data("RepVict", package="vcd")
victim.ca <- ca(RepVict)
summary(victim.ca)
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.065456 33.8 33.8 *************************
## 2 0.059270 30.6 64.5 **********************
## 3 0.029592 15.3 79.8 **********
## 4 0.016564 8.6 88.3 *****
## 5 0.011140 5.8 94.1 ***
## 6 0.007587 3.9 98.0 **
## 7 0.003866 2.0 100.0
## -------- -----
## Total: 0.193474 100.0
...
The results above show that, for this 8 × 8 table, 7 dimensions are required for an exact
solution, of which the first two account for 64.5% of the Pearson χ2 . The lines below illustrate
that the Pearson χ2 is n times the sum of the squared singular values, $n \sum_i \lambda_i^2$.
chisq.test(RepVict)
##
## Pearson's Chi-squared test
##
## data: RepVict
## X-squared = 11131, df = 49, p-value < 2.2e-16
sum(RepVict) * sum(victim.ca$sv^2)   # n times the sum of squared singular values
## [1] 11131
The default plot produced by plot.ca(victim.ca) plots both points and labels for the
row and column categories. However, what we want to emphasize here is the relation between
the same crimes on the first and second occurrence.
To do this, we label each crime just once (using labels=c(2,0)) and connect the two
points for each crime by a line, using segments(), as shown in Figure 6.3. The addition of a
legend() makes the plot more easily readable.
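A sketch of how these steps might be combined is given below. It assumes that the ca and vcd packages are installed and that plot.ca() returns the plotted coordinates as a list with components rows and cols, as described earlier; the legend colors and symbols are guesses at the plot defaults and may need adjusting.

```r
# Run only when both packages are available (an assumption of this sketch)
have_pkgs <- requireNamespace("ca", quietly = TRUE) &&
             requireNamespace("vcd", quietly = TRUE)

if (have_pkgs) {
  data("RepVict", package = "vcd")
  victim.ca <- ca::ca(RepVict)

  # Symbols and labels for the row points, symbols only for the column points
  res <- plot(victim.ca, labels = c(2, 0))

  # Join the first-occurrence (row) and second-occurrence (column) point of each crime
  segments(res$rows[, 1], res$rows[, 2], res$cols[, 1], res$cols[, 2])

  # Legend; colors and plotting symbols here are assumed, not read from the plot
  legend("topleft", legend = c("First", "Second"), title = "Occurrence",
         col = c("blue", "red"), pch = 16:17)
}
```

Because segments() is vectorized, a single call draws all eight connecting lines.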
[Figure 6.3 plot: crime categories for first and second occurrence (legend: Occurrence, First
and Second); axes: Dimension 1 (33.83%) and Dimension 2 (30.63%)]
Figure 6.3: 2D CA solution for the repeat victimization data. Lines connect the category points
for first and second occurrence to highlight these relations.
In Figure 6.3 it may be seen that most of the points are extremely close for the first and
second occurrence of a crime, indicating that the row profile for a crime is very similar to its
corresponding column profile, with Rape and Pickpocket as exceptions.
In fact, if the table were symmetric, the row and column points in Figure 6.3 would be identical,
as can be easily demonstrated by analyzing a symmetric version.
## [1] TRUE
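The same property can be illustrated without any add-on packages. The sketch below uses a small made-up symmetric table (not the victimization data) and compares the row and column principal coordinates from the SVD of the standardized residuals. Absolute values are compared because a numerical SVD may reflect an axis for one set of points.

```r
# A small made-up symmetric table: rows and columns index the same three categories
N <- matrix(c(50, 10,  5,
              10, 40,  8,
               5,  8, 30), nrow = 3, byrow = TRUE,
            dimnames = list(first = c("a", "b", "c"), second = c("a", "b", "c")))

P <- N / sum(N)
r <- rowSums(P)
cm <- colSums(P)                 # identical to r for a symmetric table

S <- diag(1 / sqrt(r)) %*% (P - outer(r, cm)) %*% diag(1 / sqrt(cm))
dec <- svd(S)

Fr <- diag(1 / sqrt(r))  %*% dec$u %*% diag(dec$d)   # row principal coordinates
Gc <- diag(1 / sqrt(cm)) %*% dec$v %*% diag(dec$d)   # column principal coordinates

# Row and column points coincide (up to a possible reflection of an axis)
all.equal(abs(Fr), abs(Gc))
```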
The first dimension appears to contrast crimes against the person (right) with crimes against
property (left), and it may be that the second dimension represents degree of violence associated
with each crime. The latter interpretation is consistent with the movement of Rape towards a
higher position and Pickpocket towards a lower one on this dimension.
For a two-way table, CA and mosaic displays give complementary views of the pattern of
association between the row and column variables, but both are based on the (Pearson) residuals
from independence. CA shows the row and column categories as points in a 2D (or 3D) space
accounting for the largest proportion of the Pearson χ2 , while mosaics show the association by the
pattern of shading in the mosaic tiles. It is useful to compare them directly to see how
associations can be interpreted from these graphs.
data("TV", package="vcdExtra")
TV2 <- margin.table(TV, c(1,3))
TV2
## Network
## Day ABC CBS NBC
## Monday 2847 2923 2629
## Tuesday 3110 2403 2568
## Wednesday 2434 1283 2212
## Thursday 1766 1335 5886
## Friday 2737 1479 1998
In this case, the 2D CA solution is exact, meaning that two dimensions account for 100% of
the association.
##
## Principal inertias (eigenvalues):
## 1 2
## Value 0.081934 0.010513
## Percentage 88.63% 11.37%
...
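That two dimensions are exact here follows from M = min(I − 1, J − 1) = min(4, 2) = 2 for a 5 × 3 table. A base R sketch, with the counts retyped from the printed TV2 table, shows the third singular value vanishing:

```r
# Day by Network counts, retyped from the printed TV2 table
TV2 <- matrix(c(2847, 2923, 2629,
                3110, 2403, 2568,
                2434, 1283, 2212,
                1766, 1335, 5886,
                2737, 1479, 1998), ncol = 3, byrow = TRUE,
              dimnames = list(Day = c("Monday", "Tuesday", "Wednesday",
                                      "Thursday", "Friday"),
                              Network = c("ABC", "CBS", "NBC")))

n <- sum(TV2)
m <- outer(rowSums(TV2), colSums(TV2)) / n      # expected counts under independence
sv <- svd((TV2 - m) / sqrt(n * m))$d            # three singular values for a 5 x 3 table
ev <- sv^2                                      # principal inertias

round(ev, 6)                            # the third value is zero up to rounding error
round(100 * ev[1:2] / sum(ev), 2)       # percentages for the two real dimensions
```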
The plot of this solution is shown in the left panel of Figure 6.4, using lines from the origin
to the category points for the networks.
An analogous mosaic display, informed by the CA solution, is shown in the right panel of
Figure 6.4. Here, the days of the week are reordered according to their positions on the first CA
dimension, another example of effect ordering.
In the CA plot, you can see that the dominant dimension separates viewing on Thursday,
with the largest share of viewers watching NBC, from the other weekdays. In the mosaic plot,
Thursday stands out as the only day with a higher than expected frequency for NBC, and this is
the largest residual in the entire table. The second dimension in the CA plot separates CBS, with
its greatest proportion of viewers on Monday, from ABC, with greater viewership on Wednesday
and Friday.
6.3 Properties of category scores [supp-pdf.mkii ] 229
[Figure 6.4 panels: left, CA plot with horizontal axis Dimension 1 (88.63%); right, mosaic
display of Network by Day, with days ordered Thursday, Wednesday, Friday, Tuesday, Monday]
Figure 6.4: CA plot and mosaic display for the TV viewing data. The days of the week in the
mosaic plot were permuted according to their order in the CA solution.
Emerson (1998, Fig. 2) gives a table listing the shows in each half-hour time slot. Could
the overall popularity of NBC on Thursday be due to Friends or Seinfeld? An answer to this
and similar questions requires analysis of the three-way table (Exercise 6.7) and model-based
methods for polytomous outcome variables described in Section 7.6.4.
[Figure 6.5 diagram: an I x J x K table rearranged as an (I J) x K table]
Figure 6.5: Stacking approach for a three-way table. Two of the table variables are combined
interactively to form the rows of a two-way table.
which are the ML estimates of expected frequencies for the log-linear model [AB][C]. The χ2
that is decomposed by correspondence analysis is the Pearson χ2 for this log-linear model. When
the table is stacked as I × (J × K) or J × (I × K), correspondence analysis decomposes the
residuals from the log-linear models [A][BC] and [B][AC], respectively, as shown in Table 6.1.
In this approach, only the associations in separate [ ] terms are analysed and displayed in the
correspondence analysis maps. Van der Heijden and de Leeuw (1985) show how a generalized
form of correspondence analysis can be interpreted as decomposing the difference between two
specific loglinear models, so their approach is more general than is illustrated here.
Table 6.1: Each way of stacking a three-way table corresponds to a loglinear model
Stacking structure    Loglinear model
(I × J) × K           [AB][C]
I × (J × K)           [A][BC]
J × (I × K)           [B][AC]
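The first row of Table 6.1 can be verified numerically in base R. This sketch stacks the built-in 4 × 4 × 2 HairEyeColor table (Hair, Eye, Sex) as an (I J) × K matrix and compares the Pearson χ2 for independence in the stacked table with the Pearson χ2 of the loglinear model [AB][C] fitted by loglin(). The choice of data set is purely illustrative.

```r
tab <- HairEyeColor                     # I x J x K = 4 x 4 x 2 (Hair, Eye, Sex)

# Stack as (I J) x K: rows are Hair-Eye combinations, columns are Sex
stacked <- matrix(tab, nrow = prod(dim(tab)[1:2]), ncol = dim(tab)[3])

# Pearson chi-square for independence in the stacked two-way table
n <- sum(stacked)
m <- outer(rowSums(stacked), colSums(stacked)) / n
chisq_stacked <- sum((stacked - m)^2 / m)

# Pearson chi-square for the loglinear model [AB][C] on the three-way table
fit <- loglin(tab, margin = list(c(1, 2), 3), fit = TRUE, print = FALSE)

c(stacked = chisq_stacked, loglinear = fit$pearson, df = fit$df)
```

The two statistics agree, so a correspondence analysis of the stacked table decomposes exactly the χ2 of that loglinear model.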
printable form, where some variables are assigned to the rows and the others to the columns. Both
ftable() and structable() have as.matrix() methods (see footnote 3) that convert their result
into a matrix suitable as input to ca().
With data in the form of a frequency data frame, you can easily create interactive coding using
interaction() or simply use paste() to join the levels of stacked variables together.
To illustrate, create a 4-way table of random Poisson counts (with constant mean, λ = 15) of
types of Pet, classified by Age, Color and Sex.
set.seed(1234)
dim <- c(3, 2, 2, 2)
tab <- array(rpois(prod(dim), 15), dim=dim)
dimnames(tab) <- list(Pet=c("dog","cat","bird"),
Age=c("young","old"),
Color=c("black", "white"),
Sex=c("male", "female"))
You can use ftable() to print this, with a formula that assigns Pet and Age to the columns
and Color and Sex to the rows.
Then, as.matrix() creates a matrix with the levels of the stacked variables combined with
some separator character. Using ca(pet.mat) would then calculate the CA solution for the
stacked table, analyzing only the associations in the loglinear model [PetAge][ColorSex]
(see footnote 4).
(pet.mat <- as.matrix(ftable(Pet + Age ~ Color + Sex, tab), sep='.'))
## Pet.Age
## Color.Sex dog.young dog.old cat.young cat.old bird.young bird.old
## black.male 10 12 16 16 16 12
## black.female 8 12 13 15 11 13
## white.male 18 11 12 18 13 20
## white.female 13 13 16 15 12 15
With data in a frequency data frame, a similar result (as a frequency table), can be obtained
using interaction() as shown below. The result of xtabs() looks the same as pet.mat.
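A minimal base R sketch of this approach follows. It regenerates the random table so that it runs on its own, converts it to a frequency data frame, and uses interaction() with lex.order = TRUE so that the first factor varies slowest, matching the row and column order produced by ftable(). The names pet.df and pet.tab are illustrative.

```r
set.seed(1234)
dims <- c(3, 2, 2, 2)
tab <- array(rpois(prod(dims), 15), dim = dims,
             dimnames = list(Pet = c("dog", "cat", "bird"),
                             Age = c("young", "old"),
                             Color = c("black", "white"),
                             Sex = c("male", "female")))

# Frequency data frame: one row per cell, with the count in Freq
pet.df <- as.data.frame(as.table(tab))

# Interactive coding of the stacked variables
pet.df$Color.Sex <- interaction(pet.df$Color, pet.df$Sex, lex.order = TRUE)
pet.df$Pet.Age   <- interaction(pet.df$Pet, pet.df$Age, lex.order = TRUE)

# Two-way table of the stacked variables
pet.tab <- xtabs(Freq ~ Color.Sex + Pet.Age, data = pet.df)
pet.tab
```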
Footnote 3: This requires at least R version 3.1.0 or vcd 1.3-2 or later.
Footnote 4: The result would not be at all interesting here. Why?
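data("Suicide", package="vcd")
# interactive coding of sex and age.group
Suicide <- within(Suicide, {
  age_sex <- paste(age.group, toupper(substr(sex,1,1)))
})
# stacked two-way table: age-sex combinations by method
suicide.tab <- xtabs(Freq ~ age_sex + method2, data=Suicide)
suicide.tab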
## method2
## age_sex poison gas hang drown gun knife jump other
## 10-20 F 921 40 212 30 25 11 131 100
## 10-20 M 1160 335 1524 67 512 47 189 464
## 25-35 F 1672 113 575 139 64 41 276 263
## 25-35 M 2823 883 2751 213 852 139 366 775
## 40-50 F 2224 91 1481 354 52 80 327 305
## 40-50 M 2465 625 3936 247 875 183 244 534
## 55-65 F 2283 45 2014 679 29 103 388 296
## 55-65 M 1531 201 3581 207 477 154 273 294
## 70-90 F 1548 29 1355 501 3 74 383 106
## 70-90 M 938 45 2948 212 229 105 268 147
The results of the correspondence analysis of this table are shown below:
6.4 Multi-way tables: Stacking and other tricks [supp-pdf.mkii ] 233
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.096151 57.2 57.2 *************************
## 2 0.059692 35.5 92.6 ****************
## 3 0.008183 4.9 97.5 **
## 4 0.002158 1.3 98.8 *
## 5 0.001399 0.8 99.6
## 6 0.000557 0.3 100.0
## 7 6.7e-05 0.0 100.0
## -------- -----
## Total: 0.168207 100.0
...
It can be seen that 92.6% of the χ2 for this model is accounted for in the first two dimensions.
Plotting these gives the display shown in Figure 6.6.
plot(suicide.ca)
[Figure 6.6 plot: age-sex groups and suicide methods; axes: Dimension 1 (57.16%) and
Dimension 2 (35.49%)]
Figure 6.6: 2D CA solution for the stacked [AgeSex][Method] table of the suicide data
Dimension 1 in the plot separates males (right) and females (left), indicating a large difference
between suicide profiles of males and females with respect to methods of suicide. The second
dimension is mostly ordered by age with older groups at the top and younger groups at the bottom.
Note also that the positions of the age groups are roughly parallel for the two sexes. Such a
pattern indicates that sex and age do not interact in this analysis.
The relation between the age–sex groups and methods of suicide can be approximately inter-
preted in terms of similar distance and direction from the origin, which represents the marginal
row and column profiles. Young males are more likely to commit suicide by gas or a gun, older
males by hanging, while young females are more likely to ingest some toxic agent and older
females by jumping or drowning.
As discussed in Chapter 5, mosaic plots are sensitive both to the order of variables used in
successive splits, and to the order of levels within variables, and are most effective when these
orders are chosen to reflect some meaningful ordering.
In the present example, method2 is an unordered table factor, but Figure 6.6 shows that the
methods of suicide vary systematically with both sex and age, corresponding to dimensions 1 and
2 respectively. Here we choose to reorder the table according to the coordinates on Dimension 1.
We also delete the low-frequency "other" category to simplify the display.
To construct the mosaic display for the same model analysed by correspondence analysis, we
use the argument expected=~age.group*sex + method2 to supply the model formula.
For this large table, it is useful to tweak the labels for the method2 variable to reduce over-
plotting; the labeling_args argument provides many options for customizing strucplot
displays.
This figure (Figure 6.7) again shows that gun and gas are most prevalent among younger males,
with their use decreasing with age, whereas use of hang increases with age. For females, these
three methods are used less frequently, whereas poison, jump, and drown occur more often. You
can also see that for females the excess prevalence of these high frequency methods varies
somewhat less with age than it does for males.
## method2
## age.group poison gas hang drown gun knife jump other
## 10-20 2081 375 1736 97 537 58 320 564
## 25-35 4495 996 3326 352 916 180 642 1038
## 40-50 4689 716 5417 601 927 263 571 839
## 55-65 3814 246 5595 886 506 257 661 590
## 70-90 2486 74 4303 713 232 179 651 253
To treat the levels of sex as supplementary points, we calculate the two-way table of sex and
method, and append this to the suicide.tab2 as additional rows:
In the call to ca(), we then indicate these last two rows as supplementary:
##
## Principal inertias (eigenvalues):
##
## dim value % cum% scree plot
## 1 0.060429 93.9 93.9 *************************
## 2 0.002090 3.2 97.1 *
## 3 0.001479 2.3 99.4
## 4 0.000356 0.6 100.0
## -------- -----
## Total: 0.064354 100.0
##
...
This CA analysis has the same total Pearson chi-square, χ2 (28) = 3422.5 as the result of
chisq.test(suicide.tab2). However, the scree plot display above shows that the as-
sociation between age and method is essentially one-dimensional, but note also that dimension
1 (“age-method”) in this analysis has nearly the same inertia (0.0604) as the second dimension
(0.0596) in the analysis of the stacked table. We plot the CA results as shown below (see Fig-
ure 6.7), and add a line connecting the supplementary points for sex.
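Supplementary points do not influence the solution; they are simply projected into it afterwards. The base R sketch below illustrates this with the counts retyped from the stacked table printed earlier: the age by method table is analyzed through the SVD of its standardized residuals, and the sex by method rows are then located by multiplying their row profiles by the standard column coordinates (the usual transition formula, used here as an assumption about what the supplementary-row option computes).

```r
methods <- c("poison", "gas", "hang", "drown", "gun", "knife", "jump", "other")
ages <- c("10-20", "25-35", "40-50", "55-65", "70-90")

# Stacked age-sex by method counts, retyped from the table printed earlier
counts <- c( 921,  40,  212,  30,  25,  11, 131, 100,   # 10-20 F
            1160, 335, 1524,  67, 512,  47, 189, 464,   # 10-20 M
            1672, 113,  575, 139,  64,  41, 276, 263,   # 25-35 F
            2823, 883, 2751, 213, 852, 139, 366, 775,   # 25-35 M
            2224,  91, 1481, 354,  52,  80, 327, 305,   # 40-50 F
            2465, 625, 3936, 247, 875, 183, 244, 534,   # 40-50 M
            2283,  45, 2014, 679,  29, 103, 388, 296,   # 55-65 F
            1531, 201, 3581, 207, 477, 154, 273, 294,   # 55-65 M
            1548,  29, 1355, 501,   3,  74, 383, 106,   # 70-90 F
             938,  45, 2948, 212, 229, 105, 268, 147)   # 70-90 M
stacked <- matrix(counts, ncol = 8, byrow = TRUE,
                  dimnames = list(paste(rep(ages, each = 2), c("F", "M")), methods))
is_female <- rep(c(TRUE, FALSE), times = 5)

# Active table (age by method) and supplementary rows (sex by method)
age_tab <- rowsum(stacked, group = rep(ages, each = 2))
sex_tab <- rbind(female = colSums(stacked[is_female, ]),
                 male   = colSums(stacked[!is_female, ]))

# CA of the active table via the SVD of the standardized residuals
P <- age_tab / sum(age_tab)
r <- rowSums(P)
cm <- colSums(P)
dec <- svd(diag(1 / sqrt(r)) %*% (P - outer(r, cm)) %*% diag(1 / sqrt(cm)))
Gamma <- diag(1 / sqrt(cm)) %*% dec$v[, 1:4]     # standard column coordinates

# Supplementary points: row profiles of sex_tab projected onto the CA axes
sup_coords <- (sex_tab / rowSums(sex_tab)) %*% Gamma

chisq <- sum(age_tab) * sum(dec$d^2)             # Pearson chi-square of age_tab
round(c(chisq = chisq, inertia1 = dec$d[1]^2), 4)
round(sup_coords[, 1:2], 4)
```

Because the two sex rows add up to the column totals, their points lie on opposite sides of the origin, with the frequency-weighted average of the two exactly at the origin.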
[Figure 6.7 plot: age groups and suicide methods, with supplementary points for female and
male; horizontal axis: Dimension 1 (93.90%)]
Figure 6.7: 2D CA solution for the [Age] [Method] marginal table. Category points for Sex are
shown as supplementary points
Comparing this graph with Figure 6.6, you can see that ignoring sex has collapsed the
differences between males and females which were the dominant feature of the analysis including
sex. The dominant feature in Figure 6.7 is the Dimension 1 ordering of both age and method.
However, as in Figure 6.6, the supplementary points for sex point toward the methods that are
more prevalent for females and males.
6.5 Multiple correspondence analysis [supp-pdf.mkii ] 237
## 3 Red Brown 26 0 0 1 0 1 0 0 0
## 4 Blond Brown 7 0 0 0 1 1 0 0 0
## 5 Black Blue 20 1 0 0 0 0 1 0 0
## 6 Brown Blue 84 0 1 0 0 0 1 0 0
## 7 Red Blue 17 0 0 1 0 0 1 0 0
## 8 Blond Blue 94 0 0 0 1 0 1 0 0
## 9 Black Hazel 15 1 0 0 0 0 0 1 0
## 10 Brown Hazel 54 0 1 0 0 0 0 1 0
## 11 Red Hazel 14 0 0 1 0 0 0 1 0
## 12 Blond Hazel 10 0 0 0 1 0 0 1 0
## 13 Black Green 5 1 0 0 0 0 0 0 1
## 14 Brown Green 29 0 1 0 0 0 0 0 1
## 15 Red Green 14 0 0 1 0 0 0 0 1
## 16 Blond Green 16 0 0 0 1 0 0 0 1
Thus, the first row in haireye.df represents the 68 individuals having black hair (h1=1)
and brown eyes (e1=1). The indicator matrix Z is then computed by replicating the rows in
haireye.df according to the Freq value, using the function expand.dft. The result has
592 rows and 8 columns.
Z <- expand.dft(haireye.df)[,-(1:2)]
dim(Z)
Note that if the indicator matrix is partitioned as $Z = [Z_1, Z_2]$, corresponding to the two
sets of categories, then the contingency table is given by $N = Z_1^T Z_2$.
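This identity is easy to confirm in base R. The sketch below builds the indicator matrix with model.matrix() rather than expand.dft() (a substitution made here so that no add-on package is needed) and checks that the cross-product of the two blocks reproduces the hair color by eye color table.

```r
haireye <- margin.table(HairEyeColor, c(1, 2))
haireye.df <- as.data.frame(haireye)          # 16 rows: Hair, Eye, Freq

# One row per individual: repeat each Hair-Eye combination Freq times
idx <- rep(seq_len(nrow(haireye.df)), haireye.df$Freq)
cases <- haireye.df[idx, c("Hair", "Eye")]

# Indicator (dummy) coding for each factor, without an intercept column
Z1 <- model.matrix(~ Hair - 1, data = cases)
Z2 <- model.matrix(~ Eye - 1, data = cases)
Z <- cbind(Z1, Z2)
dim(Z)                                        # 592 rows and 8 columns

# The cross-product of the two indicator blocks recovers the contingency table
N <- crossprod(Z1, Z2)
all(N == unclass(haireye))
```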
With this setup, MCA can be described as the application of the simple correspondence
analysis algorithm to the indicator matrix Z. This analysis would yield scores for the rows of Z
(the cases), usually not of direct interest, and for the columns (the categories of both
variables). As in simple CA, each row point is the weighted average of the scores for the column
categories, and each column point is the weighted average of the scores for the row observations
(see footnote 5).
Consequently, the point for any category is the centroid of all the observations with a response
in that category, and all observations with the same response pattern coincide. As well, the origin
reflects the weighted average of the categories for each variable. As a result, category points with
low marginal frequencies will be located further away from the origin, while categories with high
marginal frequencies will be closer to the origin. For a binary variable, the two category points
will appear on a line through the origin, with distances inversely proportional to their marginal
frequencies.
5
Note that, in principle, this use of an indicator matrix could be extended to three (or more) variables. That
extension is more easily described using an equivalent form, the Burt matrix, described in Section 6.5.2.
In the call to plot.ca, the argument what is used to suppress the display of the row points
for the cases. The plot shown in Figure 6.9 is an enhanced version of this basic plot.
Comparing Figure 6.9 with Figure 6.1, we see that the general pattern of the hair color and eye
color categories is the same in the analysis of the contingency table (Figure 6.1) and the analysis
of the indicator matrix (Figure 6.9), except that the axes are scaled differently—the display has
been stretched along the second (vertical) dimension. The interpretation is the same: Dimension
1 reflects a dark–light ordering of both hair and eye colors, and Dimension 2 reflects something
that largely distinguishes red hair and green eyes from the other categories.
Indeed, it can be shown (Greenacre, 1984, 2007) that the two displays are identical, except
for changes in scales along the axes. There is no difference at all between the displays in stan-
dard coordinates. Greenacre (1984, pp. 130–134) describes the precise relations between the
geometries of the two analyses.
Aside from the largely cosmetic difference in relative scaling of the axes, a major difference
between analysis of the contingency table and analysis of the indicator matrix is in the decompo-
sition of principal inertia and corresponding χ2 contributions for the dimensions. The plot axes
in Figure 6.9 indicate 24.3% and 19.2% for the contributions of the two dimensions, whereas
Figure 6.1 shows 89.4% and 9.5%. This difference is the basis for the more general development
of MCA methods and is reflected in the mjca() function illustrated later in this chapter. But
first, we describe a second approach to extending simple CA to the multivariate case based on the
Burt matrix.
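The difference in the inertia percentages can be reproduced without the ca package. The sketch below is only an illustration under stated assumptions: ca_svd() is a hypothetical helper written here (not a function from any package) that carries out simple CA as the SVD of the standardized residuals, and the data are the base-R HairEyeColor table collapsed over Sex.

```r
# Hypothetical helper: simple CA of a non-negative matrix via the SVD
ca_svd <- function(N) {
  N   <- unclass(as.matrix(N))
  P   <- N / sum(N)                        # correspondence matrix
  r   <- rowSums(P)                        # row masses
  cm  <- colSums(P)                        # column masses
  S   <- (P - outer(r, cm)) / sqrt(outer(r, cm))
  dec <- svd(S)
  list(inertia = dec$d^2,                  # principal inertias
       colstd  = dec$v / sqrt(cm))         # column standard coordinates
}

haireye <- margin.table(HairEyeColor, 1:2)
df      <- as.data.frame(haireye)
cases   <- df[rep(seq_len(nrow(df)), df$Freq), c("Hair", "Eye")]
Z       <- cbind(model.matrix(~ Hair - 1, cases),
                 model.matrix(~ Eye  - 1, cases))

in_N <- ca_svd(haireye)$inertia            # analysis of the 4 x 4 table
in_Z <- ca_svd(Z)$inertia                  # analysis of the 592 x 8 indicator matrix

pct_N <- 100 * in_N[1:2] / sum(in_N)
pct_Z <- 100 * in_Z[1:2] / sum(in_Z)
round(rbind(table = pct_N, indicator = pct_Z), 1)
sum(in_Z)                                  # total inertia of the indicator matrix
```

The two rows of the printed matrix should correspond to the percentages quoted above for the contingency table and for the indicator matrix, respectively.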
Burt
The standard coordinates from an analysis of the Burt matrix B are identical to those of Z.
(However, the singular values of B are the squares of those of Z.) Then, the following code,
using Burt produces the same display of the category points for hair color and eye color as
shown for the indicator matrix Z in Figure 6.9.
Burt.ca <- ca(Burt)
plot(Burt.ca)
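The stated relation between the two analyses can be checked numerically. The following base-R sketch forms the Burt matrix as the cross-product of the indicator matrix and compares the two solutions; ca_svd() is again a hypothetical helper defined here for illustration, not part of the ca package.

```r
# Hypothetical helper: singular values and column standard coordinates of a CA
ca_svd <- function(N) {
  N   <- unclass(as.matrix(N))
  P   <- N / sum(N)
  r   <- rowSums(P)
  cm  <- colSums(P)
  dec <- svd((P - outer(r, cm)) / sqrt(outer(r, cm)))
  list(sv = dec$d, colstd = dec$v / sqrt(cm))
}

haireye <- margin.table(HairEyeColor, 1:2)
df      <- as.data.frame(haireye)
cases   <- df[rep(seq_len(nrow(df)), df$Freq), c("Hair", "Eye")]
Z       <- cbind(model.matrix(~ Hair - 1, cases),
                 model.matrix(~ Eye  - 1, cases))

B <- crossprod(Z)                 # 8 x 8 Burt matrix, t(Z) %*% Z

ca_Z <- ca_svd(Z)
ca_B <- ca_svd(B)

# Singular values of B versus squared singular values of Z
# (only the J - Q = 6 non-trivial dimensions are compared)
cbind(B = ca_B$sv[1:6], Z_squared = ca_Z$sv[1:6]^2)

# Category standard coordinates agree up to a reflection of each axis
max(abs(abs(ca_B$colstd[, 1:2]) - abs(ca_Z$colstd[, 1:2])))
```

Absolute values are compared in the last line because the sign of each singular vector is arbitrary.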
categorical variables, and variable $q$ has $J_q$ categories, then the $Q$-way contingency table, of size
$J = \prod_{q=1}^{Q} J_q = J_1 \times J_2 \times \cdots \times J_Q$, with a total of $n = n_{++\cdots}$ observations may be represented
• The inertia contributed by a given variable increases with the number of response cate-
gories.
• The centroid of the categories for each discrete variable is at the origin of the display.
• For a particular variable, the inertia contributed by a given category increases as the marginal
frequency in that category decreases. Low frequency points therefore appear further from
the origin.
• The category points for a binary variable lie on a line through the origin. The distance of
each point to the origin is inversely related to the marginal frequency.
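The centroid and binary-variable properties in this list can be verified directly. The sketch below is an illustration only, using the 4-way Titanic table from the base-R datasets package and an SVD-based CA of its indicator matrix rather than mjca(); the names blocks, vname and G are introduced here and are not from the text.

```r
# Expand the 4-way Titanic table to one row per person
df    <- as.data.frame(Titanic)
cases <- df[rep(seq_len(nrow(df)), df$Freq), 1:4]

# Indicator matrix: one 0/1 column per category, built variable by variable
blocks <- lapply(names(cases), function(v) {
  Zv <- model.matrix(~ f - 1, data.frame(f = cases[[v]]))
  colnames(Zv) <- paste(v, levels(cases[[v]]), sep = ":")
  Zv
})
Z     <- do.call(cbind, blocks)
vname <- rep(names(cases), sapply(blocks, ncol))   # variable owning each column

# Simple CA of Z via the SVD of its standardized residuals
P   <- Z / sum(Z)
r   <- rowSums(P)
cm  <- colSums(P)                                  # category masses
dec <- svd((P - outer(r, cm)) / sqrt(outer(r, cm)))
G   <- dec$v[, 1:2] / sqrt(cm)                     # 2D category standard coordinates
rownames(G) <- colnames(Z)

# Mass-weighted centroid of the categories of each variable (should be ~ 0)
centroids <- rowsum(cm * G, vname)
round(centroids, 10)

# Binary variable: distance ratio versus inverse ratio of marginal frequencies
sex   <- G[vname == "Sex", ]
dists <- sqrt(rowSums(sex^2))
unname(dists["Sex:Male"] / dists["Sex:Female"])
unname(cm["Sex:Female"] / cm["Sex:Male"])
```

The last two values should agree, showing that the more frequent category (Male) lies closer to the origin by exactly the inverse ratio of the marginal frequencies.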
data("PreSex", package="vcd")
PreSex <- aperm(PreSex, 4:1) # order variables G, P, E, M
presex.df <- expand.dft(as.data.frame(PreSex))
This example analyzes the Burt matrix calculated from the presex.df data, specified as
lambda="Burt" in the call to mjca().
presex.mca <- mjca(presex.df, lambda="Burt")
summary(presex.mca)
The output from summary() seems to show that 77.6% of the total inertia is accounted for
in two dimensions. A basic, default plot of the MCA solution is provided by the plot() method
for "mjca" objects.
plot(presex.mca)
This plotting method is not very flexible in terms of control of graphical parameters or the
ability to add additional annotations (labels, lines, legend) to ease interpretation. Instead, we
use the plot method to create an empty plot (with no points or labels), and return the calculated
plot coordinates (res) for the categories. A bit of processing of the coordinates provides the
customized display shown in Figure 6.10.
As indicated above, the category points for each factor appear on lines through the origin,
with distances inversely proportional to their marginal frequencies. For example, the categories
for No premarital and extramarital sex are much larger than the corresponding Yes categories, so
the former are positioned closer to the origin. In contrast, the categories of gender and marital
status are more nearly equal marginally.
Another aspect of interpretation of Figure 6.10 concerns the alignment of the lines for dif-
ferent factors. The positions of the category points on Dimension 1 suggest that Women are
less likely to have had pre-marital and extra-marital sex and that still being married is associated
with the absence of pre- and extra-marital sex. As well, the lines for gender and marital status
are nearly at right angles, suggesting that these variables are unassociated. This interpretation is
more or less correct, but it is only approximate in this MCA scaling of the coordinate axes. An
alternative scaling, based on a biplot representation is described in Section 6.6.
If you compare the MCA result in Figure 6.10 with the mosaic matrix in Figure 5.22, you
will see that they are both showing the bivariate pairwise associations among these variables, but
in different ways. The mosaic plots show the details of marginal and joint frequencies together
with residuals from independence for each 2 × 2 marginal subtable. The MCA plot using the Burt
matrix summarizes each category point in terms of a 2D representation of contributions to total
inertia (association).
Inertia decomposition
The transition from simple CA to MCA is straightforward in terms of the category scores derived
from the indicator matrix Z or the Burt matrix, B. It is less so in terms of the calculation of
total inertia, and therefore in the chi-square values and corresponding percentages of association
accounted for in some number of dimensions.
In simple CA, the total inertia is $\chi^2/n$, and it therefore makes sense to talk of percentage of
association accounted for by each dimension. But in MCA of the indicator matrix the total inertia,
$\sum \lambda$, is simply $(J - Q)/Q$, where $J$ here denotes the total number of categories over all $Q$ variables,
because the inertia of each subtable, $Z_i$, is equal to its dimensionality,
$J_i - 1$, and the total inertia of an indicator matrix is the average of the inertias of its subtables.
For the hair color and eye color data, $J = 8$ and $Q = 2$, so the total inertia is 3.
Consequently, the average inertia per dimension is $1/Q$, and it is common to interpret only those
dimensions that exceed this average (analogous to the use of 1 as a threshold for eigenvalues in
principal components analysis).
To more adequately reflect the percentage of association in MCA, Greenacre (1990) (see
also Greenacre (2007, Chapter 19) for details), revising an earlier proposal by Benzécri (1977),
suggested the calculation of adjusted inertia, which ignores the contributions of the diagonal
blocks in the Burt matrix,
$$(\lambda^\star_i)^2 = \left[ \frac{Q}{Q-1} \left( \lambda^Z_i - \frac{1}{Q} \right) \right]^2 \qquad (6.8)$$
as the principal inertia due to the dimensions with $(\lambda^Z)^2 > 1/Q$. This idea is referred to as joint
correspondence analysis, and expresses the contribution of each dimension as $(\lambda^\star_i)^2 / \sum (\lambda^\star_i)^2$,
with the summation over only dimensions with $(\lambda^Z)^2 > 1/Q$.
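To make the adjustment concrete, the sketch below applies it to the two-variable hair color and eye color data (base-R HairEyeColor, collapsed over Sex). It rests on an assumption about notation: the quantity inside the brackets of Eqn. (6.8) is taken to be the i-th principal inertia of the indicator matrix, and only dimensions whose principal inertia exceeds 1/Q are retained. The helper inertia() is hypothetical and defined here only for illustration.

```r
# Hypothetical helper: principal inertias of a simple CA via the SVD
inertia <- function(N) {
  N  <- unclass(as.matrix(N))
  P  <- N / sum(N)
  r  <- rowSums(P)
  cm <- colSums(P)
  svd((P - outer(r, cm)) / sqrt(outer(r, cm)))$d^2
}

haireye <- margin.table(HairEyeColor, 1:2)
df      <- as.data.frame(haireye)
cases   <- df[rep(seq_len(nrow(df)), df$Freq), c("Hair", "Eye")]
Z       <- cbind(model.matrix(~ Hair - 1, cases),
                 model.matrix(~ Eye  - 1, cases))
Q <- 2                                      # number of variables

lam_Z <- inertia(Z)                         # principal inertias of the indicator matrix
keep  <- lam_Z > 1/Q + 1e-8                 # dimensions above the average inertia 1/Q
adj   <- (Q / (Q - 1) * (lam_Z[keep] - 1/Q))^2   # adjusted inertias, as in Eqn. (6.8)

round(100 * adj / sum(adj), 1)              # percentage contribution of each dimension

# For Q = 2 the adjusted inertias coincide with those of simple CA of the table
lam_N <- inertia(haireye)
all.equal(adj, lam_N[seq_along(adj)])
```

With only two variables the adjustment removes the effect of the diagonal blocks completely, so the percentages match those of the simple CA of the two-way table rather than those of the indicator matrix.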
mjca() allows different scaling methods for the contributions to inertia of the different
dimensions. The default, used here, is the adjusted inertias as in Eqn. (6.8).
summary(titanic.mca)
Using similar code to that used in Example 6.8, Figure 6.11 shows an enhanced version of
the default plot that connects the category points for each factor by lines using the result returned
by the plot() function.
In this plot, the points for each factor have the property that the sum of coordinates on each
dimension, weighted by the marginal proportions, equals zero. Thus high frequency
categories (e.g., Adult and Male) are close to the origin.
The first dimension is perfectly aligned with gender, and also strongly aligned with Survival.
The second dimension pertains mainly to Class and Age effects. Considering those points which
differ from the origin most similarly (in distance and direction) to the point for Survived, gives
the interpretation that survival was associated with being female or upper class or (to a lesser
degree) being a child.
Geometrically, Eqn. (6.9) may be described as approximating the data value $y_{ij}$ by the projection
of the end point of vector $a_i$ on $b_j$ (and vice-versa), as shown in Figure 6.8.
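A small numerical illustration may help. The sketch below assumes that Eqn. (6.9) is the usual biplot factorization, in which each data value is approximated by the scalar product of a row vector and a column vector. The matrix Y is a made-up rank-2 example (so that a 2D factorization is exact), and the equal split of the singular values between rows and columns is just one of several possible scalings.

```r
# Hypothetical 3 x 3 data matrix of rank 2
Y <- outer(1:3, c(1, 2, 4)) + outer(c(1, 0, -1), c(1, -1, 0))

# Factor Y = A %*% t(B) from the SVD, splitting singular values equally
dec <- svd(Y)
A <- dec$u[, 1:2] %*% diag(sqrt(dec$d[1:2]))   # row points a_i
B <- dec$v[, 1:2] %*% diag(sqrt(dec$d[1:2]))   # column points b_j

# Scalar product of a_1 and b_2, and the projection of a_1 on b_2
a <- A[1, ]
b <- B[2, ]
len_b    <- sqrt(sum(b^2))
proj_len <- sum(a * b) / len_b    # signed length of the projection, ||a|| cos(theta)

# All three values agree: data value, scalar product, projection x length of b
c(data = Y[1, 2], scalar = sum(a * b), projection = proj_len * len_b)
```

The projection length must be multiplied by the length of the vector projected onto in order to recover the data value; the two coincide only when that vector has unit length.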
Figure 6.8: The scalar product of vectors of two points from the origin is the length of the projection
of one vector on the other. (The figure shows vectors a and b separated by the angle θ, with the projection of a on b labeled ||a|| cos θ.)
As in CA, there are a number of different representations of coordinates for row and column
points for a contingency table within a biplot framework. One set of connections between CA
and the biplot can be seen through the reconstitution formula, giving the decomposition of
the correspondence matrix $P = N/n$ in terms of the standard coordinates $\Phi$ and $\Gamma$ defined in
Eqn. (6.4) and Eqn. (6.5) as:
$$p_{ij} = r_i c_j \left( 1 + \sum_{m=1}^{M} \sqrt{\lambda_m}\, \phi_{im} \gamma_{jm} \right) \qquad (6.10)$$
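The reconstitution formula can be checked numerically. This sketch uses the base-R HairEyeColor table (collapsed over Sex) and computes the standard coordinates from the SVD of the standardized residuals, taking the singular values as the square roots of the principal inertias; recon() is a helper written here for illustration.

```r
N   <- unclass(margin.table(HairEyeColor, 1:2))
P   <- N / sum(N)                          # correspondence matrix
r   <- rowSums(P)                          # row masses r_i
cm  <- colSums(P)                          # column masses c_j
dec <- svd((P - outer(r, cm)) / sqrt(outer(r, cm)))

M     <- min(dim(N)) - 1                   # number of non-trivial dimensions
sv    <- dec$d[1:M]                        # sqrt(lambda_m)
Phi   <- dec$u[, 1:M] / sqrt(r)            # row standard coordinates
Gamma <- dec$v[, 1:M] / sqrt(cm)           # column standard coordinates

# Reconstitution of P from the first m dimensions, as in Eqn. (6.10)
recon <- function(m) {
  outer(r, cm) * (1 + Phi[, 1:m, drop = FALSE] %*%
                      (sv[1:m] * t(Gamma[, 1:m, drop = FALSE])))
}

max(abs(recon(M) - P))   # all M dimensions: exact up to rounding error
max(abs(recon(2) - P))   # two dimensions: a reduced-rank approximation
```

Using all M dimensions reproduces P, while truncating the sum gives the low-rank approximation displayed in a 2D plot.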
Two other types of asymmetric “maps” are also defined with different scalings that turn out to
have better visual properties in terms of representing the relations between the row and column
categories, particularly when the strength of association (inertia) in the data is low.
• The option map="rowgab" (or map="colgab") gives a biplot form proposed by Gabriel
and Odoroff (1990) with the rows (columns) shown in principal coordinates and the columns
(rows) in standard coordinates multiplied by the mass $c_j$ ($r_i$) of the corresponding point.
• The contribution biplot for CA (Greenacre, 2013), with the option map="rowgreen"
(or map="colgreen") provides a reconstruction of the standardized residuals from in-
dependence, using the points in standard coordinates multiplied by the square root of the
corresponding masses. This has the nice visual property of showing more directly the
contributions of the vectors to the low-dimensional solution.
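The scaling used by the contribution biplot can be sketched in a few lines of base R. The example below is an illustration under stated assumptions rather than the code behind plot.ca(): it uses the HairEyeColor table (collapsed over Sex) instead of the suicide data, and computes the coordinates corresponding to the "colgreen" form, with rows in principal coordinates and columns in standard coordinates multiplied by the square root of their masses.

```r
N   <- unclass(margin.table(HairEyeColor, 1:2))
P   <- N / sum(N)
r   <- rowSums(P)
cm  <- colSums(P)
S   <- (P - outer(r, cm)) / sqrt(outer(r, cm))   # standardized residuals
dec <- svd(S)
K   <- min(dim(N)) - 1                           # non-trivial dimensions

# Rows: principal coordinates (standard coordinates scaled by singular values)
Fr <- (dec$u[, 1:K] / sqrt(r)) %*% diag(dec$d[1:K])

# Columns: standard coordinates times the square root of the column masses
Gc <- (dec$v[, 1:K] / sqrt(cm)) * sqrt(cm)

# Squared column coordinates sum to 1 on each dimension, so each squared
# coordinate is that category's contribution to the dimension
colSums(Gc^2)

# Scalar products of row and column points give the standardized residuals,
# each row divided by the square root of its mass
max(abs(Fr %*% t(Gc) - S / sqrt(r)))
```

In this scaling the length of a column vector along an axis therefore shows directly how much that category contributes to the axis.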
Using this result, suicide.ca, in the call to plot() below, we use map="colgreen",
so that vectors represent the methods of suicide, as shown in Figure 6.9.
The interpretation of the row points for the age–sex categories is similar to what we saw ear-
lier in Figure 6.6. But now, the vectors for the suicide categories reflect the contributions of those
methods to the representation of association. Thus, the methods drown, gun and gas have
large contributions, while knife, hang, and poison are relatively small. Moreover, the pro-
jections of the points for the age–sex combinations on the method vectors reflect the standardized
residuals from independence.
The most comprehensive modern treatment of biplot methodology is the book Understanding
Biplots (Gower et al., 2011). Together with the book, they provide an R package, UBbipl, that is
capable of producing an astounding variety of high-quality plots. Unfortunately, that package is
only available on their publisher’s web site8 and you need the book to be able to use it because all
the documentation is in the book. Nevertheless, we illustrate the use of the cabipl() function
to produce the version of the CA biplot shown in Figure 6.10.
library(UBbipl)
cabipl(as.matrix(suicide.tab),
axis.col = gray(.4), ax.name.size=1,
ca.variant = "PearsonResA",
markers = FALSE,
row.points.size = 1.5,
row.points.col = rep(c("red", "blue"), 4),
plot.col.points = FALSE,
marker.col = "black", marker.size=0.8,
offset = c(2, 2, 0.5, 0.5),
offset.m = rep(-0.2, 14),
output=NULL)
Figure 6.9: CA biplot of the suicide data using the contribution biplot scaling (axes: Dimension 1, 57.16%;
Dimension 2, 35.49%). Associations between the age-sex categories and the suicide methods can be read as the
projections of the points on the vectors. The lengths of the vectors for the suicide categories reflect their
contributions to this representation in a 2D plot.
This plot uses ca.variant = "PearsonResA" to specify that the biplot is to approximate
the standardized Pearson residuals by the inner product of each row point with the vector
for the column point for the suicide methods, as also in Figure 6.9. However, Figure 6.10 represents
the methods by calibrated axis lines, designed to be read as scales for the projections of the
row points (age–sex) on the methods. The UBbipl package has a huge number of options for
controlling the details of the biplot display. See Gower et al. (2011, Ch. 2) for all the details.
Figure 6.10: CA biplot of the suicide data, showing calibrated axes for the suicide methods.
8
http://www.wiley.com/legacy/wileychi/gower/material.html
A different use of biplots for contingency tables stems from the close analogy between additive
relations for a quantitative response when there is no interaction between factors, and the
multiplicative relations for a contingency table when there is no association.
For quantitative data Bradu and Gabriel (1978) show how the biplot can be used to diagnose
additive relations among rows and columns. For example, when a two-way table is well-described
by an additive model with no interaction between the row and column factors,
then the row points, $a_i$, and the column points, $b_j$, will fall on two straight lines at right angles to
each other in the biplot. For a contingency table, the multiplicative relations among frequencies
under independence become additive relations in terms of log frequency, and Gabriel et al. (1997)
illustrate how biplots of log frequency can be used to explore associations in two-way and three-way
tables.
That is, for a two-way table, independence, $A \perp B$, implies that ratios of frequencies should
be proportional for any two rows, $i, i'$, and any two columns, $j, j'$. Equivalently, this means that
the log odds ratio for all such sets of four cells should be zero:
$$A \perp B \iff \log \theta_{ii',jj'} = \log \left( \frac{n_{ij}\, n_{i'j'}}{n_{i'j}\, n_{ij'}} \right) = 0$$
Now, if the log frequencies have been centered by subtracting the grand mean, Gabriel et al.
(1997) show that $\log \theta_{ii',jj'}$ is approximated in the biplot (of $\log(n_{ij}) - \overline{\log(n_{ij})}$)
by the inner product $(a_i - a_{i'})^T (b_j - b_{j'})$.
Figure 6.11: Independence implies orthogonal vector differences in a biplot of log frequency.
The line joining $a_1$ to $a_2$ represents $(a_1 - a_2)$. This line is perpendicular to the line $(b_1 - b_2)$
under independence.
Therefore, this biplot criterion for independence in a two-way table is whether
$(a_i - a_{i'})^T (b_j - b_{j'}) \approx 0$ for all pairs of rows, $i, i'$, and all pairs of columns, $j, j'$. But $(a_i - a_{i'})$ is the vector
connecting $a_i$ to $a_{i'}$ and $(b_j - b_{j'})$ is the vector connecting $b_j$ to $b_{j'}$, as shown in Figure 6.11,
and the inner product of any two vectors equals zero iff they are orthogonal. Hence, this criterion
implies that all lines connecting pairs of row points are orthogonal to lines connecting pairs of
column points, as illustrated in Figure 6.11.
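The criterion can be tried out on artificial data before turning to a real table. In the sketch below both tables are made up for illustration: N_ind satisfies independence exactly, and N_dep is the same table with one cell inflated. The helper functions biplot_points(), crit() and log_or() are hypothetical names defined here; the points are taken from a full-rank SVD factorization of the centered log frequencies, so the relation is exact, whereas a 2D biplot would only approximate it when the table is not independent.

```r
# Hypothetical 3 x 4 table that satisfies independence exactly
r     <- c(0.2, 0.3, 0.5)
cm    <- c(0.1, 0.2, 0.3, 0.4)
N_ind <- 1000 * outer(r, cm)

# The same table with cell (1, 1) tripled, which introduces association
N_dep <- N_ind
N_dep[1, 1] <- 3 * N_dep[1, 1]

# Row points A and column points B from the SVD of centered log frequencies
biplot_points <- function(N) {
  Y   <- log(N) - mean(log(N))              # subtract the grand mean
  dec <- svd(Y)
  list(A = dec$u %*% diag(sqrt(dec$d)),
       B = dec$v %*% diag(sqrt(dec$d)))
}

# Inner product of the difference vectors for rows i, i2 and columns j, j2
crit <- function(pts, i, i2, j, j2) {
  sum((pts$A[i, ] - pts$A[i2, ]) * (pts$B[j, ] - pts$B[j2, ]))
}

# Log odds ratio for the same four cells
log_or <- function(N, i, i2, j, j2) {
  log(N[i, j] * N[i2, j2] / (N[i2, j] * N[i, j2]))
}

crit(biplot_points(N_ind), 1, 2, 1, 2)      # independence: essentially zero
crit(biplot_points(N_dep), 1, 2, 1, 2)      # association: equals the log odds ratio
log_or(N_dep, 1, 2, 1, 2)
```

For the independent table the centered log frequencies have rank 2, so its row and column points lie exactly on two perpendicular lines in a 2D display.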
data("UKSoccer", package="vcd")
dimnames(UKSoccer) <- list(Home=paste0("H", 0:4),
Away=paste0("A", 0:4))
Basic biplots in R are provided by biplot() that works mainly with the result calcu-
lated by prcomp() or princomp(). Here, we use prcomp() on the log frequencies in
the UKSoccer table, adding 1, because there is one cell with zero frequency.
The result is plotted using a customized plot based on biplot() as shown in Figure 6.12.
To supplement this plot and illustrate the orthogonality of row and column category points
under independence, we added horizontal and vertical lines as calculated below, using the results
returned by prcomp(). The initial version of this plot showed that two points, A2 and H2 did
not align with the others, so these were excluded from the calculations.
Figure 6.12: Biplot for the biadditive representation of independence for the UK Soccer scores (axes: Dimension 1,
Dimension 2). The row and column categories are independent in this plot when they appear as points on
approximately orthogonal lines.
You can see that all the A points (except for A2) and all the H points (except for H2) lie
along straight lines, and these lines are indeed at right angles, signifying independence. The
fact that these straight lines are parallel to the coordinate axes is incidental, and unrelated to the
independence interpretation.
extensions, provide ways to interpret the patterns of association and explore visually the
adequacy of certain loglinear models.
• The scores assigned to the categories of each variable are optimal in several equivalent
ways. Among other properties, they maximize the (canonical) correlations between the
quantified variables (weighted by cell frequencies), and make the regressions of each vari-
able on the other most nearly linear, for each CA dimension.
• Multi-way tables may be analyzed in several ways. In the “stacking” approach, two or
more variables may be combined interactively in the rows and/or columns of an n-way
table. Simple CA of the restructured table reveals associations between the row and column
categories of the restructured table, but hides associations between the variables combined
interactively. Each way of stacking corresponds to a particular loglinear model for the full
table.
• The biplot is a related technique for visualizing the elements of a data array by points
or vectors in a joint display of their row and column categories. A standard CA biplot
represents the contributions to lack of independence as the projection of the points for rows
(or columns) on vectors for the other categories.
• Another application of the biplot to contingency table data is described, based on analysis
of log frequency. This analysis also serves to diagnose patterns of independence and partial
independence in two-way and larger tables.
(a) Carry out a simple correspondence analysis on this table. How much of the inertia is ac-
counted for by a one-dimensional solution? How much by a two-dimensional solution?
(b) Plot the 2D CA solution. To what extent can you consider the association between job
satisfaction and income “explained” by the ordinal nature of these variables?
Exercise 6.2 Refer to Exercise 1 in Chapter 5. Carry out a simple correspondence analysis on
the 4 × 5 table criminal from the logmult package.
(a) What percentages of the Pearson χ2 for association are explained by the various dimen-
sions?
(b) Plot the 2D correspondence analysis solution. Describe the pattern of association between
year and age.
Exercise 6.3 The data set caith in MASS gives a classic table tabulating hair color and eye
color of people in Caithness, Scotland, originally from Fisher (1940).
(a) Carry out a simple correspondence analysis on this table. How many dimensions seem
necessary to account for most of the association in the table?
(b) Plot the 2D solution. The interpretation of the first dimension should be obvious; is there
any interpretation for the second dimension?
Exercise 6.4 The same data, plus a similar table for Aberdeen, are given as a three-way table as
HairEyePlace in vcdExtra.
(a) Carry out a similar correspondence analysis to the last exercise for the data from
Aberdeen. Comment on any differences in the placement of the category points.
(b) Analyze the three-way table, stacked to code hair color and place interactively, i.e., for the
loglinear model [Hair Place][Eye]. What does this show?
Exercise 6.5 For the mental health data analyzed in Example 6.2, construct a shaded sieve dia-
gram and mosaic plot. Compare these with the correspondence analysis plot shown in Figure 6.2.
What features of the data and the association between SES and mental health status are shown in
each?
Exercise 6.6 Simulated data is often useful to help understand the connections between data,
analysis methods and associated graphic displays. Section 6.4.1 illustrated interactive coding in
R, using a simulated 4-way table of counts of pets, classified by age, color and sex, but with no
associations because the counts had a constant Poisson mean, λ = 15.
(a) Re-do this example, but in the call to rpois(), specify a non-negative vector of Poisson
means to create some associations among the table factors.
(b) Use CA methods to determine if and how the structure you created in the data appears in
the results.
Exercise 6.7 The TV data was analyzed using CA in Example 6.4, ignoring the variable Time.
Carry out analyses of the 3-way table, reducing the number of levels of Time to three hourly
intervals as shown below.
data("TV", package="vcdExtra")
# reduce number of levels of Time
TV.df <- as.data.frame.table(TV)
levels(TV.df$Time) <- rep(c("8", "9", "10"), c(4, 4, 3))
TV3 <- xtabs(Freq ~ Day + Time + Network, TV.df)
structable(Day ~ Network + Time, TV3)
(a) Use the stacking approach (Section 6.4) to perform a CA of the table with Network and
Time coded interactively. You can create this using the as.matrix() method for a
"structable" object.
TV3S <- as.matrix(structable(Day ~ Network + Time, TV3), sep=":")
References
Benzécri, J.-P. (1977). Sur l’analyse des tableaux binaires associés à une correspondance multiple.
Cahiers de l’Analyse des Données, 2, 55–71.
Bradu, D. and Gabriel, K. R. (1978). The biplot as a diagnostic tool for models of two-way tables.
Technometrics, 20, 47–68.
Burt, C. (1950). The factorial analysis of qualitative data. British Journal of Statistical Psychol-
ogy, 3, 166–185.
Emerson, J. W. (1998). Mosaic displays in S-PLUS: A general implementation and a case study.
Statistical Graphics and Computing Newsletter, 9(1), 17–23.
Fienberg, S. E. (1980). The Analysis of Cross-Classified Categorical Data. Cambridge, MA:
MIT Press, 2nd edn.
Fisher, R. A. (1940). The precision of discriminant functions. Annals of Eugenics, 10, 422–429.
Friendly, M. (1991). SAS System for Statistical Graphics. Cary, NC: SAS Institute, 1st edn.
Friendly, M. (1994). Mosaic displays for multi-way contingency tables. Journal of the American
Statistical Association, 89, 190–200.
Friendly, M. (1999). Extending mosaic displays: Marginal, conditional, and partial views of
categorical data. Journal of Computational and Graphical Statistics, 8(3), 373–395.
Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal
components analysis. Biometrika, 58(3), 453–467.
Gabriel, K. R. (1980). Biplot. In N. L. Johnson and S. Kotz, eds., Encyclopedia of Statistical
Sciences, vol. 1, (pp. 263–271). New York: John Wiley and Sons.
Gabriel, K. R. (1981). Biplot display of multivariate matrices for inspection of data and diagnosis.
In V. Barnett, ed., Interpreting Multivariate Data, chap. 8, (pp. 147–173). London: John Wiley
and Sons.
Gabriel, K. R., Galindo, M. P., and Vicente-Villardón, J. L. (1997). Use of biplots to diagnose
independence models in three-way contingency tables. In M. Greenacre and J. Blasius, eds.,
Visualization of Categorical Data, chap. 27, (pp. 391–404). San Diego, CA: Academic Press.
Gabriel, K. R. and Odoroff, C. L. (1990). Biplots in biomedical research. Statistics in Medicine,
9, 469–485.
Gifi, A. (1981). Nonlinear Multivariate Analysis. The Netherlands: Department of Data Theory,
University of Leiden.
Goodman, L. A. (1981). Association models and canonical correlation in the analysis of cross-
classifications having ordered categories. Journal of the American Statistical Association,
76(374), 320–334.
Goodman, L. A. (1985). The analysis of cross-classified data having ordered and/or unordered
categories: Association models, correlation models, and asymmetry models for contingency
tables with or without missing entries. Annals of Statistics, 13(1), 10–69.
Goodman, L. A. (1986). Some useful extensions of the usual correspondence analysis approach
and the usual log-linear models approach in the analysis of contingency tables. International
Statistical Review, 54(3), 243–309. With a discussion and reply by the author.
Gower, J., Lubbe, S., and le Roux, N. (2011). Understanding Biplots. Wiley.
Gower, J. C. and Hand, D. J. (1996). Biplots. London: Chapman & Hall.
Greenacre, M. (1984). Theory and Applications of Correspondence Analysis. London: Academic
Press.
Greenacre, M. (1989). The Carroll-Green-Schaffer scaling in correspondence analysis: A theo-
retical and empirical appraisal. Journal of Marketing Research, 26, 358–365.
Greenacre, M. (1990). Some limitations of multiple correspondence analysis. Computational
Statistics Quarterly, 3, 249–256.
Greenacre, M. (1997). Diagnostics for joint displays in correspondence analysis. In J. Blasius
and M. Greenacre, eds., Visualization of Categorical Data, (pp. 221–238). Academic Press.
Greenacre, M. (2013). Contribution biplots. Journal of Computational and Graphical Statistics,
22(1), 107–122.
Greenacre, M. and Hastie, T. (1987). The geometric interpretation of correspondence analysis.
Journal of the American Statistical Association, 82, 437–447.
Greenacre, M. J. (2007). Correspondence analysis in practice. Boca Raton: Chapman &
Hall/CRC.
Hartigan, J. A. and Kleiner, B. (1984). A mosaic of television ratings. The American Statistician,
38, 32–35.
Heuer, J. (1979). Selbstmord bei Kindern und Jugendlichen. Stuttgart: Ernst Klett Verlag.
[Suicide by children and youth.]
Lebart, L., Morineau, A., and Warwick, K. M. (1984). Multivariate Descriptive Statistical Anal-
ysis: Correspondence Analysis and Related Techniques for Large Matrices. New York: John
Wiley and Sons.
Snee, R. D. (1974). Graphical display of two-way contingency tables. The American Statistician,
28, 9–12.
van der Heijden, P. G. M., de Falguerolles, A., and de Leeuw, J. (1989). A combined approach
to contingency table analysis using correspondence analysis and log-linear analysis. Applied
Statistics, 38(2), 249–292.
van der Heijden, P. G. M. and de Leeuw, J. (1985). Correspondence analysis used complementary
to loglinear analysis. Psychometrika, 50, 429–447.