Pattern Vectors From Algebraic Graph Theory
Article:
Wilson, R.C. (orcid.org/0000-0001-7265-3033), Hancock, E.R. (orcid.org/0000-0003-4496-2028), and Luo, B. (2005) Pattern vectors from algebraic graph theory. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(7), pp. 1112-1124. ISSN 0162-8828
https://doi.org/10.1109/TPAMI.2005.145
Abstract—Graph structures have proven computationally cumbersome for pattern analysis. The reason for this is that, before graphs can
be converted to pattern vectors, correspondences must be established between the nodes of structures which are potentially of different
size. To overcome this problem, in this paper, we turn to the spectral decomposition of the Laplacian matrix. We show how the elements of
the spectral matrix for the Laplacian can be used to construct symmetric polynomials that are permutation invariants. The coefficients of
these polynomials can be used as graph features which can be encoded in a vectorial manner. We extend this representation to graphs in
which there are unary attributes on the nodes and binary attributes on the edges by using the spectral decomposition of a Hermitian
property matrix that can be viewed as a complex analogue of the Laplacian. To embed the graphs in a pattern space, we explore whether
the vectors of invariants can be embedded in a low-dimensional space using a number of alternative strategies, including principal
components analysis (PCA), multidimensional scaling (MDS), and locality preserving projection (LPP). Experimentally, we demonstrate
that the embeddings result in well-defined graph clusters. Our experiments with the spectral representation involve both synthetic and
real-world data. The experiments with synthetic data demonstrate that the distances between spectral feature vectors can be used to
discriminate between graphs on the basis of their structure. The real-world experiments show that the method can be used to locate
clusters of graphs.
1 INTRODUCTION
long-vector and embedded in a low-dimensional space using principal components analysis [20]. An alternative to using a probability distribution over a class archetype is to use pairwise clustering methods. Here, the starting point is a matrix of pairwise affinities between graphs. There are a variety of algorithms available for performing pairwise clustering, but one of the most popular approaches is to use spectral methods, which use the eigenvectors of an affinity matrix to define clusters [33], [13]. There are a number of examples of applying pairwise clustering methods to graph edit distances and similarity measures [19], [23].

However, one of the criticisms that can be aimed at these methods for learning the distribution of graphs is that they are, in a sense, brute force because of their need for correspondences either to establish an archetype or to compute graph similarity. For noisy graphs (those which are subject to structural differences), this problem is thought to be NP-hard. Although relatively robust approximate methods exist for computing correspondence [5], these can prove time consuming. In this paper, we are interested in the problem of constructing pattern vectors for graphs. These vectors represent the graph without the need to compute correspondences. These vectors are necessarily large, since they must represent the entire set of graphs of a particular size. For this reason, we are also interested in pattern spaces for families of graphs. In particular, we are interested in the problem of whether it is possible to construct a low-dimensional pattern space from our pattern vectors.

The adopted approach is based on spectral graph theory [6], [21], [3]. Although existing graph-spectral methods have proven effective for graph-matching and indexing [38], they have not made full use of the available spectral representation and are restricted to the use of either the spectrum of eigenvalues or a single eigenvector.

1.1 Related Literature
Spectral graph theory is concerned with understanding how the structural properties of graphs can be characterized using the eigenvectors of the adjacency matrix or the closely related Laplacian matrix (the degree matrix minus the adjacency matrix). There are good introductory texts on the subject by Biggs [3] and Cvetkovic et al. [8]. Comprehensive reviews of recent progress in the field can be found in the research monograph of Chung [6] and the survey papers of Lovasz [17] and Mohar [21]. The subject has acquired considerable topicality in the computer science literature since spectral methods can be used to develop efficient methods for path planning and data clustering. In fact, the Googlebot search engine uses ideas derived from spectral graph theory.

Techniques from spectral graph theory have been applied in a number of areas in computer vision, too, and have led to the development of algorithms for grouping and segmentation [33], correspondence matching [38], [18], and shape indexing [34]. Among the first to apply spectral ideas to the grouping problem were Scott and Longuet-Higgins [30], who showed how to extract groups of points by relocating the eigenvectors of a point-proximity matrix. However, probably the best known work in this area is that of Shi and Malik [33], who showed how the Fiedler (second smallest) eigenvector of the Laplacian matrix could be used to locate groupings that optimize a normalized cut measure. In related work, both Perona and Freeman [24], and Sarkar and Boyer [29] have shown how the thresholded leading eigenvector of the weighted adjacency matrix can be used for grouping points and line-segments. Robles-Kelly and Hancock [26] have developed a statistical variant of the method which avoids thresholding the leading eigenvector and instead employs the EM algorithm to iteratively locate clusters of objects.

The use of graph-spectral methods for correspondence matching has proven to be an altogether more elusive task. The idea underpinning spectral methods for correspondence analysis is to locate matches between nodes by comparing the eigenvectors of the adjacency matrix. However, although this method works well for graphs with the same number of nodes and small differences in edge structure, it does not work well when graphs of different size are being matched. The reason for this is that the eigenvectors of the adjacency matrix are unstable under changes in the size of the adjacency matrix. Turning to the literature, the algorithm of Umeyama [38] illustrates this point very clearly. The method commences by performing singular value decomposition on the adjacency matrices of the graphs under study. The permutation matrix which brings the nodes of the graphs into correspondence is found by taking the sum of the outer products of the sets of corresponding singular vectors. The method cannot be operated when the adjacency matrices are of different size, i.e., the singular vectors are of different length. To overcome this problem, Luo and Hancock [18] have drawn on the apparatus of the EM algorithm and treat the correspondences as missing or hidden data. By introducing a correspondence probability matrix, they overcome problems associated with the different sizes of the adjacency matrices. An alternative solution to the problem of size difference is adopted by Kosinov and Caelli [16], who project the graph onto the eigenspace spanned by its leading eigenvectors. Provided that the eigenvectors are normalized, then the relative angles of the nodes are robust to size difference under the projection. In related work, Kesselman et al. have used a minimum distortion procedure to embed graphs in a low-dimensional space where correspondences can be located between groups of nodes [14]. Recent work by Robles-Kelly and Hancock [27] has shown how an eigenvector method can be used to sort the nodes of a graph into string order and how string matching methods can be used to overcome the size difference problem. Finally, Shokoufandeh et al. [34] have shown how to use the interleaving property of the eigenvalues to index shock trees.

1.2 Contribution
As noted earlier, one of the problems which hinders the pattern analysis of graphs is that they are neither vectorial in nature nor easily transformed into vectors. Although the spectral methods described above provide a means by which the structure of graphs can be characterized and the required correspondences can be located, they make only limited use of the available spectral information. Moreover, although graph-matching may provide a fine measure of distance between structures and this, in turn, may be used to cluster similar graphs, it does not result in an ordering of the graphs that has metrical significance under structural variations due to graded structural changes. Hence, we aim to use spectral methods to vectorize graphs and, hence, embed them in low-dimensional pattern spaces using manifold learning methods. We describe a method for constructing spectral features which are permutation invariants and which make use of the full spectral matrix. To construct these invariants, we use symmetric polynomials. The arguments of the polynomials are the elements of the spectral matrix. We use the values of the symmetric polynomials to construct graph pattern-vectors.
Size differences between graphs can be accommodated by padding the spectral matrix with trailing vectors of zeros. The representation can be extended to attributed graphs using graph property matrices with complex entries instead of the real valued Laplacian. We show how weighted and attributed edges can be encoded by complex numbers with the weight as the magnitude or modulus and the attribute as the phase. If the property matrices are required to be Hermitian, then the eigenvalues are real and the eigenvectors are complex. The real and imaginary components of the eigenvectors can again be used to compute symmetric polynomials.

We explore how the spectral feature vectors may be used to construct pattern spaces for sets of graphs [14]. We investigate a number of different approaches. The first of these is to apply principal components analysis to the covariance matrix for the vectors. This locates a variance preserving embedding for the pattern vectors. The second approach is multidimensional scaling and this preserves the distance between vectors. The third approach is locally linear projection, which aims to preserve both the variance of the data and the pattern of distances [10]. We demonstrate that this latter method gives the best graph-clusters.

The outline of this paper is as follows: Section 2 details our construction of the spectral matrix. In Section 3, we show how symmetric polynomials can be used to construct permutation invariants from the elements of the spectral matrix. Section 4 provides details of how the symmetric polynomials can be used to construct useful pattern vectors and how problems of variable node-set size can be accommodated. To accommodate attributes, in Section 5, we show how to extend the representation using complex numbers to construct a Hermitian property matrix. Section 6 briefly reviews how PCA, MDS, and locality preserving projection can be performed on the resulting pattern vectors. In Section 7, we present results on both synthetic and real-world data. Finally, Section 8 offers some conclusions and suggests directions for further investigation.

2 SPECTRAL GRAPH REPRESENTATION
Consider the undirected graph $G = (V, E, W)$ with node-set $V = \{v_1, v_2, \ldots, v_n\}$, edge-set $E = \{e_1, e_2, \ldots, e_m\} \subseteq V \times V$, and weight function $W : E \rightarrow (0, 1]$. The adjacency matrix $A$ for the graph $G$ is the $n \times n$ matrix with elements

$A_{ab} = \begin{cases} 1 & \text{if } (a, b) \in E \\ 0 & \text{otherwise.} \end{cases}$

In other words, the matrix represents the edge structure of the graph. Clearly, since the graph is undirected, the matrix $A$ is symmetric. If the graph edges are weighted, the adjacency matrix is defined to be

$A_{ab} = \begin{cases} W(a, b) & \text{if } (a, b) \in E \\ 0 & \text{otherwise.} \end{cases}$

The Laplacian of the graph is given by $L = D - A$, where $D$ is the diagonal node degree matrix whose elements $D_{aa} = \sum_{b=1}^{n} A_{ab}$ are the number of edges which exit the individual nodes. The Laplacian is more suitable for spectral analysis than the adjacency matrix since it is positive semidefinite.

In general, the task of comparing two such graphs, $G_1$ and $G_2$, involves finding a correspondence mapping $f$ between the nodes of the two graphs, where each node-set is augmented by an additional node. The additional node represents a null match or dummy node. Extraneous or additional nodes are matched to the dummy node. The recovery of the correspondence map $f$ can be posed as that of minimizing an error criterion. The minimum value of the criterion can be taken as a measure of the similarity of the two graphs. A number of search and optimization methods have been developed to solve this problem [5], [9]. We may also consider the correspondence mapping problem as one of finding the permutation of nodes in the second graph which places them in the same order as that of the first graph. This permutation can be used to map the Laplacian matrix of the second graph onto that of the first. If the graphs are isomorphic, then this permutation matrix satisfies the condition $L_1 = P L_2 P^T$. When the graphs are not isomorphic, then this condition no longer holds. However, the Frobenius distance $\|L_1 - P L_2 P^T\|$ between the matrices can be used to gauge the degree of similarity between the two graphs. Spectral techniques have been used to solve this problem. For instance, working with adjacency matrices, Umeyama [38] seeks the permutation matrix $P_U$ that minimizes the Frobenius norm $J(P_U) = \|P_U A_1 P_U^T - A_2\|$. The method performs the singular value decompositions $A_1 = U_1 \Delta_1 U_1^T$ and $A_2 = U_2 \Delta_2 U_2^T$, where the $U$s are orthogonal matrices and the $\Delta$s are diagonal matrices. Once these factorizations have been performed, the required permutation matrix is approximated by $U_2 U_1^T$.

In some applications, especially structural chemistry, eigenvalues have also been used to compare the structural similarity of different graphs. However, although the eigenvalue spectrum is a permutation invariant, it represents only a fraction of the information residing in the eigensystem of the adjacency matrix.

Since the matrix $L$ is positive semidefinite, it has eigenvalues which are all either positive or zero. The spectral matrix is found by performing the eigenvector expansion for the Laplacian matrix $L$, i.e.,

$L = \sum_{i=1}^{n} \lambda_i e_i e_i^T,$

where $\lambda_i$ and $e_i$ are the $n$ eigenvalues and eigenvectors of the symmetric matrix $L$. The spectral matrix then has the scaled eigenvectors as columns and is given by

$\Phi = \left( \sqrt{\lambda_1} e_1, \sqrt{\lambda_2} e_2, \ldots, \sqrt{\lambda_n} e_n \right). \quad (1)$

The matrix $\Phi$ is a complete representation of the graph in the sense that we can use it to reconstruct the original Laplacian matrix using the relation $L = \Phi \Phi^T$.

The matrix $\Phi$ is a unique representation of $L$ iff all $n$ eigenvalues are distinct or zero. This follows directly from the fact that there are $n$ distinct eigenvectors when the eigenvalues are also all distinct. When an eigenvalue is repeated, then there exists a subspace spanned by the eigenvectors of the degenerate eigenvalues in which all vectors are also eigenvectors of $L$. In this situation, if the repeated eigenvalue is nonzero, there is a continuum of spectral matrices representing the same graph. However, this is rare for moderately large graphs. Those graphs for which the eigenvalues are distinct or zero are referred to as simple.
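To make the construction above concrete, the following NumPy sketch (our own illustration, not code from the paper) builds the Laplacian of a small weighted graph, assembles the spectral matrix of (1) from the scaled eigenvectors, and checks the reconstruction $L = \Phi \Phi^T$. The example weights and function names are ours.

import numpy as np

def spectral_matrix(A):
    """Return (L, Phi): the Laplacian and the spectral matrix of equation (1)."""
    D = np.diag(A.sum(axis=1))           # degree matrix
    L = D - A                             # Laplacian, positive semidefinite
    lam, E = np.linalg.eigh(L)            # eigenvalues ascending, eigenvectors as columns
    lam = np.clip(lam, 0.0, None)         # guard against tiny negative round-off
    Phi = E * np.sqrt(lam)                # column i is sqrt(lambda_i) * e_i
    return L, Phi

# a small undirected weighted graph (hypothetical example)
A = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 1.0, 0.0],
              [0.5, 1.0, 0.0, 0.8],
              [0.0, 0.0, 0.8, 0.0]])

L, Phi = spectral_matrix(A)
assert np.allclose(Phi @ Phi.T, L)        # L = Phi Phi^T, as stated in Section 2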
As a result, we can write the spectral matrix $\Phi'$ of the permuted Laplacian matrix $L' = \Phi' \Phi'^T$ as $\Phi' = P \Phi$. Direct comparison of the spectral matrices for different graphs is, hence, not possible because of the unknown permutation.

The eigenvalues of the adjacency matrix have been used as a compact spectral representation for comparing graphs because they are not changed by the application of a permutation matrix. The eigenvalues can be recovered from the spectral matrix using the identity

$\lambda_j = \sum_{i=1}^{n} \Phi_{ij}^2.$

The expression $\sum_{i=1}^{n} \Phi_{ij}^2$ is, in fact, a symmetric polynomial in the components of the corresponding eigenvector. A symmetric polynomial is invariant under permutation of the variable indices. In this case, the polynomial is invariant under permutation of the row index $i$.

In fact, the eigenvalue is just one example of a family of symmetric polynomials which can be defined on the $n^2$ components of the spectral matrix. However, there is a special set of these polynomials, referred to as the elementary symmetric polynomials ($S$), that form a basis set for symmetric polynomials. In other words, any symmetric polynomial can itself be expressed as a polynomial function of the elementary symmetric polynomials belonging to the set $S$.

We therefore turn our attention to the set of elementary symmetric polynomials. For a set of variables $\{v_1, v_2, \ldots, v_n\}$, they can be defined as

$S_1(v_1, \ldots, v_n) = \sum_{i=1}^{n} v_i,$
$S_2(v_1, \ldots, v_n) = \sum_{i=1}^{n} \sum_{j=i+1}^{n} v_i v_j,$
$\vdots$
$S_r(v_1, \ldots, v_n) = \sum_{i_1 < i_2 < \ldots < i_r} v_{i_1} v_{i_2} \ldots v_{i_r},$
$\vdots$
$S_n(v_1, \ldots, v_n) = \prod_{i=1}^{n} v_i.$

The elementary symmetric polynomials are related to the power symmetric polynomials $P_r(v_1, \ldots, v_n) = \sum_{i=1}^{n} v_i^r$ through the Newton identity

$S_r = \frac{1}{r} \sum_{k=1}^{r} (-1)^{k+1} P_k S_{r-k}, \quad (2)$

where we have used the shorthand $S_r$ for $S_r(v_1, \ldots, v_n)$ and $P_r$ for $P_r(v_1, \ldots, v_n)$. As a consequence, the elementary symmetric polynomials can be efficiently computed using the power symmetric polynomials.

In this paper, we intend to use the polynomials to construct invariants from the elements of the spectral matrix. The polynomials can provide spectral "features" which are invariant under permutations of the nodes in a graph and utilize the full spectral matrix. These features are constructed as follows: Each column of the spectral matrix forms the input to the set of spectral polynomials. For example, the column $\{\Phi_{1,i}, \Phi_{2,i}, \ldots, \Phi_{n,i}\}$ produces the polynomials $S_1(\Phi_{1,i}, \ldots, \Phi_{n,i}), S_2(\Phi_{1,i}, \ldots, \Phi_{n,i}), \ldots, S_n(\Phi_{1,i}, \ldots, \Phi_{n,i})$. The values of each of these polynomials are invariant to the node order of the Laplacian. We can construct a set of spectral features using the $n$ columns of the spectral matrix in combination with the $n$ symmetric polynomials.

Each set of $n$ features for each spectral mode contains all the information about that mode up to a permutation of the components. This means that it is possible to reconstruct the original components of the mode given the values of the features only. This is achieved using the relationship between the roots of a polynomial in $x$ and the elementary symmetric polynomials. The polynomial

$\prod_{i} (x - v_i) = 0 \quad (3)$

has roots $v_1, v_2, \ldots, v_n$. Multiplying out (3) gives

$x^n - S_1 x^{n-1} + S_2 x^{n-2} + \ldots + (-1)^n S_n = 0, \quad (4)$

where we have again used the shorthand $S_r$ for $S_r(v_1, \ldots, v_n)$. By substituting the feature values into (4) and finding the roots, we can recover the values of the original components. The root order is undetermined, so, as expected, the values are recovered up to a permutation.
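The relations (2)-(4) can be checked numerically with the short sketch below (our own illustration; the toy column values and helper names are ours). It computes the elementary symmetric polynomials of one column via the Newton identity (2), cross-checks them against the coefficients of the polynomial (4) as returned by numpy.poly, and recovers the components up to a permutation by root finding.

import numpy as np

def power_sums(v, n):
    """P_k = sum_i v_i^k for k = 1..n."""
    return [np.sum(v ** k) for k in range(1, n + 1)]

def elementary_symmetric(v):
    """S_1..S_n from the power sums via the Newton identity (2)."""
    n = len(v)
    P = power_sums(v, n)
    S = [1.0]                                   # S_0 = 1
    for r in range(1, n + 1):
        S_r = sum((-1) ** (k + 1) * P[k - 1] * S[r - k] for k in range(1, r + 1)) / r
        S.append(S_r)
    return S[1:]

v = np.array([0.7, -1.2, 0.4, 2.0])             # one column of the spectral matrix (toy values)
S = elementary_symmetric(v)

# Cross-check against equation (4): numpy.poly returns the coefficients
# 1, -S_1, +S_2, ..., (-1)^n S_n of prod_i (x - v_i).
coeffs = np.poly(v)
assert np.allclose([(-1) ** r * S[r - 1] for r in range(1, len(v) + 1)], coeffs[1:])

# Recover the components (up to permutation) by finding the roots of (4).
recovered = np.sort(np.roots(coeffs))
assert np.allclose(recovered, np.sort(v))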
4 PATTERN VECTORS FROM SYMMETRIC POLYNOMIALS
Gold and Rangarajan [9] and probabilistic relaxation [5] require $O(n^4)$ operations per iteration.
also a solution, i.e., $He = \lambda e$. In the real case, we choose the scale factor such that $e$ is of unit length. In the complex case, the factor may itself be complex and needs to be determined by two constraints. We set the vector length to $|e_i| = 1$ and, in addition, we impose the condition $\arg \sum_{i=1}^{n} e_i = 0$, which specifies both real and imaginary parts.

This representation can be extended further by using the four-component complex numbers known as quaternions. As with real and complex numbers, there is an appropriate eigendecomposition which allows the spectral matrix to be found. In this case, an edge weight and three additional binary measurements may be encoded on an edge. It is not possible to encode more than one unary measurement using this approach. However, for the experiments in this paper, we have concentrated on the complex representation.

When the eigenvectors are constructed in this way, the spectral matrix is found by performing the eigenvector expansion

$H = \sum_{i=1}^{n} \lambda_i e_i e_i^\dagger,$

where $\lambda_i$ and $e_i$ are the $n$ eigenvalues and eigenvectors of the Hermitian matrix $H$. We construct the complex spectral matrix for the graph $G$ using the eigenvectors as columns, i.e.,

$\Phi = \left( \sqrt{\lambda_1} e_1, \sqrt{\lambda_2} e_2, \ldots, \sqrt{\lambda_n} e_n \right). \quad (7)$

We can again reconstruct the original Hermitian property matrix using the relation $H = \Phi \Phi^\dagger$.

Since the components of the eigenvectors are complex numbers, each symmetric polynomial is complex. Hence, the symmetric polynomials must be evaluated with complex arithmetic and also evaluate to complex numbers. Each $S_r$ therefore has both real and imaginary components. The real and imaginary components of the symmetric polynomials are interleaved and stacked to form a feature-vector $B$ for the graph. This feature-vector is real.
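The sketch below illustrates this pipeline for a toy attributed graph. It is our own illustration under stated assumptions: the paper specifies only that edge weights are encoded as the modulus and edge attributes as the phase of complex entries of a Hermitian property matrix, so the Laplacian-like construction used here (degree on the diagonal, $-W_{ab} e^{i y_{ab}}$ off the diagonal) is our assumption, chosen so that the matrix is Hermitian and positive semidefinite; all names and example values are ours.

import numpy as np

def hermitian_property_matrix(W, Y):
    """Assumed 'complex Laplacian': weight as modulus, attribute as phase, degree on the diagonal."""
    C = W * np.exp(1j * Y)                       # complex edge encoding
    C = np.triu(C, 1)                             # take the upper triangle, then
    C = C + C.conj().T                            # enforce Hermitian symmetry
    return np.diag(W.sum(axis=1)) - C

def complex_spectral_matrix(H):
    lam, E = np.linalg.eigh(H)                    # real eigenvalues, complex eigenvectors
    phase = np.angle(E.sum(axis=0))               # fix each eigenvector's arbitrary phase:
    E = E * np.exp(-1j * phase)                   # arg(sum of components) = 0, as in the text
    return E * np.sqrt(np.clip(lam, 0.0, None))

def feature_vector(Phi):
    """Interleave real and imaginary parts of the symmetric polynomials of each column."""
    feats = []
    for i in range(Phi.shape[1]):
        coeffs = np.poly(Phi[:, i])               # 1, -S_1, S_2, ..., (-1)^n S_n
        for r in range(1, len(coeffs)):
            s = (-1) ** r * coeffs[r]
            feats.extend([s.real, s.imag])
    return np.array(feats)

# toy weighted, attributed graph: symmetric weights W in (0,1] and edge angles Y
W = np.array([[0.0, 0.9, 0.5], [0.9, 0.0, 0.7], [0.5, 0.7, 0.0]])
Y = np.array([[0.0, 0.3, 1.1], [0.3, 0.0, -0.7], [1.1, -0.7, 0.0]])
H = hermitian_property_matrix(W, Y)
Phi = complex_spectral_matrix(H)
assert np.allclose(Phi @ Phi.conj().T, H)         # H = Phi Phi^dagger
B = feature_vector(Phi)                           # real pattern vector for the graph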
6 GRAPH EMBEDDING METHODS
We explore three different methods for embedding the graph feature vectors in a pattern space. Two of these are classical methods. Principal components analysis (PCA) finds the projection that accounts for the variance or scatter of the data. Multidimensional scaling (MDS), on the other hand, preserves the relative distances between objects. The remaining method is a newly reported one that offers a compromise between preserving variance and the relational arrangement of the data and is referred to as locality preserving projection [10].

In this paper, we are concerned with the set of graphs $G_1, G_2, \ldots, G_k, \ldots, G_N$. The $k$th graph is denoted by $G_k = (V_k, E_k)$ and the associated vector of symmetric polynomials is denoted by $B_k$.

6.1 Principal Component Analysis
We commence by constructing the matrix $X = [B_1 - \bar{B} \mid B_2 - \bar{B} \mid \ldots \mid B_k - \bar{B} \mid \ldots \mid B_N - \bar{B}]$, with the graph feature vectors as columns. Here, $\bar{B}$ is the mean feature vector for the data set. Next, we compute the covariance matrix for the elements of the feature vectors by taking the matrix product $C = X X^T$. We extract the principal component directions by performing the eigendecomposition $C = \sum_{i=1}^{N} l_i u_i u_i^T$ on the covariance matrix $C$, where the $l_i$ are the eigenvalues and the $u_i$ are the eigenvectors. We use the first $s$ leading eigenvectors (2 or 3, in practice, for visualization purposes) to represent the graphs extracted from the images. The coordinate system of the eigenspace is spanned by the $s$ orthogonal vectors $U = (u_1, u_2, \ldots, u_s)$. The individual graphs represented by the long vectors $B_k$, $k = 1, 2, \ldots, N$, can be projected onto this eigenspace using the formula $y_k = U^T (B_k - \bar{B})$. Hence, each graph $G_k$ is represented by an $s$-component vector $y_k$ in the eigenspace.

Linear discriminant analysis (LDA) is an extension of PCA to the multiclass problem. We commence by constructing separate data matrices $X_1, X_2, \ldots, X_{N_c}$ for each of the $N_c$ classes. These may be used to compute the corresponding class covariance matrices $C_i = X_i X_i^T$. The average class covariance matrix $\tilde{C} = \frac{1}{N_c} \sum_{i=1}^{N_c} C_i$ is found. This matrix is used as a sphering transform. We commence by computing the eigendecomposition

$\tilde{C} = \sum_{i=1}^{N} l_i u_i u_i^T = U \Lambda U^T,$

where $U$ is the matrix with the eigenvectors of $\tilde{C}$ as columns and $\Lambda = \mathrm{diag}(l_1, l_2, \ldots, l_n)$ is the diagonal eigenvalue matrix. The sphered representation of the data is $X' = \Lambda^{-\frac{1}{2}} U^T X$. Standard PCA is then applied to the resulting data matrix $X'$. The purpose of this technique is to find a linear projection which describes the class differences rather than the overall variance of the data.

6.2 Multidimensional Scaling
Multidimensional scaling (MDS) is a procedure which allows data specified in terms of a matrix of pairwise distances to be embedded in a Euclidean space. Here, we intend to use the method to embed the graphs extracted from different viewpoints in a low-dimensional space. To commence, we require pairwise distances between graphs. We do this by computing the norms between the spectral pattern vectors for the graphs. For the graphs indexed $i_1$ and $i_2$, the distance is $d_{i_1, i_2} = (B_{i_1} - B_{i_2})^T (B_{i_1} - B_{i_2})$. The pairwise distances $d_{i_1, i_2}$ are used as the elements of an $N \times N$ dissimilarity matrix $R$, whose elements are defined as follows:

$R_{i_1, i_2} = \begin{cases} d_{i_1, i_2} & \text{if } i_1 \neq i_2 \\ 0 & \text{if } i_1 = i_2. \end{cases} \quad (8)$

In this paper, we use the classical multidimensional scaling method [7] to embed the graphs in a Euclidean space using the matrix of pairwise dissimilarities $R$. The first step of MDS is to calculate a matrix $T$ whose element with row $r$ and column $c$ is given by $T_{rc} = -\frac{1}{2} \left[ d_{rc}^2 - \hat{d}_{r.}^2 - \hat{d}_{.c}^2 + \hat{d}_{..}^2 \right]$, where $\hat{d}_{r.} = \frac{1}{N} \sum_{c=1}^{N} d_{rc}$ is the average dissimilarity value over the $r$th row, $\hat{d}_{.c}$ is the dissimilarity average value over the $c$th column, and $\hat{d}_{..} = \frac{1}{N^2} \sum_{r=1}^{N} \sum_{c=1}^{N} d_{r,c}$ is the average dissimilarity value over all rows and columns of the dissimilarity matrix $R$.

We subject the matrix $T$ to an eigenvector analysis to obtain a matrix of embedding coordinates $Y$. If the rank of $T$ is $k$, $k \leq N$, then we will have $k$ nonzero eigenvalues. We arrange these $k$ nonzero eigenvalues in descending order, i.e., $l_1 \geq l_2 \geq \ldots \geq l_k > 0$. The corresponding ordered eigenvectors are denoted by $u_i$, where $l_i$ is the $i$th eigenvalue. The embedding coordinate system for the graphs is $Y = [\sqrt{l_1} u_1, \sqrt{l_2} u_2, \ldots, \sqrt{l_s} u_s]$. For the graph indexed $j$, the embedded vector of coordinates is a row of matrix $Y$, so $y_j = (Y_{j,1}, Y_{j,2}, \ldots, Y_{j,s})^T$.
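A hedged sketch of the classical MDS steps of Section 6.2, written as we read the formulas above (pairwise pattern-vector distances, the double-centered matrix $T$, and coordinates from the leading eigenvectors); the function name and the random test data are ours, not the paper's.

import numpy as np

def classical_mds(B, s=2):
    """Classical MDS embedding of the pattern vectors (columns of B), following Section 6.2."""
    N = B.shape[1]
    # pairwise distances d_{i1,i2} = (B_i1 - B_i2)^T (B_i1 - B_i2)
    d = np.array([[np.sum((B[:, i] - B[:, j]) ** 2) for j in range(N)] for i in range(N)])
    dr = d.mean(axis=1, keepdims=True)           # row averages
    dc = d.mean(axis=0, keepdims=True)           # column averages
    dd = d.mean()                                 # grand average
    T = -0.5 * (d ** 2 - dr ** 2 - dc ** 2 + dd ** 2)
    lam, U = np.linalg.eigh(T)                    # ascending eigenvalues
    lam, U = lam[::-1], U[:, ::-1]                # reorder to descending
    lam = np.clip(lam[:s], 0.0, None)
    return U[:, :s] * np.sqrt(lam)                # row j holds the coordinates y_j

# hypothetical use with 10 random 6-dimensional pattern vectors
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 10))
Y = classical_mds(B, s=2)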
6.3 Locality Preserving Projection
Our next pattern space embedding method is He and Niyogi's Locality Preserving Projections (LPP) [10]. LPP is a linear dimensionality reduction method which attempts to project high-dimensional data to a low-dimensional manifold, while preserving the neighborhood structure of the data set. The method is relatively insensitive to outliers and noise. This is an important feature in our graph clustering task since the outliers are usually introduced by imperfect segmentation processes. The linearity of the method makes it computationally efficient.

The graph feature vectors are used as the columns of a data-matrix $X = (B_1 | B_2 | \ldots | B_N)$. The relational structure of the data is represented by a proximity weight matrix $W$ with elements $W_{i_1, i_2} = \exp[-k d_{i_1, i_2}]$, where $k$ is a constant. If $Q$ is the diagonal degree matrix with the row weights $Q_{k,k} = \sum_{j=1}^{N} W_{k,j}$ as elements, then the relational structure of the data is represented using the Laplacian matrix $J = Q - W$. The idea behind LPP is to analyze the structure of the weighted covariance matrix $X W X^T$. The optimal projection of the data is found by solving the generalized eigenvector problem

$X J X^T u = \lambda X Q X^T u. \quad (9)$

We project the data onto the space spanned by the eigenvectors corresponding to the $s$ smallest eigenvalues. Let $U = (u_1, u_2, \ldots, u_s)$ be the matrix with the corresponding eigenvectors as columns; then the projection of the $k$th feature vector is $y_k = U^T B_k$.
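A minimal sketch of the LPP projection of (9), assuming a full-rank data matrix so that the right-hand side matrix is positive definite; the implementation, constant k, and test data are ours, not the authors' code.

import numpy as np
from scipy.linalg import eigh

def lpp_embed(B, s=2, k=1.0):
    """Locality preserving projection of the pattern vectors (columns of B) via equation (9)."""
    N = B.shape[1]
    # pairwise squared distances between pattern vectors
    d = np.array([[np.sum((B[:, i] - B[:, j]) ** 2) for j in range(N)] for i in range(N)])
    W = np.exp(-k * d)                      # proximity weight matrix
    Q = np.diag(W.sum(axis=1))              # diagonal degree matrix
    J = Q - W                                # Laplacian of the data proximity structure
    A = B @ J @ B.T
    M = B @ Q @ B.T
    lam, U = eigh(A, M)                      # generalized problem A u = lambda M u, ascending
    U = U[:, :s]                             # eigenvectors for the s smallest eigenvalues
    return U.T @ B                           # s x N matrix whose columns are the y_k

# hypothetical use: 10 random 6-dimensional pattern vectors
rng = np.random.default_rng(0)
B = rng.normal(size=(6, 10))
Y = lpp_embed(B, s=2)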
7 EXPERIMENTS
There are three aspects to the experimental evaluation of the techniques reported in this paper. We commence with a study on synthetic data aimed at evaluating the ability of the spectral features to distinguish between graphs under controlled structural error. The second part of the study focuses on real-world data and assesses whether the spectral feature vectors can be embedded in a pattern space that reveals cluster-structure. Here, we explore the two different applications of view-based polyhedral object recognition and shock graph recognition. The latter application focuses on the use of the complex variant of the property matrix.

7.1 Synthetic Data
We commence by examining the ability of the spectral feature set and distance measure to separate both structurally related and structurally unrelated graphs. This study utilizes random graphs which consist of 30 nodes and 140 edges. The edges are generated by connecting random pairs of nodes. From each seed graph constructed in this way, we generate structurally related graphs by applying random edit operations to simulate the effect of noise. The structural variants are generated from a seed graph by either deleting an edge or inserting a new random edge. Each of these operations is assigned unit cost and, therefore, a graph with a deleted edge has unit edit distance to the original seed graph. In Fig. 1a, we have plotted the distribution of feature vector distance $d_{i_1, i_2} = (B_{i_1} - B_{i_2})^T (B_{i_1} - B_{i_2})$ for two sets of graphs. The leftmost curve shows the distribution of distances when we take individual seed graphs and remove a single edge at random. The rightmost curve is the distribution of distance obtained when we compare the distinct seed graphs. In the case of the random edge-edits, the modal distance between feature vectors is much less than that for the structurally distinct seed graphs. Hence, the distance between feature vectors appears to provide scope for distinguishing between distinct graphs when there are small variations in edge structure due to noise.

The results presented in Table 1 take this study one step further and demonstrate the performance under different levels of corruption. Here, we have computed the confusion probability, i.e., the overlap between the distributions of feature-vector distance for the edited graphs and the seed graphs, as a

TABLE 1
Performance of Feature Set for Edited Graphs
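The protocol above can be approximated by the following hedged simulation (our own, not the authors' protocol or data): generate a random 30-node, 140-edge seed graph, delete one edge at random, and compare the Laplacian-polynomial pattern vectors of the two graphs. Helper names and the feature construction details mirror the sketches given earlier.

import numpy as np

def random_graph(n=30, m=140, rng=None):
    """Random undirected graph with m distinct edges over n nodes."""
    rng = rng or np.random.default_rng()
    A = np.zeros((n, n))
    edges = set()
    while len(edges) < m:
        a, b = rng.integers(0, n, size=2)
        if a != b:
            edges.add((min(a, b), max(a, b)))
    for a, b in edges:
        A[a, b] = A[b, a] = 1.0
    return A

def pattern_vector(A):
    """Laplacian spectral matrix -> elementary symmetric polynomial features (one column at a time)."""
    L = np.diag(A.sum(axis=1)) - A
    lam, E = np.linalg.eigh(L)
    Phi = E * np.sqrt(np.clip(lam, 0.0, None))
    feats = []
    for i in range(Phi.shape[1]):
        coeffs = np.poly(Phi[:, i])                 # 1, -S_1, S_2, ...
        feats.extend((-1) ** r * coeffs[r] for r in range(1, len(coeffs)))
    return np.array(feats)

rng = np.random.default_rng(3)
A = random_graph(rng=rng)
B_seed = pattern_vector(A)

A_edit = A.copy()
existing = np.argwhere(np.triu(A_edit) > 0)
a, b = existing[rng.integers(len(existing))]         # delete one edge chosen at random
A_edit[a, b] = A_edit[b, a] = 0.0
B_edit = pattern_vector(A_edit)

d = (B_seed - B_edit) @ (B_seed - B_edit)             # feature-vector distance between the two graphs
print(d)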
Fig. 2. (a) Distance between feature vectors versus graph edit distance and (b) MDS applied to three random graph-sets.
Fig. 3. Classification error rates.
Fig. 4. Example images from the CMU, MOVI, and chalet sequences and their corresponding graphs.
we show examples of the raw image data and the associated graphs for the three toy houses which we refer to as CMU/VASC, MOVI, and Swiss Chalet.

In Fig. 5, we compare the results obtained with the different embedding strategies and the different graph features. In Column 1, we show the results obtained when PCA is used, Column 2 when MDS is used, and Column 3 when LPP is used. The top row shows the results obtained using a standard spectral feature, namely, the ordered eigenvalues of the Laplacian spectrum, i.e., $B_k = (\lambda_1^k, \lambda_2^k, \ldots)^T$. The second row shows the results obtained when the symmetric polynomials are computed using the spectral matrix for the Laplacian. The different image sequences are displayed in various shades of gray and black.

There are a number of conclusions that can be drawn from this plot. First, the most distinct clusters are produced when either MDS or LPP are used. Second, the best spectral features seem to be the symmetric polynomials computed from the Laplacian spectral matrix. To display the cluster-structure obtained, in Fig. 6, we visualize the results obtained using LPP and the Laplacian polynomials by placing thumbnails of the original images in the space spanned by the leading two eigenvectors. The different objects form well-separated and compact clusters.

To take this study further, we investigate the effects of applying the embedding methods to two of the sequences separately. Fig. 7 shows the results obtained with the Laplacian polynomials for the MOVI and Chalet sequences. Column 1 is for PCA, Column 2 is for MDS, and Column 3 is for LPP. In the case of both image sequences, LPP results in a smooth trajectory as the different views in the sequence are traversed.

7.3 Shock Graphs
The final example focuses on the use of the complex property matrix representation and is furnished by shock trees which are a compact representation of curved boundary shape. There are a number of ways in which a shape graph can be computed [15], [35]. One of the most effective recent methods has been to use the Hamilton-Jacobi equations from classical mechanics to solve the eikonal equation for inward boundary motion [35]. The skeleton is the set of singularities in the boundary motion where opposing fronts collide and these can be characterized as locations where the divergence of the

Fig. 5. Clustering CMU, MOVI, and Chalet sequences. Row 1: Eigenvalues. Row 2: Laplacian matrix polynomials. Column 1: PCA, Column 2: MDS, and Column 3: LPP.
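The baseline feature used in the top row of Fig. 5 (the ordered Laplacian eigenvalues) can be sketched as follows, together with the zero padding used to accommodate graphs of different size and a PCA projection to two dimensions for display. This is our own hedged illustration; the graphs, sizes, and helper names are ours.

import numpy as np

def laplacian_spectrum_feature(A, n_max):
    """Baseline feature: ordered Laplacian eigenvalues, zero-padded to length n_max."""
    L = np.diag(A.sum(axis=1)) - A
    lam = np.sort(np.linalg.eigvalsh(L))[::-1]       # descending spectrum
    return np.pad(lam, (0, n_max - len(lam)))

def pca_embed(B, s=2):
    """Project the columns of B onto the s leading principal components (Section 6.1)."""
    Xc = B - B.mean(axis=1, keepdims=True)
    C = Xc @ Xc.T
    lam, U = np.linalg.eigh(C)                       # ascending eigenvalues
    U = U[:, ::-1][:, :s]                            # s leading eigenvectors
    return U.T @ Xc                                  # s x N embedded coordinates

# hypothetical: random undirected graphs of different sizes, embedded together
rng = np.random.default_rng(1)
graphs = [(rng.random((n, n)) < 0.3).astype(float) for n in (8, 10, 12)]
graphs = [np.triu(G, 1) + np.triu(G, 1).T for G in graphs]     # symmetric, no self-loops
B = np.column_stack([laplacian_spectrum_feature(G, 12) for G in graphs])
Y = pca_embed(B, s=2)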
Fig. 7. Comparison of the MOVI and Chalet sequences with various projection methods. Column 1: PCA, Column 2: MDS, and Column 3: LPP.
Fig. 8. Examples from the shape database with their associated skeletons.
Fig. 9. (a) MDS, (b) PCA, and (c) LPP applied to the shock graphs.
attributed trees, any ambiguity is removed by the measurement information.

7.3.2 Experiments with Shock Trees
Our experiments are performed using a database of 42 binary shapes. Each binary shape is extracted from a 2D view of a 3D object. There are three classes in the database and, for each object, there are a number of views acquired from different viewing directions and a number of different examples of the class. We extract the skeleton from each binary shape and attribute the resulting tree in the manner outlined in Section 7.3.1. Fig. 8 shows some examples of the types of shape present in the database along with their skeletons.

We commence by showing some results for the three shapes shown in Fig. 8. The objects studied are a hand, some puppies, and some cars. The dog and car shapes consist of a number of different objects and different views of each object. The hand category contains different hand configurations. We apply the three embedding strategies outlined in Section 6 to the vectors of permutation invariants extracted from the Hermitian variant of the Laplacian. We commence in Fig. 9a by showing the result of applying the MDS procedure to the three shape categories. The "hand" shapes form a compact cluster in the MDS space. There are other local clusters consisting of three or four members of the remaining two classes. This reflects the fact that, while the hand shapes have very similar shock graphs, the remaining two categories have rather variable shock graphs because of the different objects.

Fig. 9b shows the result of using PCA. Here, the distributions of shapes are much less compact. While a distinct cluster of hand shapes still occurs, they are generally more dispersed over the feature space. There are some distinct clusters of the car shape, but the distributions overlap more in the PCA projection when compared to the MDS space.

Fig. 9c shows the result of the LPP procedure on the data set. The results are similar to the PCA method.

One of the motivations for the work presented here was the potential ambiguities that are encountered when using the spectral features of trees. To demonstrate the effect of using attributed trees rather than simply weighting the edges, we have compared the LDA projections using both types of data. Fig. 10 illustrates the result of this comparison. Fig. 10b shows the result obtained using the symmetric polynomials from the eigenvectors of the Laplacian matrix $L = D - W$ associated with the edge weight matrix. Fig. 10a shows the result of using the Hermitian property matrix. The Hermitian property matrix for the attributed trees produces a better class separation than the Laplacian matrix for the weighted trees. The separation can be measured by the Fisher discriminant between the classes, which is the squared distance between class centers divided by the variance along the line joining the centers. For the Hermitian property matrix, the separations are 1.12 for the car/dog classes, 0.97 for the car/hand, and 0.92 for the dog/hand. For the weighted matrix, the separations are 0.77 for the car/dog, 1.02 for the car/hand, and 0.88 for the dog/hand.

Fig. 10. A comparison of attributed trees with weighted trees. (a) Trees with edge weights based on boundary lengths. (b) Attributed trees with additional edge angle information.
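The separation measure quoted above can be computed with the short sketch below. It is our own hedged reading of the description (squared distance between class centers divided by the variance along the line joining the centers, here pooled over both classes); the function name and the synthetic two-class data are ours.

import numpy as np

def fisher_separation(Xa, Xb):
    """Squared distance between class centers divided by the variance of the
    projections of all samples onto the line joining the centers."""
    ma, mb = Xa.mean(axis=0), Xb.mean(axis=0)
    w = mb - ma
    w = w / np.linalg.norm(w)                     # unit vector along the joining line
    proj = np.concatenate([Xa @ w, Xb @ w])       # 1D projections of both classes
    return np.sum((mb - ma) ** 2) / proj.var()

# hypothetical embedded coordinates for two classes
rng = np.random.default_rng(2)
cars = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(20, 2))
dogs = rng.normal(loc=[1.5, 0.3], scale=0.5, size=(25, 2))
print(fisher_separation(cars, dogs))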
8 CONCLUSIONS
In this paper, we have shown how graphs can be converted into pattern vectors by utilizing the spectral decomposition of the Laplacian matrix and basis sets of symmetric polynomials. These feature vectors are complete, unique, and continuous. However, and most important of all, they are permutation invariants. We investigate how to embed the vectors in a pattern space, suitable for clustering the graphs. Here, we explore a number of alternatives including
PCA, MDS, and LPP. In an experimental study, we show that the feature vectors derived from the symmetric polynomials of the Laplacian spectral decomposition yield good clusters when MDS or LPP are used.

There are clearly a number of ways in which the work presented in this paper can be developed. For instance, since the representation based on the symmetric polynomials is complete, they may form the means by which a generative model of variations in graph structure can be developed. This model could be learned in the space spanned by the permutation invariants, and the mean graph and its modes of variation reconstructed by inverting the system of equations associated with the symmetric polynomials.

ACKNOWLEDGMENTS
Bin Luo is partially supported by the National Natural Science Foundation of China (No. 60375010).

REFERENCES
[1] A.D. Bagdanov and M. Worring, "First Order Gaussian Graphs for Efficient Structure Classification," Pattern Recognition, vol. 36, pp. 1311-1324, 2003.
[2] M. Belkin and P. Niyogi, "Laplacian Eigenmaps for Dimensionality Reduction and Data Representation," Neural Computation, vol. 15, no. 6, pp. 1373-1396, 2003.
[3] N. Biggs, Algebraic Graph Theory. Cambridge Univ. Press, 1993.
[4] P. Botti and R. Morris, "Almost All Trees Share a Complete Set of Immanantal Polynomials," J. Graph Theory, vol. 17, pp. 467-476, 1993.
[5] W.J. Christmas, J. Kittler, and M. Petrou, "Structural Matching in Computer Vision Using Probabilistic Relaxation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, pp. 749-764, 1995.
[6] F.R.K. Chung, Spectral Graph Theory. Am. Math. Soc., CBMS series 92, 1997.
[7] T. Cox and M. Cox, Multidimensional Scaling. Chapman-Hall, 1994.
[8] D. Cvetkovic, P. Rowlinson, and S. Simic, Eigenspaces of Graphs. Cambridge Univ. Press, 1997.
[9] S. Gold and A. Rangarajan, "A Graduated Assignment Algorithm for Graph Matching," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, pp. 377-388, 1996.
[10] X. He and P. Niyogi, "Locality Preserving Projections," Advances in Neural Information Processing Systems 16, MIT Press, 2003.
[11] D. Heckerman, D. Geiger, and D.M. Chickering, "Learning Bayesian Networks: The Combination of Knowledge and Statistical Data," Machine Learning, vol. 20, pp. 197-243, 1995.
[12] I.T. Jolliffe, Principal Components Analysis. Springer-Verlag, 1986.
[13] R. Kannan et al., "On Clusterings: Good, Bad and Spectral," Proc. 41st Symp. Foundation of Computer Science, pp. 367-377, 2000.
[14] Y. Kesselman, A. Shokoufandeh, M. Demerici, and S. Dickinson, "Many-to-Many Graph Matching via Metric Embedding," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 850-857, 2003.
[15] B.B. Kimia, A.R. Tannenbaum, and S.W. Zucker, "Shapes, Shocks, and Deformations I: The Components of 2-Dimensional Shape and the Reaction-Diffusion Space," Int'l J. Computer Vision, vol. 15, pp. 189-224, 1995.
[16] S. Kosinov and T. Caelli, "Inexact Multisubgraph Matching Using Graph Eigenspace and Clustering Models," Proc. Joint IAPR Int'l Workshops Structural, Syntactic, and Statistical Pattern Recognition, SSPR 2002 and SPR 2002, pp. 133-142, 2002.
[17] L. Lovasz, "Random Walks on Graphs: A Survey," Bolyai Soc. Math. Studies, vol. 2, pp. 1-46, 1993.
[18] B. Luo and E.R. Hancock, "Structural Matching Using the EM Algorithm and Singular Value Decomposition," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 23, pp. 1120-1136, 2001.
[19] B. Luo, A. Torsello, A. Robles-Kelly, R.C. Wilson, and E.R. Hancock, "A Probabilistic Framework for Graph Clustering," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 912-919, 2001.
[20] B. Luo, R.C. Wilson, and E.R. Hancock, "Eigenspaces for Graphs," Int'l J. Image and Graphics, vol. 2, pp. 247-268, 2002.
[21] B. Mohar, "Laplace Eigenvalues of Graphs—A Survey," Discrete Math., vol. 109, pp. 171-183, 1992.
[22] A. Munger, H. Bunke, and X. Jiang, "Combinatorial Search vs. Genetic Algorithms: A Case Study Based on the Generalized Median Graph Problem," Pattern Recognition Letters, vol. 20, pp. 1271-1279, 1999.
[23] M. Pavan and M. Pelillo, "Dominant Sets and Hierarchical Clustering," Proc. Ninth IEEE Int'l Conf. Computer Vision, vol. I, pp. 362-369, 2003.
[24] P. Perona and W.T. Freeman, "A Factorization Approach to Grouping," Proc. European Conf. Computer Vision, pp. 655-670, 1998.
[25] S. Rizzi, "Genetic Operators for Hierarchical Graph Clustering," Pattern Recognition Letters, vol. 19, pp. 1293-1300, 1998.
[26] A. Robles-Kelly and E.R. Hancock, "An Expectation-Maximisation Framework for Segmentation and Grouping," Image and Vision Computing, vol. 20, pp. 725-738, 2002.
[27] A. Robles-Kelly and E.R. Hancock, "Edit Distance from Graph Spectra," Proc. Ninth IEEE Int'l Conf. Computer Vision, vol. I, pp. 127-135, 2003.
[28] S. Roweis and L. Saul, "Non-Linear Dimensionality Reduction by Locally Linear Embedding," Science, vol. 299, pp. 2323-2326, 2002.
[29] S. Sarkar and K.L. Boyer, "Quantitative Measures of Change Based on Feature Organization: Eigenvalues and Eigenvectors," Computer Vision and Image Understanding, vol. 71, pp. 110-136, 1998.
[30] G.L. Scott and H.C. Longuet-Higgins, "Feature Grouping by Relocalisation of Eigenvectors of the Proximity Matrix," Proc. British Machine Vision Conf., pp. 103-108, 1990.
[31] J. Segen, "Learning Graph Models of Shape," Proc. Fifth Int'l Conf. Machine Learning, J. Laird, ed., pp. 29-25, 1988.
[32] K. Sengupta and K.L. Boyer, "Organizing Large Structural Modelbases," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, 1995.
[33] J. Shi and J. Malik, "Normalized Cuts and Image Segmentation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, pp. 888-905, 2000.
[34] A. Shokoufandeh, S. Dickinson, K. Siddiqi, and S. Zucker, "Indexing Using a Spectral Coding of Topological Structure," Proc. IEEE Conf. Computer Vision and Pattern Recognition, pp. 491-497, 1999.
[35] K. Siddiqi, A. Shokoufandeh, S.J. Dickinson, and S.W. Zucker, "Shock Graphs and Shape Matching," Int'l J. Computer Vision, vol. 35, pp. 13-32, 1999.
[36] J.B. Tenenbaum, V.D. Silva, and J.C. Langford, "A Global Geometric Framework for Non-Linear Dimensionality Reduction," Science, vol. 290, pp. 586-591, 2000.
[37] A. Torsello and E.R. Hancock, "A Skeletal Measure of 2D Shape Similarity," Proc. Fourth Int'l Workshop Visual Form, pp. 594-605, 2001.
[38] S. Umeyama, "An Eigen Decomposition Approach to Weighted Graph Matching Problems," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 10, pp. 695-703, 1988.
[39] B.J. van Wyk and M.A. van Wyk, "Kronecker Product Graph Matching," Pattern Recognition, vol. 39, no. 9, pp. 2019-2030, 2003.
[40] A.K.C. Wong, J. Constant, and M.L. You, "Random Graphs," Syntactic and Structural Pattern Recognition, World Scientific, 1990.

Richard C. Wilson received the BA degree in physics from the University of Oxford in 1992. In 1996, he completed the PhD degree at the University of York in the area of pattern recognition. From 1996 to 1998, he worked as a research associate at the University of York. After a period of postdoctoral research, he was awarded an Advanced Research Fellowship in 1998. In 2003, he took up a lecturing post and he is now a reader in the Department of Computer Science at the University of York. He has published more than 100 papers in journals, edited books, and refereed conferences. He received an outstanding paper award in the 1997 Pattern Recognition Society awards and has won the best paper prize in ACCV 2002. He is currently an associate editor of the journal Pattern Recognition. His research interests are in statistical and structural pattern recognition, graph methods for computer vision, high-level vision, and scene understanding. He is a member of the IEEE Computer Society.

Edwin R. Hancock studied physics as an undergraduate at the University of Durham and graduated with honors in 1977. He remained at Durham to complete the PhD degree in the area of high-energy physics in 1981. Following this, he worked for 10 years as a researcher in the fields of high-energy nuclear physics and pattern recognition at the Rutherford-Appleton Laboratory (now the Central Research Laboratory of the Research Councils). In 1991, he moved to the University of York as a lecturer in the Department of Computer Science. He was promoted to senior lecturer in 1997 and to reader in 1998. In 1998, he was appointed to a chair in computer vision. Professor Hancock now leads a group of some 15 faculty, research staff, and PhD students working in the areas of computer vision and pattern recognition. His main research interests are in the use of optimization and probabilistic methods for high and intermediate level vision. He is also interested in the methodology of structural and statistical pattern recognition. He is currently working on graph-matching, shape-from-X, image databases, and statistical learning theory. He has published more than 90 journal papers and 350 refereed conference publications. He was awarded the Pattern Recognition Society medal in 1991 and an outstanding paper award in 1997 by the journal Pattern Recognition. In 1998, he became a fellow of the International Association for Pattern Recognition. He has been a member of the editorial boards of the journals IEEE Transactions on Pattern Analysis and Machine Intelligence and Pattern Recognition. He has also been a guest editor for special editions of the journals Image and Vision Computing and Pattern Recognition.

Bin Luo received the BEng degree in electronics and the MEng degree in computer science from Anhui University of China in 1984 and 1991, respectively. From 1996 to 1997, he worked as a British Council visiting scholar at the University of York under the Sino-British Friendship Scholarship Scheme (SBFSS). In 2002, he was awarded the PhD degree in computer science from the University of York, United Kingdom. He is at present a professor at Anhui University of China. He has published more than 60 papers in journals, edited books, and refereed conferences. His current research interests include graph spectral analysis, large image database retrieval, image and graph matching, statistical pattern recognition, and image feature extraction.