Figure 19.8: Scaling relationships between random walk and symmetric versions of spectral clustering. [Random walk version: Minimize trace(Y^T L Y) subject to Y^T Λ Y = I; note that neither the rows nor the columns of matrix Y have unit norm. Symmetric version: Minimize trace(Z^T Λ^{-1/2} L Λ^{-1/2} Z) subject to Z^T Z = I; note that the rows of matrix Z will not have unit norm.]
intuitively generalize to the relaxed version of the problem because eigenvectors have both positive and negative components.
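The scaling relationship summarized in Figure 19.8 can be checked numerically. The following minimal sketch (using NumPy on a small hypothetical 5-node graph; the example graph and all variable names are illustrative assumptions, not part of the text) computes the symmetric embedding Z from the smallest eigenvectors of Λ^{-1/2} L Λ^{-1/2} and recovers the random walk embedding as Y = Λ^{-1/2} Z, so that Z = Λ^{1/2} Y.

import numpy as np

# Small hypothetical 5-node graph used only for illustration (an assumption, not from the text).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
W = np.zeros((5, 5))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0
deg = W.sum(axis=1)
Lam = np.diag(deg)                        # degree matrix Λ
L = Lam - W                               # unnormalized Laplacian
Lam_isqrt = np.diag(1.0 / np.sqrt(deg))   # Λ^{-1/2}

# Symmetric version: smallest eigenvectors of Λ^{-1/2} L Λ^{-1/2}; columns of Z are orthonormal.
vals, vecs = np.linalg.eigh(Lam_isqrt @ L @ Lam_isqrt)   # eigh returns ascending eigenvalues
k = 2
Z = vecs[:, :k]

# Random walk version: rescaling by Λ^{-1/2} turns the constraint Z^T Z = I into Y^T Λ Y = I.
Y = Lam_isqrt @ Z

print(np.allclose(Z.T @ Z, np.eye(k)))             # True: Z^T Z = I
print(np.allclose(Y.T @ Lam @ Y, np.eye(k)))       # True: Y^T Λ Y = I
print(np.allclose(np.diag(np.sqrt(deg)) @ Y, Z))   # True: Z = Λ^{1/2} Y (the scaling of Figure 19.8)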
19.3.4.1 Important Observations and Intuitions
A few observations are noteworthy about the relationships between spectral clustering,
PageRank, and eigenvector analysis:
Normalized random walk Laplacian: The smallest right eigenvectors of Λ^{-1}L = Λ^{-1}(Λ − W) = I − P are used for the random walk embedding, where P is the stochastic transition matrix of the graph. The smallest right eigenvectors of I − P are the same as the largest right eigenvectors of P. The largest right eigenvector of P has eigenvalue 1. It is noteworthy that the largest left eigenvector of P, which also has eigenvalue 1, yields the PageRank of the graph. Therefore, both the left and the right eigenvectors of the stochastic transition matrix P yield different insights about the network.
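A minimal numerical check of these eigenvector relationships is sketched below, reusing the hypothetical 5-node graph from the earlier sketch. Note that the teleportation component of PageRank is omitted here, so the left eigenvector computed below corresponds to the steady-state probabilities of the pure random walk.

import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
W = np.zeros((5, 5))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0
deg = W.sum(axis=1)
P = W / deg[:, None]                     # stochastic transition matrix P = Λ^{-1} W

# The largest right eigenvalue of P is 1, with a constant (all-ones) eigenvector.
evals, rvecs = np.linalg.eig(P)
top = np.argmax(evals.real)
print(np.isclose(evals[top].real, 1.0))             # True: eigenvalue 1
v = rvecs[:, top].real
print(np.allclose(v / v[0], np.ones(len(v))))       # True: constant eigenvector

# The largest left eigenvector of P (also eigenvalue 1) is the stationary distribution
# of the walk; for a connected undirected graph it is proportional to the node degrees.
levals, lvecs = np.linalg.eig(P.T)
pi = lvecs[:, np.argmax(levals.real)].real
pi = pi / pi.sum()
print(np.allclose(pi, deg / deg.sum()))             # True: π_i proportional to degree of node i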
Normalized symmetric Laplacian: The smallest eigenvectors of the symmetric Laplacian Λ^{-1/2}(Λ − W)Λ^{-1/2} are the same as the largest eigenvectors of the symmetric matrix Λ^{-1/2}W Λ^{-1/2}. The matrix Λ^{-1/2}W Λ^{-1/2} can be viewed as a normalized and sparsified similarity matrix of the graph. Most forms of nonlinear embeddings such as SVD, Kernel PCA, and ISOMAP are extracted as large eigenvectors of similarity matrices (cf. Table 2.3 of Chap. 2). It is the choice of the similarity matrix that regulates the varying properties of these different embeddings.
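The equivalence between the smallest eigenvectors of the symmetric Laplacian and the largest eigenvectors of the normalized similarity matrix can also be verified directly. The sketch below again assumes the hypothetical 5-node graph used in the earlier examples.

import numpy as np

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
W = np.zeros((5, 5))
for i, j in edges:
    W[i, j] = W[j, i] = 1.0
deg = W.sum(axis=1)
D_isqrt = np.diag(1.0 / np.sqrt(deg))
S = D_isqrt @ W @ D_isqrt                # normalized similarity: entries w_ij / sqrt(d_i * d_j)
L_sym = np.eye(len(deg)) - S             # symmetric Laplacian Λ^{-1/2}(Λ - W)Λ^{-1/2}

s_vals, s_vecs = np.linalg.eigh(S)       # ascending eigenvalues
l_vals, l_vecs = np.linalg.eigh(L_sym)

# Eigenvalues satisfy λ_Laplacian = 1 - λ_similarity, so the ordering simply reverses.
print(np.allclose(l_vals, 1.0 - s_vals[::-1]))                 # True
# The largest eigenvector of S matches the smallest eigenvector of L_sym (up to sign).
print(np.isclose(abs(s_vecs[:, -1] @ l_vecs[:, 0]), 1.0))      # True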
Goal of normalization: Spectral clustering is more effective when the unnormalized Laplacian L is normalized with the node-degree matrix Λ. While it is possible to explain this behavior with a cut-interpretation of spectral clustering, the intuition does not generalize easily to continuous embeddings with both positive and negative eigenvector components. A simpler way of understanding normalization is by examining the similarity matrix Λ^{-1/2}W Λ^{-1/2}, whose large eigenvectors yield the normalized spectral embedding. In this matrix, the edge similarities are normalized by the geometric mean of the node degrees at their end points. This can be viewed as a normalization of the edge similarities with a local measure of the network density. As discussed in Chap. 3, normalizing similarity and distance functions with local density is helpful even in the case of multidimensional data mining applications. One of the most well-known algorithms for outlier analysis in multidimensional data, referred to as LOF,