Group-average linkage: In this case, the distance between two groups of objects is equal to the average distance between all mi · mj pairs of objects in the groups. To compute the row (column) for the merged cluster in M , a weighted average of the ith and jth rows (columns) in the matrix M is used. For any value of k = i, j, this is
equal to
|
mi·Mik +mj
|
·Mjk
|
(for rows), and
|
mi·Mki+mj ·Mkj
|
(for columns).
|
|
mi+mj
|
|
|
mi+mj
|
Closest centroid: In this case, the closest centroids are merged in each iteration. This approach is not desirable, however, because the centroids lose information about the relative spreads of the different clusters. For example, such a method will not discrim-inate between merging pairs of clusters of varying sizes, as long as their centroid pairs are at the same distance. Typically, there is a bias toward merging pairs of larger clusters because centroids of larger clusters are statistically more likely to be closer to each other.
Variance-based criterion: This criterion minimizes the change in the objective function (such as cluster variance) as a result of the merging. Merging always results in a worsening of the clustering objective function value because of the loss of granularity. It is desired to merge clusters where the change (degradation) in the objective function as a result of merging is as little as possible. To achieve this goal, the zeroth, first, and second order moment statistics are maintained with each cluster. The average squared error SEi of the ith cluster can be computed as a function of the number mi of points in the cluster (zeroth-order moment), the sum Fir of the data points in the cluster i along each dimension r (first-order moment), and the squared sum Sir of the data points in the cluster i across each dimension r (second-order moment) according to the following relationship;
d
|
|
SEi = (Sir/mi − Fir2 /mi2).
|
(6.8)
|
r=1
|
|
This relationship can be shown using the basic definition of variance and is used by many clustering algorithms such as BIRCH (cf. Chap. 7). Therefore, for each cluster, one only needs to maintain these cluster-specific statistics. Such statistics are easy to maintain across merges because the moment statistics of a merge of the two clusters i and j can be computed easily as the sum of their moment statistics. Let SEi∪j denote the variance of a potential merge between the two clusters i and j. Therefore, the change in variance on executing a merge of clusters i and j is as follows:
ΔSEi∪j = SEi∪j − SEi − SEj .
|
(6.9)
|
This change can be shown to always be a positive quantity. The cluster pair with the smallest increase in variance because of the merge is selected as the relevant pair to
6.4. HIERARCHICAL CLUSTERING ALGORITHMS
|
171
|
|
|
Dostları ilə paylaş: |