17.2. MATCHING AND DISTANCE COMPUTATION IN GRAPHS
|
565
|
Algorithm MCG(Graphs: G1, G2, Current Partially Matched Pairs: M,
Current Best Match: Mbest)
begin
= Set of all label matching node pairs from (G1, G2) not in M; Prune C using heuristic methods; (Optional efficiency optimization) for each pair (i1, i2) ∈ C do
if M ∪ {(i1, i2)} is a valid matching
then Mbest =MCG(G1, G2, M ∪ {(i1, i2)}, Mbest);
endfor
if (|M| > |Mbest|) then return(M) else return(Mbest);
end
Figure 17.5: Maximum common subgraph (MCG) algorithm
As in the case of the subgraph isomorphism algorithm, the candidate matching node-pairs are explored recursively. The same steps of candidate extension and pruning are used in the MCG algorithm, as in the case of the subgraph isomorphism problem. However, some of the pruning steps used in the subgraph isomorphism algorithm, which are based on subgraph assumptions, can no longer be used. For example, in the MCG algorithm, a matching node-pair (i1 , i2) in M no longer needs to satisfy the constraint that the degree of a node in one graph is greater or less than that of its matching node in the other. Because of the more limited pruning in the maximum common subgraph problem, it will explore a larger search space. This is intuitively reasonable, because the maximum common subgraph problem is a more general one than subgraph isomorphism. However, some optimizations such as expanding only to connected nodes, and sequencing optimizations such as processing rare labels earlier, can still be used.
The largest common subgraph found so far is tracked in Mbest. At the end of the procedure, the largest matching subgraph found so far is returned by the algorithm. It is also relatively easy to modify this algorithm to determine all possible MCGs. The main difference is that all the current MCGs can be dynamically tracked instead of tracking a single MCG.
17.2.3 Graph Matching Methods for Distance Computation
Graph matching methods are closely related to distance computation between graphs. This is because pairs of graphs that share large subgraphs in common are likely to be more similar. A second way to compute distances between graphs is by using the edit distance. The edit distance in graphs is analogous to the notion of the edit distance in strings. Both these methods will be discussed in this section.
17.2.3.1 MCG-based Distances
When two graphs share a large subgraph in common, it is indicative of similarity. There are several ways of transforming the MCG size into a distance value. Some of these distance definitions have also been demonstrated to be metrics because they are nonnegative, sym-metric, and satisfy the triangle inequality. Let the MCG of graphs G1 and G2 be denoted by M CS(G1, G2) with a size of |M CS(G1, G2)|. Let the sizes of the graphs G1 and G2
566 CHAPTER 17. MINING GRAPH DATA
be denoted by |G1| and |G2|, respectively. The various distance measures are defined as a function of these quantities.
Unnormalized non-matching measure: The unnormalized non-matching distance mea-sure U (G1, G2) between two graphs is defined as follows:
Dostları ilə paylaş: |