VF2 algorithm: In the VF2 algorithm, those candidates (iq, i) are pruned if iq is not connected to already matched nodes in Gq (i.e., nodes of Gq included in M). Subsequently, the pruning step also removes those node-pairs (iq, i) in which i is not connected to the matched nodes in the data graph G. These pruning rules assume that the query and data graphs are connected. The algorithm also compares the number of neighbor nodes of each of i and iq that are connected to nodes in M but are not included in M. The number of such nodes in the data graph must be no smaller than the number of such nodes in the query graph. Finally, the number of neighbor nodes of each of i and iq that are not directly connected to nodes in M are compared. The number of such nodes in the data graph must be no smaller than the number of such nodes in the query graph.
Sequencing optimizations: The effectiveness of the pruning steps is sensitive to the order in which nodes are added to the matching set M. In general, nodes with rarer labels in the query graph should be selected first in the exploration of different can-didate pairs in C. Rarer labels can be matched in fewer ways across graphs. Early exploration of rare labels leads to exploration of more relevant partial matches M at the earlier levels of the recursion. This also helps the pruning effectiveness. Enhanced versions of VF2 and QuickSI combine node sequencing and the aforementioned node pruning steps.
The reader is referred to the bibliographic notes for details of these algorithms. The defi-nition of subgraph isomorphism in this section assumes that all edges of the node-induced subgraph of the data graph are present in the query graph. In some applications, such as frequent subgraph mining, a more general definition is used, in which any subset of edges of the node-induced subgraph is also considered a subgraph isomorphism. The more gen-eral case can be solved with minor changes to the basic algorithm in which the criteria to generate candidates and validate them are both relaxed appropriately.
17.2.2 Maximum Common Subgraph (MCG) Problem
The MCG problem is a generalization of the subgraph isomorphism problem. The MCG between two graphs is at most equal to the smaller of the two, when one is a subgraph of the other. The basic principles of subgraph isomorphism algorithms can be extended easily to the MCG isomorphism problem. The following will discuss the extension of the Ullman algorithm to the MCG problem. The main differences between these methods are in terms of the pruning criteria and the fact that the maximum common subgraph is continuously tracked over the course of the algorithm as the search space of subgraphs is explored.
The recursive exploration process of the MCG algorithm is identical to that of the subgraph isomorphism algorithm. The algorithm is illustrated in Fig. 17.5. The two input graphs are denoted by G1 and G2, respectively. As in the case of subgraph matching, the current matching in the recursive exploration is denoted by the set M. For each matching node-pair (i1, i 2) ∈ M, it is assumed that i1 is drawn from G1, and i2 is drawn from G2. Another input parameter to the algorithm is the current best (largest) matching set of node-pairs Mbest. Both M and Mbest are initialized to null in the initial call made to the recursive algorithm by the analyst. Strictly speaking, each recursive call determines the best matching under the constraint that the pairs in M must be matched. This is the reason that this parameter is set to null at the top-level recursive call. However, in lower level calls, the value of M is not null.
Dostları ilə paylaş: |