4.3 Graphical Representation
The similar pages are represented graphically using a simple tree structure. Nodes that are similar to each other are linked together using black lines. However, a complication arises due to the tendency of the similarity tree to have cycles.
When cycles occur, there are two choices: either draw the cycle explicitly, or ignore it and draw duplicate copies of the nodes instead. Since cycles are an inherent property of similarity relations, we consider it unacceptable to ignore them. However, the naïve method of drawing all cycles uniformly makes the tree appear very cluttered.
Thus, we propose a representation of the tree in which each node is linked by a solid line the first time it is drawn. Thereafter, each time another node points to it (thus creating a cycle), a dotted line is drawn whose lightness depends on the distance between the two nodes in the tree. Hence, long cycles are represented by lighter lines, while short cycles are represented by darker lines. We define the distance between two nodes as the difference in the depths of their first appearances. This is one of many possible definitions and is a possible subject of future investigation.
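The drawing rule described above can be illustrated with a minimal sketch. It is not the implementation used in this work, and it rests on several assumptions: the similarity links are given as a directed adjacency dictionary, the tree is expanded breadth-first from a chosen root, and the mapping from distance to gray level is a simple linear scale. The names classify_edges and gray_level are hypothetical and chosen only for illustration.

```python
from collections import deque


def classify_edges(graph, root):
    """Label each link as 'solid' (first appearance) or 'dotted' (cycle).

    graph: dict mapping a node to the list of nodes it points to
           (assumed input format, not taken from the paper).
    Returns (depth, edges), where depth maps each node to the depth of
    its first appearance and edges is a list of
    (source, target, style, distance) tuples.
    """
    depth = {root: 0}
    edges = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, []):
            if target not in depth:
                # First time this node is drawn: solid line.
                depth[target] = depth[node] + 1
                edges.append((node, target, "solid", None))
                queue.append(target)
            else:
                # Node already drawn: dotted line, with distance defined
                # as the difference in depth of first appearance.
                distance = abs(depth[node] - depth[target])
                edges.append((node, target, "dotted", distance))
    return depth, edges


def gray_level(distance, max_distance):
    """Map a distance to a gray level (0.0 = black, larger = lighter).

    The linear scale and the 0.8 cap are assumptions made so that the
    lightest lines remain visible on a white background.
    """
    if max_distance == 0:
        return 0.0
    return 0.8 * distance / max_distance


# Hypothetical example: D is reached first through B; the links
# C -> D and D -> A are then drawn as dotted lines.
example = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"]}
depth, edges = classify_edges(example, "A")
for source, target, style, distance in edges:
    print(source, target, style, distance)
```

In this example the link from D back to A spans two levels and would be drawn lighter than the link from C to D, which spans one. A different traversal order would change the depths of first appearance and therefore the shading, which is one reason the distance definition remains open to further investigation.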