Concepedia

Abstract

We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and molecular databases. Suppose we define the distance between two CUAL graphs G 1 and G 2 to be the weighted number of edit operations (insert node, delete node and relabel node) to transform G 1 to G 2 . By reduction from exact cover by 3-sets, one can show that finding the distance between two CUAL graphs is NP-complete. In view of the hardness of the problem, we propose a constrained distance metric, called the degree-2 distance, by requiring that any node to be inserted (deleted) have no more than 2 neighbors. With this metric, we present an efficient algorithm to solve the problem. The algorithm runs in time O(N 1 N 2 D 2 ) for general weighting edit operations and in time [Formula: see text] for integral weighting edit operations, where N i , i=1, 2, is the number of nodes in G i , D=min{d 1 , d 2 } and d i is the maximum degree of G i .