Publication | Closed Access
SPIN
Citations: 326
References: 22
Year: 2004
Unknown Venue
Engineering, Graph Theory, Data Science, Data Mining, Frequent Pattern Mining, Pattern Discovery, Knowledge Discovery, Business, Pattern Mining, Structure Mining, Computer Science, New Algorithm, Large Graph Databases, Graph Processing, Big Data, Full Enumeration
One fundamental challenge in mining recurring subgraphs from semi-structured data sets is the overwhelming abundance of such patterns. In large graph databases, the total number of frequent subgraphs can become too large to allow a full enumeration using reasonable computational resources. In this paper, we propose a new algorithm that mines only maximal frequent subgraphs, i.e., subgraphs that are not part of any other frequent subgraph. In the best case, this may decrease the size of the output set exponentially; in our experiments on practical data sets, mining maximal frequent subgraphs reduces the total number of mined patterns by two to three orders of magnitude.

Our method first mines all frequent trees from a general graph database and then reconstructs all maximal subgraphs from the mined trees. Using two chemical structure benchmarks and a set of synthetic graph data sets, we demonstrate that, in addition to decreasing the output size, our algorithm can achieve a five-fold speedup over the current state-of-the-art subgraph mining algorithms.
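The definition of a maximal frequent subgraph can be illustrated with a small brute-force sketch. This is not the algorithm proposed in the paper (which mines frequent trees first and then reconstructs maximal subgraphs); it only shows how keeping maximal patterns shrinks the output. It assumes every vertex label is unique, so that subgraph containment reduces to a subset test on edge sets. The function names, the toy database, and the support threshold are all hypothetical.

```python
from itertools import combinations


def is_connected(edges):
    """Return True if the (non-empty) edge set forms one connected component."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    if not adj:
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return len(seen) == len(adj)


def frequent_subgraphs(database, min_support):
    """Brute force: every connected edge set contained in >= min_support graphs.

    Assumption: vertex labels are unique, so 'pattern is a subgraph of g'
    is simply 'pattern <= g' on edge sets. Exponential; toy inputs only.
    """
    all_edges = sorted(set().union(*database))
    frequent = []
    for size in range(1, len(all_edges) + 1):
        for combo in combinations(all_edges, size):
            pattern = frozenset(combo)
            support = sum(1 for g in database if pattern <= g)
            if support >= min_support and is_connected(pattern):
                frequent.append(pattern)
    return frequent


def maximal_only(frequent):
    """Keep patterns that are not a proper subset of another frequent pattern."""
    return [p for p in frequent if not any(p < q for q in frequent)]


# Hypothetical toy database: each graph is a set of undirected edges (u, v), u < v.
database = [
    frozenset({("a", "b"), ("b", "c"), ("c", "d")}),
    frozenset({("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")}),
    frozenset({("a", "b"), ("b", "c"), ("x", "y")}),
]

frequent = frequent_subgraphs(database, min_support=2)
maximal = maximal_only(frequent)
print(len(frequent), len(maximal))  # prints "6 1"
```

In this toy database, six connected subgraphs are frequent at a support of 2, but only the path a-b-c-d is maximal, and every other frequent pattern is contained in it.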