Publication | Closed Access
An iterative MapReduce approach to frequent subgraph mining in biological datasets
Citations: 58
References: 15
Year: 2012
Unknown Venue
Cluster Computing, Biological Datasets, Engineering, Pattern Discovery, Network Analysis, Pattern Mining, Subgraph Mining, Graph Database, Map-reduce, Data Science, Data Mining, Data Integration, Data Management, Systems Biology, Knowledge Discovery, Computer Science, Multifold Scalability, Bioinformatics, Frequent Pattern Mining, Graph Theory, Frequent Subgraphs, Association Rule, Computational Biology, Business, Structure Mining, Iterative Mapreduce Approach, Frequent Subgraph Mining, Big Data
Mining frequent subgraphs has attracted a great deal of attention in many areas, such as bioinformatics, web data mining, and social networks. Many promising main-memory-based techniques exist in this area, but they scale poorly because main memory is the bottleneck. On massive data, traditional database systems such as relational and object databases are also inefficient, because frequent subgraph mining is computationally intensive. Since Google introduced the MapReduce framework, a few researchers have applied the MapReduce model to mining frequent substructures in a single graph. In this paper, we propose using the MapReduce programming model, which achieves multifold scalability, on a set of labeled graphs. We tested our method on both real and synthetic datasets. To the best of our knowledge, this is the first attempt to mine transaction graphs using the MapReduce model.
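The abstract does not spell out the algorithm, so the following is only a minimal sketch of how support counting over a set of labeled graphs might be phrased as a map step and a reduce step. It covers a single iteration for single-edge patterns and simulates the shuffle in-process rather than on a cluster. The data, the function names (`map_phase`, `reduce_phase`, `frequent_edges`), and the canonical edge form are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict

# Hypothetical toy input: each transaction graph is (graph_id, labeled edges),
# where an edge is (source_label, edge_label, target_label).
GRAPHS = [
    (0, [("A", "x", "B"), ("B", "y", "C")]),
    (1, [("A", "x", "B"), ("A", "x", "B"), ("C", "z", "D")]),
    (2, [("B", "y", "C"), ("A", "x", "B")]),
]


def canonical(edge):
    """Assumed canonical form for an undirected labeled edge."""
    u, label, v = edge
    return (min(u, v), label, max(u, v))


def map_phase(graph_id, edges):
    """Emit (pattern, graph_id) once per distinct pattern in one graph."""
    for pattern in {canonical(e) for e in edges}:
        yield pattern, graph_id


def reduce_phase(pattern, graph_ids, min_support):
    """Support = number of distinct graphs that contain the pattern."""
    support = len(set(graph_ids))
    return (pattern, support) if support >= min_support else None


def frequent_edges(graphs, min_support):
    """Run one simulated map/shuffle/reduce round over all graphs."""
    shuffled = defaultdict(list)  # stands in for the framework's shuffle
    for graph_id, edges in graphs:
        for pattern, gid in map_phase(graph_id, edges):
            shuffled[pattern].append(gid)

    result = {}
    for pattern, gids in shuffled.items():
        kept = reduce_phase(pattern, gids, min_support)
        if kept is not None:
            result[kept[0]] = kept[1]
    return result


if __name__ == "__main__":
    print(frequent_edges(GRAPHS, min_support=2))
```

In an iterative scheme, each later round would presumably extend the surviving patterns by one edge and recount their support. That step needs candidate generation and a subgraph isomorphism test, neither of which is shown here.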