Publication | Closed Access
Incremental FP-Growth mining strategy for dynamic threshold value and database based on MapReduce
Citations: 20
References: 7
Year: 2014
Venue: unknown
Keywords: Cluster Computing, Engineering, Big Data Analytics, Big Data Era, Pattern Mining, Dynamic Threshold Value, Map-reduce, Mining Methods, Data Science, Data Mining, Large-scale Data, Threshold Value, Data Integration, Data Management, Knowledge Discovery, Computer Science, Evolutionary Data Mining, Frequent Pattern Mining, Association Rule, Data Stream Mining, Massive Data Processing, Big Data
With the arrival of the Big Data era, data mining faces new opportunities and challenges. Traditional association rule mining algorithms show limitations when applied to large-scale data. The Apriori algorithm repeatedly scans external storage, which causes high I/O load and poor performance. The FP-Growth algorithm is limited by main memory size, because its mining process relies on a large tree-shaped data structure. Moreover, despite notable progress in the field, problems remain in dynamic scenarios. This paper presents a parallelized incremental FP-Growth mining strategy based on MapReduce, aimed at processing large-scale data. The proposed incremental algorithm supports effective mining when the threshold value and the original database change at the same time. The algorithm is implemented on Hadoop, and the experimental results show that it has clear advantages.
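The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, not the paper's method. It covers just the item-counting pass that precedes FP-tree construction, written in a map/reduce style in plain Python rather than on Hadoop. It assumes that per-item counts from the original database are cached, so that when new transactions arrive and the threshold changes, only the new transactions need to be scanned. All function names and the sample data are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_count(partition):
    """Map step: count each item once per transaction in one partition."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial count tables."""
    return left + right


def count_supports(partitions):
    """Run the map step on every partition and fold the results together."""
    return reduce(reduce_counts, map(map_count, partitions), Counter())


def frequent_items(counts, n_transactions, min_support):
    """Return items whose relative support meets the threshold."""
    return {item for item, c in counts.items()
            if c / n_transactions >= min_support}


# Original database, split into two partitions (hypothetical data).
original = [
    [["a", "b"], ["b", "c"]],
    [["a", "b", "c"], ["b"]],
]
cached_counts = count_supports(original)
n_original = sum(len(p) for p in original)
initial = frequent_items(cached_counts, n_original, 0.5)

# Incremental step: new transactions arrive and the threshold changes.
# Only the increment is scanned; cached counts are reused.
increment = [[["c", "d"], ["d"]]]
updated_counts = cached_counts + count_supports(increment)
n_updated = n_original + sum(len(p) for p in increment)
updated = frequent_items(updated_counts, n_updated, 0.4)
```

Keeping the counts of infrequent items in the cache is the design choice that matters here: lowering the threshold later can then be handled without rescanning the original database. A full incremental FP-Growth would also need to maintain or rebuild the conditional trees, which this sketch does not attempt.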