Efficient algorithms for discovering association rules

TLDR

Association rules link itemsets to consequent items, and Agrawal, Imielinski, and Swami pioneered mining them from large datasets using successive database passes. This work proposes a faster algorithm for mining association rules. The algorithm uses combinatorial analysis of earlier passes to prune unnecessary candidate rules. Experiments on a university enrollment database show a five‑fold speedup over the prior method, and demonstrate that sampling is generally an efficient rule‑finding strategy.

Abstract

Association rules are statements of the form "for 90 % of the rows of the relation, if the row has value 1 in the columns in set W, then it has 1 also in column B". Agrawal, Imielinski, and Swami introduced the problem of mining association rules from large collections of data, and gave a method based on successive passes over the database. We give an improved algorithm for the problem. The method is based on careful combinatorial analysis of the information obtained in previous passes; this makes it possible to eliminate unnecessary candidate rules. Experiments on a university course enrollment database indicate that the method outperforms the previous one by a factor of 5. We also show that sampling is in general a very efficient way of finding such rules.
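To make the rule form in the abstract concrete, here is a minimal Python sketch. It assumes each row is represented as the set of columns holding value 1. The course names, the helper names (`support`, `confidence`, `candidates`), and the data are hypothetical. The pruning shown is the generic check that every subset found in the previous pass must be frequent; it illustrates how earlier passes can eliminate candidates and is not claimed to be the paper's exact procedure.

```python
from itertools import combinations

def support(rows, itemset):
    """Fraction of rows that have value 1 in every column of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for row in rows if itemset <= row) / len(rows)

def confidence(rows, w, b):
    """Among rows with 1 in all columns of W, the fraction with 1 in column B."""
    w = frozenset(w)
    matching = [row for row in rows if w <= row]
    if not matching:
        return 0.0
    return sum(1 for row in matching if b in row) / len(matching)

def candidates(frequent_prev):
    """Extend frequent size-k sets by one item, keeping a candidate only if
    all of its size-k subsets were frequent in the previous pass."""
    frequent_prev = set(frequent_prev)
    if not frequent_prev:
        return set()
    k = len(next(iter(frequent_prev)))
    items = sorted(set().union(*frequent_prev))
    result = set()
    for s in frequent_prev:
        for item in items:
            if item in s:
                continue
            cand = s | {item}
            if all(frozenset(sub) in frequent_prev
                   for sub in combinations(cand, k)):
                result.add(cand)
    return result

# Hypothetical enrollment data: each row lists the courses a student took.
rows = [
    frozenset({"algebra", "calculus", "statistics"}),
    frozenset({"algebra", "calculus"}),
    frozenset({"algebra", "statistics"}),
    frozenset({"calculus", "statistics"}),
    frozenset({"algebra", "calculus", "statistics"}),
]

# Rule W => B with W = {algebra, calculus} and B = statistics.
rule_support = support(rows, {"algebra", "calculus"})
rule_confidence = confidence(rows, {"algebra", "calculus"}, "statistics")

# If {calculus, statistics} was not frequent in the previous pass, the
# three-course candidate is pruned without another database scan.
pruned = candidates({
    frozenset({"algebra", "calculus"}),
    frozenset({"algebra", "statistics"}),
})
```

In this toy data the rule holds for two of the three rows that contain both algebra and calculus, and `pruned` is empty because one of the candidate's two-course subsets is missing from the previous pass.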

References


Reference	Year	Citations
Rakesh Agrawal, Tomasz Imieliński, Arun Swami. Mining association rules between sets of items in large databases.	1993	14.7K
Rakesh Agrawal, Ramakrishnan Srikant. Fast Algorithms for Mining Association Rules in Large Databases. Very Large Data Bases.	1994	9.4K
Torben Hagerup, Christine Rüb. A guided tour of Chernoff bounds. Information Processing Letters.	1990	522
Thomas J. Songer, Rashida Dorsey. High risk characteristics for motor vehicle crashes in persons with diabetes by age. PubMed.	2006	38
Patricia C. Dischinger, Shiu M. Ho, Joseph A. Kufera. Medical conditions and car crashes. PubMed.	2000	29
