Incremental and interactive sequence mining

TLDR

Discovering frequent sequences in temporal databases is an important data mining problem, but most existing methods assume a static database and must rescan all of the old and new data after every update. This paper proposes techniques for maintaining frequent sequences under database updates and under user interaction, such as changes to mining parameters. The task is challenging because updates can invalidate existing sequences or introduce new ones. In both scenarios the approach avoids re-executing the algorithm on the entire dataset, and experiments show execution time improvements of up to several orders of magnitude in practice.

Abstract

The discovery of frequent sequences in temporal databases is an important data mining problem. Most current work assumes that the database is static, and a database update requires rediscovering all the patterns by scanning the entire old and new database. In this paper, we propose novel techniques for maintaining sequences in the presence of a) database updates, and b) user interaction (e.g. modifying mining parameters). This is a very challenging task, since such updates can invalidate existing sequences or introduce new ones. In both the above scenarios, we avoid re-executing the algorithm on the entire dataset, thereby reducing execution time. Experimental results confirm that our approach results in execution time improvements of up to several orders of magnitude in practice.
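
To make the idea of avoiding a full rescan concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a fixed, hypothetical set of candidate patterns, treats an update as the arrival of whole new sequences, and uses invented names (`IncrementalSupportCounter`, `add_sequences`, `frequent`). Support counts are updated from the new sequences only, and a change of the minimum support threshold is answered from the stored counts without touching the data again.

```python
def is_subsequence(pattern, sequence):
    """Return True if the items of `pattern` occur in `sequence` in order
    (not necessarily contiguously)."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator, which enforces the ordering.
    return all(item in remaining for item in pattern)


class IncrementalSupportCounter:
    """Keeps support counts for a fixed candidate set (a simplifying assumption)."""

    def __init__(self, candidates):
        self.counts = {tuple(c): 0 for c in candidates}
        self.num_sequences = 0

    def add_sequences(self, new_sequences):
        """Database update: scan only the newly added sequences."""
        for seq in new_sequences:
            self.num_sequences += 1
            for pattern in self.counts:
                if is_subsequence(pattern, seq):
                    self.counts[pattern] += 1

    def frequent(self, min_support):
        """User interaction: re-evaluate a new threshold from stored counts."""
        threshold = min_support * self.num_sequences
        return {p for p, c in self.counts.items() if c >= threshold}


counter = IncrementalSupportCounter([("a", "b"), ("b", "c"), ("a", "c")])
counter.add_sequences([["a", "b", "c"], ["a", "c"]])   # initial database
high = counter.frequent(1.0)                            # strict threshold
low = counter.frequent(0.5)                             # relaxed threshold, no rescan
counter.add_sequences([["b", "c"]])                     # incremental update
after_update = counter.frequent(0.6)
```

The fixed candidate set is the main limitation of this sketch: an update can make a pattern frequent that was never a candidate, so a real incremental miner also needs some way to detect and count newly promising patterns.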
