Effective personalization based on association rule discovery from web usage data

TLDR

Personalizing a Web site at the earliest stage of a visit (before registration or authentication) must rely on clickstream logs. The absence of explicit ratings, combined with sparse and voluminous data, makes standard collaborative filtering difficult; clustering-based usage mining scales better but often sacrifices accuracy. The authors propose effective and scalable Web personalization techniques that apply association rule discovery to usage data in order to generate recommendations. Experiments on real usage data show that the approach improves recommendation effectiveness while retaining a computational advantage over direct collaborative filtering approaches such as k-nearest-neighbor.

Abstract

To engage visitors to a Web site at a very early stage (i.e., before registration or authentication), personalization tools must rely primarily on clickstream data captured in Web server logs. The lack of explicit user ratings, as well as the sparse nature and large volume of data in such a setting, poses serious challenges to standard collaborative filtering techniques in terms of scalability and performance. Web usage mining techniques such as clustering, which rely on offline pattern discovery from user transactions, can be used to improve the scalability of collaborative filtering; however, this often comes at the cost of reduced recommendation accuracy. In this paper, we propose effective and scalable techniques for Web personalization based on association rule discovery from usage data. Through detailed experimental evaluation on real usage data, we show that the proposed methodology can achieve better recommendation effectiveness while maintaining a computational advantage over direct approaches to collaborative filtering such as the k-nearest-neighbor strategy.
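
The abstract does not describe the mining procedure itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes sessions are lists of page URLs, mines only pairwise rules (page A implies page B) using support and confidence thresholds, and scores candidates by the best matching rule's confidence. The function names `mine_rules` and `recommend`, the thresholds, and the sample URLs are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_rules(sessions, min_support=0.5, min_confidence=0.7):
    """Mine pairwise rules page -> page from sessions (an assumed, simplified setup).

    Returns a dict mapping an antecedent page to a list of
    (consequent page, confidence) tuples.
    """
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = set(session)  # ignore repeat visits within one session
        item_counts.update(pages)
        pair_counts.update(frozenset(p) for p in combinations(sorted(pages), 2))

    rules = {}
    for pair, count in pair_counts.items():
        if count / n < min_support:
            continue  # the pair is not frequent enough
        a, b = tuple(pair)
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]
            if confidence >= min_confidence:
                rules.setdefault(lhs, []).append((rhs, confidence))
    return rules


def recommend(rules, active_session, top_n=3):
    """Recommend unseen pages whose rule antecedent appears in the active session."""
    seen = set(active_session)
    scores = {}
    for page in seen:
        for rhs, confidence in rules.get(page, []):
            if rhs not in seen:
                scores[rhs] = max(scores.get(rhs, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]


# Hypothetical clickstream sessions, for illustration only.
sessions = [
    ["/home", "/products", "/cart"],
    ["/home", "/products"],
    ["/home", "/about"],
    ["/products", "/cart"],
    ["/home", "/products", "/cart"],
]
rules = mine_rules(sessions)
print(recommend(rules, ["/home", "/products"]))
```

Mining the rules offline and looking them up against the active session at request time is what gives this style of approach its computational advantage over comparing each active session against all stored sessions, as a k-nearest-neighbor strategy would.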