Concepedia

TLDR

Many existing approaches to collaborative filtering cannot handle very large datasets nor easily accommodate users with few ratings. The authors propose Probabilistic Matrix Factorization (PMF) to address scalability and sparsity challenges in collaborative filtering. PMF is extended with an adaptive prior to automatically control model capacity and a constrained variant that groups users with similar rating sets. The extended PMF generalizes better for users with few ratings and, when combined with Restricted Boltzmann Machine predictions, achieves a 0.8861 error rate—about 7% lower than Netflix's own system.

Abstract

Many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings. In this paper we present the Probabilistic Matrix Factorization (PMF) model which scales linearly with the number of observations and, more importantly, performs well on the large, sparse, and very imbalanced Netflix dataset. We further extend the PMF model to include an adaptive prior on the model parameters and show how the model capacity can be controlled automatically. Finally, we introduce a constrained version of the PMF model that is based on the assumption that users who have rated similar sets of movies are likely to have similar preferences. The resulting model is able to generalize considerably better for users with very few ratings. When the predictions of multiple PMF models are linearly combined with the predictions of Restricted Boltzmann Machines models, we achieve an error rate of 0.8861, that is nearly 7% better than the score of Netflix's own system.

References

YearCitations

1999

3.7K

2013

2.1K

2007

1.9K

2005

973

2004

954

2003

697

1992

606

2004

436

2003

267

2004

87

Page 1