Publication | Closed Access
Probabilistic Matrix Factorization
Citations: 3.6K
References: 11
Year: 2007
Engineering, Machine Learning, Imbalanced Netflix Dataset, Text Mining, PMF Model, Information Retrieval, Data Science, Data Mining, Probabilistic Matrix Factorization, Statistics, Low-rank Approximation, Predictive Analytics, Knowledge Discovery, Computer Science, Cold-start Problem, Adaptive Prior, Information Filtering System, Group Recommenders, Matrix Factorization, Collaborative Filtering
Many existing approaches to collaborative filtering can neither handle very large datasets nor easily accommodate users with few ratings. The authors propose Probabilistic Matrix Factorization (PMF), which scales linearly with the number of observations, to address these scalability and sparsity challenges. PMF is extended with an adaptive prior that controls model capacity automatically, and with a constrained variant built on the assumption that users who have rated similar sets of movies have similar preferences. The constrained model generalizes considerably better for users with few ratings. When the predictions of multiple PMF models are linearly combined with those of Restricted Boltzmann Machine models, the result is an error rate of 0.8861, nearly 7% better than the score of Netflix's own system.
Many existing approaches to collaborative filtering can neither handle very large datasets nor easily deal with users who have very few ratings. In this paper we present the Probabilistic Matrix Factorization (PMF) model, which scales linearly with the number of observations and, more importantly, performs well on the large, sparse, and very imbalanced Netflix dataset. We further extend the PMF model to include an adaptive prior on the model parameters and show how the model capacity can be controlled automatically. Finally, we introduce a constrained version of the PMF model that is based on the assumption that users who have rated similar sets of movies are likely to have similar preferences. The resulting model is able to generalize considerably better for users with very few ratings. When the predictions of multiple PMF models are linearly combined with the predictions of Restricted Boltzmann Machine models, we achieve an error rate of 0.8861, which is nearly 7% better than the score of Netflix's own system.
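To make the idea of factorizing only the observed ratings concrete, the following is a minimal sketch, not the authors' implementation. It relies on a standard fact: with zero-mean Gaussian priors on the user and item factors and Gaussian observation noise, the MAP estimate minimizes squared error on the observed entries plus L2 penalties on the factors. Each pass touches every observed rating once, so the cost per epoch is linear in the number of observations. The function names (`fit_pmf`, `rmse`), the mean-centering step, and all hyperparameter values are assumptions chosen for illustration. The adaptive prior and the constrained variant described in the abstract are not covered.

```python
import numpy as np


def fit_pmf(ratings, n_users, n_items, rank=5, lam=0.05, lr=0.05,
            epochs=200, seed=0):
    """Fit user and item factors by SGD on L2-regularized squared error.

    ratings: list of (user, item, rating) triples, observed entries only.
    lam plays the role of a fixed prior strength (an assumption here;
    it is not learned as in the adaptive-prior extension).
    """
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, rank))  # user factors
    V = 0.1 * rng.standard_normal((n_items, rank))  # item factors
    mean = float(np.mean([r for _, _, r in ratings]))  # global offset

    for _ in range(epochs):
        # One pass over the observed ratings in random order.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = (r - mean) - U[u] @ V[i]
            u_old = U[u].copy()  # use the pre-update user vector for V
            U[u] += lr * (err * V[i] - lam * U[u])
            V[i] += lr * (err * u_old - lam * V[i])
    return U, V, mean


def rmse(ratings, U, V, mean):
    """Root mean squared error of the predictions on the given triples."""
    errs = [r - (mean + U[u] @ V[i]) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))
```

A call such as `U, V, mean = fit_pmf(triples, n_users, n_items)` returns the factors, and `mean + U[u] @ V[i]` gives the predicted rating for user `u` and item `i`. The plain linear predictor is a simplification chosen here to keep the sketch short.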