Publication | Closed Access
Large-Scale Bayesian Logistic Regression for Text Categorization
Citations: 815
References: 49
Year: 2007
Logistic regression analysis of high‑dimensional data, such as natural language text, poses computational and statistical challenges, and maximum likelihood estimation often fails in these applications. The authors propose a simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and yields sparse predictive models for text data. They describe a model‑fitting algorithm and provide open‑source implementations (BBR and BMR) for the Bayesian logistic regression approach. Applied to various document classification tasks, the method produces compact predictive models that are at least as effective as support vector machine classifiers or ridge logistic regression with feature selection. Keywords: information retrieval, Lasso, ridge regression, support vector classifier, variable selection.
Abstract
Logistic regression analysis of high-dimensional data, such as natural language text, poses computational and statistical challenges. Maximum likelihood estimation often fails in these applications. We present a simple Bayesian logistic regression approach that uses a Laplace prior to avoid overfitting and produces sparse predictive models for text data. We apply this approach to a range of document classification problems and show that it produces compact predictive models at least as effective as those produced by support vector machine classifiers or ridge logistic regression combined with feature selection. We describe our model fitting algorithm, our open source implementations (BBR and BMR), and experimental results.
Key words: Information retrieval; Lasso; Penalization; Ridge regression; Support vector classifier; Variable selection
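To illustrate why a Laplace prior yields sparse models: the posterior mode under independent zero-mean Laplace priors on the coefficients coincides with the minimizer of the logistic log-loss plus an L1 (lasso) penalty. The sketch below is a toy illustration under stated assumptions, not the authors' fitting algorithm and not the BBR or BMR software. It swaps in plain proximal gradient descent (ISTA) as the optimizer, and every name in it (`fit_l1_logistic`, `soft_threshold`, `make_toy_data`, `lam`, `step`) is hypothetical.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within [-threshold, threshold] becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam, step=0.5, iterations=300):
    """Minimize mean log-loss + (lam / n) * sum(|w_j|) by proximal gradient.

    Assumption: this objective stands in for the MAP estimate under a
    Laplace prior; lam plays the role of the prior's scale parameter.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)))
            for j in range(d):
                grad[j] += (p - yi) * xi[j]
        # Gradient step on the smooth log-loss, then the L1 proximal step.
        w = [
            soft_threshold(wj - step * gj / n, step * lam / n)
            for wj, gj in zip(w, grad)
        ]
    return w


def make_toy_data(n=400, d=6, seed=0):
    """Synthetic data: only features 0 and 1 influence the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if 2.0 * row[0] - 2.0 * row[1] > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    weights = fit_l1_logistic(X, y, lam=40.0)
    for j, wj in enumerate(weights):
        print(f"w[{j}] = {wj:+.4f}")
```

On data like this, the soft-thresholding step typically leaves the coefficients of the uninformative features at exactly zero, while the two informative features keep nonzero weights. That is the sense in which the fitted model is "compact". A ridge (Gaussian-prior) penalty would shrink coefficients without zeroing them, which is why the abstract pairs ridge logistic regression with a separate feature selection step.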