A Comparison of Logistic Regression Pseudo R² Indices

Abstract

occurrence for some binary outcome, using one or more continuous or categorical variables as predictors. When the outcome is expressed as the log-odds of the event’s occurrence, the logistic regression equation is a linear combination of the predictors. The regression parameters are typically obtained using maximum likelihood estimation, and each regression weight indicates the change in the log-odds of the event’s occurrence per unit change in its associated predictor. Adequacy of fit for a logistic regression model is typically assessed using (1) the omnibus chi-square test of the model coefficients, which evaluates the decrease in deviance (i.e., −2 times the log-likelihood) when the model containing the full set of predictors is compared to the model containing only the intercept term, and determines whether the former significantly improves prediction over the latter; and (2) the Hosmer-Lemeshow goodness-of-fit test (Hosmer & Lemeshow, 2000), which groups cases into deciles based on their predicted probabilities, then compares the observed frequencies to the expected frequencies using a chi-square goodness-of-fit test, where a non-significant result suggests a well-fitting model. Additionally, when examining individual predictors, the adjusted odds ratio (i.e., the exponentiated regression coefficient) associated with each predictor can be evaluated as an effect size. When the predicted probabilities resulting from logistic regression are used for classification purposes, additional indices of model fit are often employed. Simple proportions of correctly classified cases, both for the overall sample and for each of the groups in the sample, provide one such index.
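The quantities described above can be illustrated with a minimal sketch. It assumes a single predictor and a small made-up data set that is not taken from the study; the helper names (`fit_logistic`, `predict`, `log_likelihood`) are hypothetical, and the Newton-Raphson fit is only one of several ways to obtain the maximum likelihood estimates. The sketch computes the omnibus likelihood-ratio chi-square (the drop in deviance from the intercept-only model), the odds ratio for the predictor, and the overall proportion of correctly classified cases at a 0.5 cutoff.

```python
import math

# Hypothetical toy data (not from the study): one predictor, one binary outcome.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def predict(b0, b1, xs):
    """Predicted probabilities from the model logit(p) = b0 + b1 * x."""
    return [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in xs]


def fit_logistic(xs, ys, n_iter=25):
    """Maximum likelihood estimates of (b0, b1) via Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # Accumulate the score vector (g) and the 2x2 information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, p in zip(xs, ys, predict(b0, b1, xs)):
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix inverse) times (score vector).
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(probs, ys):
    """Bernoulli log-likelihood of the outcomes given predicted probabilities."""
    return sum(math.log(p if yi == 1 else 1.0 - p) for p, yi in zip(probs, ys))


b0, b1 = fit_logistic(x, y)
fitted = predict(b0, b1, x)

# Full model versus intercept-only model (which predicts the base rate).
ll_full = log_likelihood(fitted, y)
base_rate = sum(y) / len(y)
ll_null = log_likelihood([base_rate] * len(y), y)

# Omnibus chi-square: the decrease in deviance, with 1 degree of freedom here.
g_stat = 2.0 * (ll_full - ll_null)
# Upper-tail chi-square probability; this closed form holds for df = 1 only.
p_value = math.erfc(math.sqrt(g_stat / 2.0))

# Effect size for the predictor: the exponentiated regression coefficient.
odds_ratio = math.exp(b1)

# Proportion of cases classified correctly at a 0.5 probability cutoff.
accuracy = sum((p >= 0.5) == (yi == 1) for p, yi in zip(fitted, y)) / len(y)

print(f"LR chi-square = {g_stat:.3f}, p = {p_value:.4f}")
print(f"odds ratio = {odds_ratio:.3f}, accuracy = {accuracy:.2f}")
```

With more than one predictor, the degrees of freedom of the omnibus test equal the number of predictors, so the closed-form p-value used here would need to be replaced by a general chi-square tail probability.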
Also, a receiver operating characteristic (ROC) curve, which graphically represents “true positive” and “false positive” classification rates as a function of the classification cutoff applied to the predicted probabilities from the logistic regression, provides another such index. In addition, a number of goodness-of-fit indices exist to assess the predictive capacity of the logistic regression model. These “pseudo R²” indices are intended as logistic regression analogs of R² as used in ordinary least-squares (OLS) regression. One such index, outlined by Maddala
