Publication | Open Access

Importance of events per independent variable in proportional hazards regression analysis II. Accuracy and precision of regression estimates

Citations: 2.1K | References: 4 | Year: 1995

TLDR

The study ran Monte Carlo simulations of a proportional hazards model based on a randomized trial of 673 patients with 252 deaths and seven predictors, conducting 500 simulated analyses at events per variable (EPV) of 2, 5, 10, 15, 20, and 25 and comparing the simulated results with the original ones. As EPV decreased, the regression coefficients became more biased, the 90% confidence limits lost their nominal coverage, large-sample variance estimates stopped holding, and the Z-tests lost validity under the null hypothesis. The authors regard EPV = 10 as the most prudent threshold, while noting that a single boundary is not easy to choose; results below it warrant cautious interpretation.

Abstract

The analytical effect of the number of events per variable (EPV) in a proportional hazards regression analysis was evaluated using Monte Carlo simulation techniques for data from a randomized trial containing 673 patients and 252 deaths, in which seven predictor variables had an original significance level of p < 0.10. The 252 deaths and 7 variables correspond to 36 events per variable analyzed in the full data set. Five hundred simulated analyses were conducted for these seven variables at EPVs of 2, 5, 10, 15, 20, and 25. For each simulation, a random exponential survival time was generated for each of the 673 patients, and the simulated results were compared with their original counterparts. As EPV decreased, the regression coefficients became more biased relative to the true value; the 90% confidence limits about the simulated values did not have a coverage of 90% for the original value; large sample properties did not hold for variance estimates from the proportional hazards model, and the Z statistics used to test the significance of the regression coefficients lost validity under the null hypothesis. Although a single boundary level for avoiding problems is not easy to choose, the value of EPV = 10 seems most prudent. Below this value for EPV, the results of proportional hazards regression analyses should be interpreted with caution because the statistical model may not be valid.
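The EPV arithmetic in the abstract can be checked with a short sketch. The function names `events_per_variable` and `events_needed` are hypothetical helpers introduced here for illustration; the abstract does not describe how the simulated data sets were brought to each target EPV, so the event counts below are only what each EPV implies for seven predictors.

```python
def events_per_variable(n_events: int, n_variables: int) -> float:
    """Return events per variable (EPV): outcome events divided by predictors."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return n_events / n_variables


def events_needed(target_epv: int, n_variables: int) -> int:
    """Return the number of events implied by a target EPV."""
    return target_epv * n_variables


# Figures taken from the abstract: 252 deaths, 7 predictor variables.
N_DEATHS = 252
N_PREDICTORS = 7

full_epv = events_per_variable(N_DEATHS, N_PREDICTORS)

# EPV levels examined in the simulations, and the event counts they imply.
target_epvs = [2, 5, 10, 15, 20, 25]
implied_events = {epv: events_needed(epv, N_PREDICTORS) for epv in target_epvs}

print(f"Full data set EPV: {full_epv:.0f}")
for epv, events in implied_events.items():
    print(f"EPV {epv:>2}: {events} events for {N_PREDICTORS} predictors")
```

Under the suggested threshold of EPV = 10, a model with seven predictors would call for at least 70 events.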
