
Publication | Closed Access

Statistical Problems in the Reporting of Clinical Trials

Citations: 669

References: 21

Year: 1987

TLDR

Clinical trial reports contain extensive comparative data, but interpretation is hampered by overuse of significance testing and a lack of pre‑specified trial size or interim stopping rules. The study advocates for clearer predefined policies, limiting primary comparisons, and emphasizing effect sizes and confidence intervals over arbitrary significance thresholds. The authors reviewed 45 comparative trial reports from top journals, examining issues such as multiple endpoints, repeated measures, subgroup analyses, multiple treatments, and the total number of significance tests. Trial summaries tend to highlight statistically significant endpoints, leading to a bias that exaggerates treatment differences.

Abstract

Reports of clinical trials often contain a wealth of data comparing treatments. This can lead to problems in interpretation, particularly when significance testing is used extensively. We examined 45 reports of comparative trials published in the British Medical Journal, the Lancet, or the New England Journal of Medicine to illustrate these statistical problems. The issues we considered included the analysis of multiple end points, the analysis of repeated measurements over time, subgroup analyses, trials of multiple treatments, and the overall number of significance tests in a trial report. Interpretation of large amounts of data is complicated by the common failure to specify in advance the intended size of a trial or statistical stopping rules for interim analyses. In addition, summaries or abstracts of trials tend to emphasize the more statistically significant end points. Overall, the reporting of clinical trials appears to be biased toward an exaggeration of treatment differences. Trials should have a clearer predefined policy for data analysis and reporting. In particular, a limited number of primary treatment comparisons should be specified in advance. The overuse of arbitrary significance levels (for example, P < 0.05) is detrimental to good scientific reporting, and more emphasis should be given to the magnitude of treatment differences and to estimation methods such as confidence intervals.
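
To make the concern about the overall number of significance tests concrete, the following Python sketch shows how the chance of at least one spurious "significant" result grows with the number of tests, and how a treatment difference can be reported as an estimate with a confidence interval instead. It is an illustration under stated assumptions, not the authors' method: the tests are assumed independent (end points within one trial usually are not), the interval is a simple Wald approximation, and the function names and the example counts are hypothetical.

```python
import math


def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among `num_tests`
    independent tests, each run at level `alpha`, when no true
    treatment difference exists. Independence is an assumption."""
    return 1.0 - (1.0 - alpha) ** num_tests


def risk_difference_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in event proportions (A minus B) with an approximate
    95% Wald confidence interval. Returns (difference, lower, upper)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


if __name__ == "__main__":
    # How quickly false positives accumulate as more tests are reported.
    for k in (1, 5, 10, 20):
        print(f"{k:2d} tests: {familywise_error_rate(k):.3f}")

    # Hypothetical counts: 30/100 events on treatment A, 20/100 on B.
    diff, lower, upper = risk_difference_ci(30, 100, 20, 100)
    print(f"difference = {diff:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

Under the independence assumption, ten tests at the 0.05 level already give roughly a 40% chance of at least one nominally significant result by chance alone, which is one reason to specify a limited number of primary comparisons in advance. The interval for the hypothetical counts spans zero while still conveying the plausible size of the difference, which a bare P value does not.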
