Concepedia

Publication | Open Access

How to get statistically significant effects in any ERP experiment (and why you shouldn't)

Citations: 1.2K | References: 28 | Year: 2016

TLDR

ERP experiments produce thousands of data points per participant, and while this richness enables sophisticated hypothesis testing, it also creates many opportunities for statistically significant but spurious effects. The paper shows that routine, seemingly innocuous methods for quantifying and analyzing ERP data can yield very high rates of false positives, with the chance of at least one bogus effect exceeding 50% in many studies. Using reanalyses of prior data and simulations of typical designs, the authors demonstrate how selecting time windows and electrode sites from grand-averaged data and employing multifactor statistical analyses inflate false positives, and they propose strategies for avoiding these problems and improving the validity of significant findings.
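The "exceeding 50%" claim follows directly from the standard familywise error formula: with k independent tests each run at alpha = .05, the probability of at least one false positive is 1 - (1 - alpha)^k. A minimal sketch (an illustration of that formula, not the paper's exact analysis):

```python
# Familywise false-positive probability for k independent tests,
# each at the conventional alpha = .05. This is a textbook
# illustration, not a reanalysis from the paper itself.
alpha = 0.05
fwer = {k: 1 - (1 - alpha) ** k for k in (1, 5, 10, 14, 20)}
for k, p in fwer.items():
    print(f"{k:2d} comparisons -> P(at least one bogus effect) = {p:.2f}")
```

With 14 comparisons the familywise rate already passes 0.51, which is how "many experiments" end up with better-than-even odds of at least one bogus effect.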

Abstract

ERP experiments generate massive datasets, often containing thousands of values for each participant, even after averaging. The richness of these datasets can be very useful in testing sophisticated hypotheses, but this richness also creates many opportunities to obtain effects that are statistically significant but do not reflect true differences among groups or conditions (bogus effects). The purpose of this paper is to demonstrate how common and seemingly innocuous methods for quantifying and analyzing ERP effects can lead to very high rates of significant but bogus effects, with the likelihood of obtaining at least one such bogus effect exceeding 50% in many experiments. We focus on two specific problems: using the grand-averaged data to select the time windows and electrode sites for quantifying component amplitudes and latencies, and using one or more multifactor statistical analyses. Reanalyses of prior data and simulations of typical experimental designs are used to show how these problems can greatly increase the likelihood of significant but bogus results. Several strategies are described for avoiding these problems and for increasing the likelihood that significant effects actually reflect true differences among groups or conditions.
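The first problem the abstract names, selecting measurement windows from the grand average, can be illustrated with a small Monte Carlo sketch: generate pure-noise condition differences, pick the time point where the grand-average difference happens to be largest, and run a t-test only there. All sample sizes and parameters below are arbitrary illustrations, not values from the paper.

```python
import math
import random
import statistics

random.seed(1)
N_SIMS, N_SUBJ, N_TIME = 500, 20, 50
T_CRIT = 2.093  # two-tailed .05 critical t for df = 19
hits = 0

for _ in range(N_SIMS):
    # Null data: per-subject condition differences with true mean 0
    diff = [[random.gauss(0, 1) for _ in range(N_TIME)]
            for _ in range(N_SUBJ)]
    # Biased step: choose the time point where the grand-average
    # difference is largest, then test only that point
    grand = [statistics.fmean(diff[s][t] for s in range(N_SUBJ))
             for t in range(N_TIME)]
    peak = max(range(N_TIME), key=lambda t: abs(grand[t]))
    vals = [diff[s][peak] for s in range(N_SUBJ)]
    t_stat = statistics.fmean(vals) / (statistics.stdev(vals) / math.sqrt(N_SUBJ))
    if abs(t_stat) > T_CRIT:
        hits += 1

print(f"false-positive rate at the peak window: {hits / N_SIMS:.2f}")
```

Because the same noise that produced the "peak" also drives the test statistic, the observed false-positive rate lands far above the nominal .05, which is the circularity the paper warns about.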
