Detecting selection along environmental gradients: analysis of eight methods and their effectiveness for outbreeding and selfing populations

TLDR

Genome-scale diversity data enable detailed studies of how species adapt to environmental gradients, but such analyses are sensitive to undocumented demographic effects such as migration patterns and reproductive regimes. The study provides guidelines for using popular and recently developed statistical methods to detect footprints of selection. The authors simulated 100 populations along a selective gradient, varying the migration model, the sampling scheme, and the rate of self-fertilization, and evaluated the power and robustness of eight methods: three that detect genotype–environment correlations and five that detect adaptive differentiation. Genotype–environment correlation methods had substantially more power than differentiation-based methods but generally suffered from high false-positive rates, especially when allele frequencies were correlated between or within populations. When the underlying genetic structure is unknown, robust methods are preferable, and in the simulated scenario sampling many populations gave better results than sampling many individuals per population. Genotype–environment correlation methods that do not correct for allele-frequency autocorrelation should be used with care because they risk producing spurious signals.

Abstract

Thanks to genome-scale diversity data, present-day studies can provide a detailed view of how natural and cultivated species adapt to their environment and particularly to environmental gradients. However, due to their sensitivity, up-to-date studies might be more affected by undocumented demographic effects such as the pattern of migration and the reproduction regime. In this study, we provide guidelines for the use of popular or recently developed statistical methods to detect footprints of selection. We simulated 100 populations along a selective gradient and explored different migration models, sampling schemes and rates of self-fertilization. We investigated the power and robustness of eight methods to detect loci potentially under selection: three designed to detect genotype–environment correlations and five designed to detect adaptive differentiation (based on F_ST or similar measures). We show that genotype–environment correlation methods have substantially more power to detect selection than differentiation-based methods but that they generally suffer from high rates of false positives. This effect is exacerbated whenever allele frequencies are correlated, either between populations or within populations. Our results suggest that, when the underlying genetic structure of the data is unknown, a number of robust methods are preferable. Moreover, in the simulated scenario we used, sampling many populations led to better results than sampling many individuals per population. Finally, care should be taken when using methods to identify genotype–environment correlations without correcting for allele frequency autocorrelation because of the risk of spurious signals due to allele frequency correlations between populations.
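
The warning about allele frequency autocorrelation can be illustrated with a toy simulation. The sketch below is not the authors' simulation framework and does not implement any of the eight methods evaluated; it is a minimal illustration under stated assumptions. It draws neutral allele frequencies for 100 populations ordered along a linear gradient in two ways: independently in each population, or as a random walk between neighbouring populations (a crude, hypothetical stand-in for stepping-stone-like migration). It then applies a naive, uncorrected Pearson correlation test against the environmental variable. All function names, noise levels, and the number of loci are illustrative choices, not values from the study.

```python
import math
import random

N_POPS = 100      # populations along the gradient, as in the study design
T_CRIT = 1.984    # two-sided 5% critical value of Student's t with 98 df
# Critical |r| for a naive Pearson test; valid only for N_POPS = 100.
R_CRIT = T_CRIT / math.sqrt(N_POPS - 2 + T_CRIT ** 2)


def pearson(x, y):
    """Plain Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def clip(p):
    """Keep an allele frequency inside [0, 1]."""
    return min(1.0, max(0.0, p))


def neutral_independent(rng):
    """Neutral locus, frequencies drawn independently in each population (assumed noise sd 0.1)."""
    return [clip(0.5 + rng.gauss(0, 0.1)) for _ in range(N_POPS)]


def neutral_autocorrelated(rng):
    """Neutral locus, each population's frequency drifts from its neighbour's (assumed step sd 0.02)."""
    freqs = [0.5]
    for _ in range(N_POPS - 1):
        freqs.append(clip(freqs[-1] + rng.gauss(0, 0.02)))
    return freqs


def false_positive_rate(simulate, n_loci=500, seed=1):
    """Share of neutral loci flagged by the uncorrected correlation test."""
    rng = random.Random(seed)
    env = list(range(N_POPS))  # environmental variable increases linearly along the gradient
    hits = sum(abs(pearson(simulate(rng), env)) > R_CRIT for _ in range(n_loci))
    return hits / n_loci


fp_independent = false_positive_rate(neutral_independent)
fp_autocorrelated = false_positive_rate(neutral_autocorrelated)
print(f"false positives, independent frequencies:    {fp_independent:.2f}")
print(f"false positives, autocorrelated frequencies: {fp_autocorrelated:.2f}")
```

No locus in this toy example is under selection, so every flagged locus is a false positive. With independent frequencies the naive test should stay near its nominal 5% level, whereas frequencies that are autocorrelated between neighbouring populations tend to be flagged far more often, which is the kind of spurious signal the abstract cautions against.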
