The Next Generation of Molecular Markers From Massively Parallel Sequencing of Pooled DNA Samples

TLDR

Next-generation sequencing (NGS) is poised to transform genetic analysis, but the high sequence coverage needed for individual genomes keeps population-scale studies too costly, especially in nonmodel organisms. The authors show that sequencing pools of individuals is often more effective for SNP discovery and gives more accurate allele-frequency estimates, even when sequencing errors are taken into account. They modify the population-genetic estimators Tajima's π and Watterson's θ to obtain unbiased estimates from pooled NGS data. Given the same sequencing effort, the modified estimators often perform better than those obtained from individual sequencing. Pooling is not preferable under all circumstances, but it offers a cost-effective way to estimate allele frequencies genome-wide.

Abstract

Next generation sequencing (NGS) is about to revolutionize genetic analysis. Currently NGS techniques are mainly used to sequence individual genomes. Due to the high sequence coverage required, the costs for population-scale analyses are still too high to allow an extension to nonmodel organisms. Here, we show that NGS of pools of individuals is often more effective in SNP discovery and provides more accurate allele frequency estimates, even when taking sequencing errors into account. We modify the population genetic estimators Tajima's π and Watterson's θ to obtain unbiased estimates from NGS pooling data. Given the same sequencing effort, the resulting estimators often show a better performance than those obtained from individual sequencing. Although our analysis also shows that NGS of pools of individuals will not be preferable under all circumstances, it provides a cost-effective approach to estimate allele frequencies on a genome-wide scale.
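
For orientation, the two estimators named in the abstract can be sketched in their classical textbook form, as they apply to n individually sequenced chromosomes. This is not the pool-corrected version the paper derives: it ignores sequencing errors and the sampling of reads from a pool. The function names and the input layout (one allele count per segregating site) are assumptions made for illustration.

```python
from typing import Sequence


def watterson_a(n: int) -> float:
    """Normalizing constant a_n = sum_{i=1}^{n-1} 1/i for a sample of n sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(num_segregating_sites: int, n: int) -> float:
    """Classical Watterson estimator: number of segregating sites divided by a_n."""
    return num_segregating_sites / watterson_a(n)


def tajima_pi(allele_counts: Sequence[int], n: int) -> float:
    """Classical Tajima's pi: average number of pairwise differences.

    allele_counts holds, for each segregating site, how many of the n
    sequences carry one of the two alleles (hypothetical input layout).
    """
    num_pairs = n * (n - 1) / 2
    # At a biallelic site with count k, exactly k * (n - k) pairs differ.
    return sum(k * (n - k) for k in allele_counts) / num_pairs


if __name__ == "__main__":
    counts = [1, 2, 1]  # three segregating sites in a sample of n = 4
    print(watterson_theta(len(counts), 4))
    print(tajima_pi(counts, 4))
```

With n = 4 and allele counts 1, 2, and 1, the constant a_n is 11/6, so Watterson's θ is 18/11 (about 1.64) and Tajima's π is 10/6 (about 1.67). Both values are totals for the region, not per-site rates.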
