How to track and assess genotyping errors in population genetics studies

TLDR

Genotyping errors, which arise when the genotype determined in the laboratory differs from the individual's true genotype, are common yet underreported in population genetics and can bias conclusions, especially in studies based on individual identification. The authors analyze four case studies that differ in sampling strategy, organism, and marker type (microsatellites or AFLPs), and outline a protocol to limit and quantify errors at each genotyping step: precautions against contamination and technical artefacts, blind samples and automation, rigorous laboratory work and scoring, and systematic reporting of the error rate. Estimated error rates ranged from 0.8% to 2.6%. The main sources were allelic dropouts for microsatellites and differences in peak intensity for AFLPs, with human factors also contributing in both cases.

Abstract

Genotyping errors occur when the genotype determined after molecular analysis does not correspond to the real genotype of the individual under consideration. Virtually every genetic data set includes some erroneous genotypes, but genotyping errors remain a taboo subject in population genetics, even though they might greatly bias the final conclusions, especially for studies based on individual identification. Here, we consider four case studies representing a large variety of population genetics investigations differing in their sampling strategies (noninvasive or traditional), in the type of organism studied (plant or animal) and the molecular markers used [microsatellites or amplified fragment length polymorphisms (AFLPs)]. In these data sets, the estimated genotyping error rate ranges from 0.8% for microsatellite loci from bear tissues to 2.6% for AFLP loci from dwarf birch leaves. Main sources of errors were allelic dropouts for microsatellites and differences in peak intensities for AFLPs, but in both cases human factors were non‐negligible error generators. Therefore, tracking genotyping errors and identifying their causes are necessary to clean up the data sets and validate the final results according to the precision required. In addition, we propose the outline of a protocol designed to limit and quantify genotyping errors at each step of the genotyping process. In particular, we recommend (i) several efficient precautions to prevent contaminations and technical artefacts; (ii) systematic use of blind samples and automation; (iii) experience and rigor for laboratory work and scoring; and (iv) systematic reporting of the error rate in population genetics studies.
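To make the idea of quantifying errors from blind samples concrete, here is a minimal sketch of one way to estimate an error rate per allele from duplicated genotypes. The abstract does not state the exact formula the authors used, so the definition below (mismatched alleles divided by alleles compared) is an assumption, as are the function names and the example microsatellite genotypes, which are hypothetical and not data from the study.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# A diploid microsatellite genotype as two allele sizes; None marks a missing call.
Genotype = Optional[Tuple[int, int]]


def allele_mismatches(first: Tuple[int, int], second: Tuple[int, int]) -> int:
    """Count alleles that differ between two calls of the same sample (0, 1, or 2).

    Alleles are compared as unordered multisets, so (120, 124) equals (124, 120).
    """
    shared = sum((Counter(first) & Counter(second)).values())
    return 2 - shared


def error_rate_per_allele(
    original: Sequence[Genotype], replicate: Sequence[Genotype]
) -> float:
    """Assumed definition: mismatched alleles / alleles compared across replicates."""
    compared = 0
    mismatched = 0
    for first, second in zip(original, replicate):
        if first is None or second is None:
            continue  # a missing call cannot be compared, so it is skipped
        compared += 2
        mismatched += allele_mismatches(first, second)
    if compared == 0:
        raise ValueError("no replicated genotypes available for comparison")
    return mismatched / compared


# Hypothetical calls for one locus: first pass versus blind replicate.
first_pass = [(120, 124), (118, 118), (130, 134), None]
blind_rerun = [(120, 124), (118, 122), (130, 130), (110, 112)]
rate = error_rate_per_allele(first_pass, blind_rerun)
```

In this toy input the third sample, scored (130, 134) and then (130, 130), has the pattern expected from an allelic dropout. A mismatch only shows that the two calls disagree; deciding which call is wrong requires further replication or a reference genotype.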
