Estimating the probability of identity among genotypes in natural populations: cautions and guidelines

TLDR

DNA fingerprinting is becoming essential in conservation genetics, yet probability‑of‑identity calculations often assume random allele associations, which can be inaccurate due to population substructure. The study aimed to assess the accuracy of probability‑of‑identity estimates and to develop a conservative sib‑based estimator to mitigate bias. Researchers compared observed and expected P(ID) using large microsatellite datasets from grey wolves, brown bears, and Australian northern hairy‑nosed wombats, and proposed an equation for sibs while offering guidelines for marker numbers. Theoretical P(ID) values were consistently lower than observed ones, sometimes by up to three orders of magnitude, and the sib‑based estimator provides a conservative upper bound for identical multilocus genotypes.

Abstract

Individual identification using DNA fingerprinting methods is emerging as a critical tool in conservation genetics and molecular ecology. Statistical methods that estimate the probability of sampling identical genotypes using theoretical equations generally assume random associations between alleles within and among loci. These calculations are probably inaccurate for many animal and plant populations because of population substructure. We evaluated the accuracy of probability of identity (P(ID)) estimation by comparing the observed and expected P(ID), using large nuclear DNA microsatellite data sets from three endangered species: the grey wolf (Canis lupus), the brown bear (Ursus arctos), and the Australian northern hairy‐nosed wombat (Lasiorhinus krefftii). The theoretical estimates of P(ID) were consistently lower than the observed P(ID), and the two can differ by as much as three orders of magnitude. To help researchers and managers avoid potential problems associated with this bias, we introduce an equation for P(ID) between sibs. This equation provides an estimator that can be used as a conservative upper bound for the probability of observing identical multilocus genotypes between two individuals sampled from a population. We suggest computing the actual observed P(ID) when possible and give general guidelines for the number of codominant and dominant marker loci required to achieve a reasonably low P(ID) (e.g. 0.01–0.0001).
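
The gap between the random-mating estimate and the sib-based bound can be illustrated with a short sketch. The abstract does not reproduce the equations, so the code below assumes the commonly cited single-locus forms for codominant markers: P(ID) = Σ p_i^4 + Σ_{i<j} (2 p_i p_j)^2 under random mating, and P(ID)sib = 0.25 + 0.5 Σ p_i^2 + 0.5 (Σ p_i^2)^2 − 0.25 Σ p_i^4 for full sibs. It also assumes independent loci, so multilocus values are products. The function names and the allele frequencies are hypothetical and are not taken from the wolf, bear, or wombat data.

```python
from math import ceil, log, prod


def pid_random(freqs):
    """Single-locus P(ID) assuming random association of alleles.

    Uses the identity sum(p^4) + sum_{i<j}(2*p_i*p_j)^2 == 2*(sum p^2)^2 - sum p^4.
    """
    s2 = sum(p * p for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 2 * s2 ** 2 - s4


def pid_sib(freqs):
    """Single-locus P(ID) between full sibs (assumed standard form)."""
    s2 = sum(p * p for p in freqs)
    s4 = sum(p ** 4 for p in freqs)
    return 0.25 + 0.5 * s2 + 0.5 * s2 ** 2 - 0.25 * s4


def multilocus(per_locus_values):
    """Multilocus P(ID) as a product, assuming independent loci."""
    return prod(per_locus_values)


def loci_needed(single_locus_pid, target):
    """Smallest number of equally informative loci with product <= target."""
    return ceil(log(target) / log(single_locus_pid))


# Hypothetical locus with four equally frequent alleles.
locus = [0.25, 0.25, 0.25, 0.25]
print("P(ID) random:", pid_random(locus))
print("P(ID) sib:   ", pid_sib(locus))
print("loci for 0.01 (random):", loci_needed(pid_random(locus), 0.01))
print("loci for 0.01 (sib):   ", loci_needed(pid_sib(locus), 0.01))
```

Because the sib value per locus is much larger than the random-mating value, the number of loci needed to reach a given threshold rises accordingly, which is why the sib-based estimator is useful as a conservative bound when planning marker panels.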
