Publication | Closed Access

ABGD, Automatic Barcode Gap Discovery for primary species delimitation

Citations: 3.2K

References: 67

Year: 2011

TLDR

DNA barcodes, short DNA sequences present in a wide range of species, enable assignment of organisms into species within uncharacterized groups. The authors propose an automatic procedure that sorts sequences into hypothetical species by detecting the barcode gap, which appears whenever within-species divergence is smaller than between-species divergence. ABGD estimates a model-based one-sided confidence limit for intraspecific divergence from a prior range, identifies the first significant gap beyond this limit to partition the data, and recursively applies both steps to the resulting groups until no further split occurs. The authors also explore the method's theoretical limitations by simulating explicit speciation and population genetics scenarios. On six published metazoan data sets, ABGD was computationally efficient and performed well for standard prior maximum intraspecific divergences (a few per cent), except for one data set with fewer than three sequences sampled per species. The method is sensitive to recent speciation events. Overall, it offers a fast, simple preliminary species delimitation that should be complemented with other evidence.

Abstract

Within uncharacterized groups, DNA barcodes, short DNA sequences that are present in a wide range of species, can be used to assign organisms into species. We propose an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap, which can be observed whenever the divergence among organisms belonging to the same species is smaller than divergence among organisms from different species. We use a range of prior intraspecific divergence to infer from the data a model‐based one‐sided confidence limit for intraspecific divergence. The method, called Automatic Barcode Gap Discovery (ABGD), then detects the barcode gap as the first significant gap beyond this limit and uses it to partition the data. Inference of the limit and gap detection are then recursively applied to previously obtained groups to get finer partitions until there is no further partitioning. Using six published data sets of metazoans, we show that ABGD is computationally efficient and performs well for standard prior maximum intraspecific divergences (a few per cent of divergence for five of the data sets), except for one data set where fewer than three sequences per species were sampled. We further explore the theoretical limitations of ABGD through simulation of explicit speciation and population genetics scenarios. Our results emphasize in particular the sensitivity of the method to the presence of recent speciation events, via (unrealistically) high rates of speciation or large numbers of species. In conclusion, ABGD is a fast, simple method to split a sequence alignment data set into candidate species that should be complemented with other evidence in an integrative taxonomic approach.
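The partitioning idea described in the abstract can be illustrated with a small Python sketch. This is a deliberately simplified toy, not the ABGD implementation: the model-based confidence limit is replaced by a fixed, user-supplied `prior_max`, a gap counts as "significant" when it is at least `min_gap` wide, and sequences are grouped by single linkage. The function names, both parameters, and the toy sequences are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_gap_threshold(distances, prior_max, min_gap):
    """Midpoint of the first gap of width >= min_gap whose upper edge lies
    beyond prior_max, or None. (Assumed stand-in for ABGD's gap test.)"""
    ranked = sorted(set(distances))
    for lo, hi in zip(ranked, ranked[1:]):
        if hi > prior_max and hi - lo >= min_gap:
            return (lo + hi) / 2
    return None


def cluster(names, dist, threshold):
    """Single-linkage grouping: join sequences closer than threshold."""
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if dist[frozenset((a, b))] < threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return list(groups.values())


def partition(names, dist, prior_max, min_gap):
    """Split on the detected gap, then re-apply to each group until no
    group can be split further."""
    pairs = [dist[frozenset(p)] for p in combinations(names, 2)]
    threshold = find_gap_threshold(pairs, prior_max, min_gap)
    if threshold is None:
        return [sorted(names)]
    groups = cluster(names, dist, threshold)
    if len(groups) == 1:
        return [sorted(names)]
    result = []
    for group in groups:
        result.extend(partition(group, dist, prior_max, min_gap))
    return result


# Toy alignment (hypothetical data): two tight pairs separated by a clear gap.
seqs = {
    "a1": "ACGTACGTACGTACGTACGT",
    "a2": "ACGTACGTACGTACGTACGA",
    "b1": "TCGTTCGTTCGTTCGTTCGT",
    "b2": "TCGTTCGTTCGTTCGTTCGC",
}
dist = {
    frozenset((x, y)): p_distance(seqs[x], seqs[y])
    for x, y in combinations(seqs, 2)
}
species = partition(list(seqs), dist, prior_max=0.1, min_gap=0.1)
print(species)
```

In this toy alignment the within-pair distances are 0.05 and the between-pair distances are 0.25 to 0.30, so the first gap beyond `prior_max` separates the `a` sequences from the `b` sequences. The recursion then finds no further gap inside either group. In the published method the limit is re-inferred from the data at every recursion step, whereas this sketch keeps it fixed.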
