Counting with DNA in metabarcoding studies: How should we convert sequence reads to dietary data?

TLDR

Advances in DNA sequencing now make it possible to recover counts of food DNA sequences from a wide range of dietary samples, enabling semi-quantitative studies of trophic interactions. Occurrence-based summaries are often favored as the more conservative option because recovery of sequences is biased in taxon-specific ways. The study asks whether relative read abundance (RRA) can be used instead of frequency-of-occurrence data to estimate diet accurately. Using representative metabarcoding data sets, the authors show that occurrence-based summaries tend to overestimate the importance of foods consumed in small quantities and depend on the count threshold used to define an occurrence. Their simulations indicate that RRA often gives a more accurate view of population-level diet even when moderate recovery biases are present, although RRA summaries are sensitive to biases affecting common diet taxa. Both approaches are more accurate when the mean number of food taxa per sample is small. The authors conclude that all sources of bias should be considered and that the method used to interpret count data should be justified.

Abstract

Advances in DNA sequencing technology have revolutionized the field of molecular analysis of trophic interactions, and it is now possible to recover counts of food DNA sequences from a wide range of dietary samples. But what do these counts mean? To obtain an accurate estimate of a consumer's diet should we work strictly with data sets summarizing frequency of occurrence of different food taxa, or is it possible to use relative number of sequences? Both approaches are applied to obtain semi‐quantitative diet summaries, but occurrence data are often promoted as a more conservative and reliable option due to taxa‐specific biases in recovery of sequences. We explore representative dietary metabarcoding data sets and point out that diet summaries based on occurrence data often overestimate the importance of food consumed in small quantities (potentially including low‐level contaminants) and are sensitive to the count threshold used to define an occurrence. Our simulations indicate that using relative read abundance (RRA) information often provides a more accurate view of population‐level diet even with moderate recovery biases incorporated; however, RRA summaries are sensitive to recovery biases impacting common diet taxa. Both approaches are more accurate when the mean number of food taxa in samples is small. The ideas presented here highlight the need to consider all sources of bias and to justify the methods used to interpret count data in dietary metabarcoding studies. We encourage researchers to continue addressing methodological challenges and acknowledge unanswered questions to help spur future investigations in this rapidly developing area of research.
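
The contrast between the two summaries can be illustrated with a small sketch. The read counts below are invented for illustration and do not come from the study, and the function names (`percent_foo`, `percent_rra`) are hypothetical. The sketch assumes that an occurrence is counted when a taxon's share of a sample's reads is strictly greater than a chosen minimum proportion, and that RRA is the mean of per-sample read proportions; the abstract does not give the authors' exact formulas, so both definitions are assumptions.

```python
# Hypothetical read counts (not from the study): rows are samples,
# columns are food taxa in the order given by `taxa`.
taxa = ["taxon_A", "taxon_B", "taxon_C"]
counts = [
    [980, 15, 5],
    [900, 0, 100],
    [0, 990, 10],
    [950, 40, 10],
]


def to_proportions(sample):
    """Convert one sample's read counts to proportions of its total reads."""
    total = sum(sample)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c / total for c in sample]


def percent_foo(counts, min_prop=0.0):
    """Percent frequency of occurrence per taxon.

    Assumption: a taxon 'occurs' in a sample when its read proportion is
    strictly greater than min_prop.
    """
    props = [to_proportions(s) for s in counts]
    n_samples = len(props)
    n_taxa = len(counts[0])
    return [
        100.0 * sum(p[i] > min_prop for p in props) / n_samples
        for i in range(n_taxa)
    ]


def percent_rra(counts):
    """Relative read abundance per taxon, in percent.

    Assumption: RRA is the mean of per-sample read proportions, so every
    sample carries equal weight regardless of its sequencing depth.
    """
    props = [to_proportions(s) for s in counts]
    n_samples = len(props)
    n_taxa = len(counts[0])
    return [100.0 * sum(p[i] for p in props) / n_samples for i in range(n_taxa)]


if __name__ == "__main__":
    for label, values in [
        ("FOO, any read counts", percent_foo(counts)),
        ("FOO, >5% of reads", percent_foo(counts, min_prop=0.05)),
        ("RRA", percent_rra(counts)),
    ]:
        print(label, dict(zip(taxa, values)))
```

In this toy table, taxon_C is detected in every sample when any read counts as an occurrence, yet it accounts for only a small share of reads, and its occurrence value drops sharply once a 5% threshold is applied. That is the kind of threshold sensitivity and overweighting of minor items the abstract describes; the sketch says nothing about recovery bias, which affects the read proportions themselves.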
