Bayes or Bootstrap? A Simulation Study Comparing the Performance of Bayesian Markov Chain Monte Carlo Sampling and Bootstrapping in Assessing Phylogenetic Confidence

TLDR

Bayesian Markov chain Monte Carlo sampling is widely used in phylogenetics for estimating maximum likelihood topologies and nodal confidence, yet its relationship to the commonly used nonparametric bootstrap proportion remains poorly understood. The study used simulations to compare Bayesian posterior probabilities (BMCMC-PP) with maximum likelihood and maximum parsimony bootstrap proportions (ML-BP, MP-BP) as measures of phylogenetic confidence. DNA sequences were simulated on 17-taxon trees under 18 evolutionary scenarios to evaluate how each method assigns support to correct and incorrect monophyletic groups and how support changes with increasing character number. BMCMC-PP generally outperformed both bootstrap methods: at a given threshold it supported more correct monophyletic groups, it was usually a less biased predictor of phylogenetic accuracy, and it gave high support to correct bipartitions with fewer characters. BMCMC-PP was often strongly correlated with ML-BP, though the two could differ substantially on short internodes, and it correlated poorly with MP-BP.

Abstract

Bayesian Markov chain Monte Carlo sampling has become increasingly popular in phylogenetics as a method both for estimating the maximum likelihood topology and for assessing nodal confidence. Despite the growing use of posterior probabilities, the relationship between the Bayesian measure of confidence and the most commonly used confidence measure in phylogenetics, the nonparametric bootstrap proportion, is poorly understood. We used computer simulation to investigate the behavior of three phylogenetic confidence methods: Bayesian posterior probabilities calculated via Markov chain Monte Carlo sampling (BMCMC-PP), maximum likelihood bootstrap proportion (ML-BP), and maximum parsimony bootstrap proportion (MP-BP). We simulated the evolution of DNA sequences on 17-taxon topologies under 18 evolutionary scenarios and examined the performance of these methods in assigning confidence to correct and incorrect monophyletic groups, and we examined the effects of increasing character number on support values. BMCMC-PP and ML-BP were often strongly correlated with one another but could provide substantially different estimates of support on short internodes. In contrast, BMCMC-PP correlated poorly with MP-BP across most of the simulation conditions that we examined. For a given threshold value, more correct monophyletic groups were supported by BMCMC-PP than by either ML-BP or MP-BP. When threshold values were chosen that fixed the rate of accepting incorrect monophyletic relationships as true at 5%, all three methods recovered most of the correct relationships on the simulated topologies, although BMCMC-PP and ML-BP performed better than MP-BP. BMCMC-PP was usually a less biased predictor of phylogenetic accuracy than either bootstrapping method. BMCMC-PP provided high support values for correct topological bipartitions with fewer characters than were needed for the nonparametric bootstrap.
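
The threshold comparison described above can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes that the "rate of accepting incorrect monophyletic relationships" means the share of incorrect groups whose support meets or exceeds the threshold, and that support values are whole percentages. The function names and the support values are hypothetical, chosen only to show the arithmetic.

```python
def threshold_for_false_acceptance(incorrect_support, max_rate=0.05):
    """Return the smallest whole-percent threshold t (0..100) at which the
    share of incorrect groups with support >= t is at most max_rate.
    Returns None if no such threshold exists."""
    n = len(incorrect_support)
    for t in range(0, 101):
        accepted = sum(1 for s in incorrect_support if s >= t)
        if accepted / n <= max_rate:
            return t
    return None


def fraction_recovered(correct_support, threshold):
    """Share of correct groups whose support meets or exceeds the threshold."""
    kept = sum(1 for s in correct_support if s >= threshold)
    return kept / len(correct_support)


# Hypothetical support values (percent) for one confidence method.
# These are illustrative numbers, not results from the study.
incorrect = [5, 10, 12, 20, 22, 30, 35, 40, 41, 45,
             50, 52, 55, 60, 61, 65, 70, 72, 80, 96]
correct = [70, 85, 90, 95, 99, 100, 100, 60, 88, 100]

threshold = threshold_for_false_acceptance(incorrect, max_rate=0.05)
recovered = fraction_recovered(correct, threshold)
print(threshold, recovered)
```

Repeating this calculation for each method's support values (BMCMC-PP, ML-BP, MP-BP) would give a per-method threshold and recovery fraction that can be compared at the same false-acceptance rate.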
