Survey of Branch Support Methods Demonstrates Accuracy, Power, and Robustness of Fast Likelihood-based Approximation Schemes

TLDR

Phylogenetic inference relies on branch support measures, yet the conventional bootstrap and Bayesian posterior methods are computationally costly and their interpretation remains debated. This has prompted the development of fast approximate likelihood-based tests, such as aLRT and SH-aLRT, that combine speed with high accuracy and power. The authors propose a Bayesian-like transformation of aLRT, called aBayes, as an additional branch support measure. They compare aBayes, aLRT, and SH-aLRT against the standard bootstrap, Bayesian posterior probabilities, and the rapid bootstrap on simulated and real data sets. Under moderate model violations all tests are sufficiently accurate, with aLRT and aBayes offering the highest power while remaining very fast. Under severe violations, aLRT, aBayes, and Bayesian posteriors can yield elevated false-positive rates, so SH-aLRT is recommended when such violations can be detected. The standard bootstrap appears excessively conservative and is much slower than the approximate likelihood-based methods.

Abstract

Phylogenetic inference and the evaluation of support for inferred relationships are at the core of many studies testing evolutionary hypotheses. Despite the popularity of nonparametric bootstrap frequencies and Bayesian posterior probabilities, the interpretation of these measures of tree branch support remains a source of discussion. Furthermore, both methods are computationally expensive and become prohibitive for large data sets. Recent fast approximate likelihood-based measures of branch support (approximate likelihood ratio test [aLRT] and Shimodaira-Hasegawa [SH]-aLRT) provide a compelling alternative to these slower conventional methods, offering not only speed advantages but also excellent levels of accuracy and power. Here we propose an additional method: a Bayesian-like transformation of aLRT (aBayes). Considering both probabilistic and frequentist frameworks, we compare the performance of the three fast likelihood-based methods with the standard bootstrap (SBS), the Bayesian approach, and the recently introduced rapid bootstrap. Our simulations and real data analyses show that with moderate model violations, all tests are sufficiently accurate, but aLRT and aBayes offer the highest statistical power and are very fast. With severe model violations, aLRT, aBayes, and Bayesian posteriors can produce elevated false-positive rates. For data sets in which such violations can be detected, we recommend using SH-aLRT, the nonparametric version of aLRT based on a procedure similar to the Shimodaira-Hasegawa tree selection. In general, the SBS seems to be excessively conservative and is much slower than our approximate likelihood-based methods.
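
The abstract does not spell out the formulas, so the following is only a minimal sketch of how the two parametric quantities are commonly described. It assumes that, for a given internal branch, the log-likelihoods of the three nearest-neighbor-interchange (NNI) configurations around that branch are already available. Under that assumption, the aLRT statistic is taken as twice the log-likelihood gap between the best and second-best configuration, and aBayes as the posterior probability of the inferred configuration under a flat prior over the three. The function names and the example log-likelihood values are hypothetical and do not come from the paper or from any particular software package.

```python
import math


def alrt_statistic(lnl_best, lnl_second):
    """aLRT statistic (assumed form): twice the log-likelihood gap between
    the best and the second-best NNI configuration around a branch."""
    return 2.0 * (lnl_best - lnl_second)


def abayes_support(lnl_current, lnl_alt1, lnl_alt2):
    """aBayes support (assumed form): posterior probability of the current
    configuration under a flat prior over the three NNI configurations."""
    lnls = [lnl_current, lnl_alt1, lnl_alt2]
    # Subtract the maximum before exponentiating so that large negative
    # log-likelihoods do not underflow to zero.
    m = max(lnls)
    denom = sum(math.exp(l - m) for l in lnls)
    return math.exp(lnl_current - m) / denom


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one branch (illustrative values only).
    current, alt1, alt2 = -1000.0, -1003.0, -1005.0
    print("aLRT statistic:", alrt_statistic(current, max(alt1, alt2)))
    print("aBayes support:", abayes_support(current, alt1, alt2))
```

When the three configurations are equally likely, the sketch returns a support of 1/3, which is the flat-prior baseline. SH-aLRT is not sketched here because it additionally requires resampling of per-site log-likelihoods.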
