The No-U-turn sampler: adaptively setting path lengths in Hamiltonian Monte Carlo

TLDR

Hamiltonian Monte Carlo uses gradient‑informed steps to avoid random walk behavior and converge quickly to high‑dimensional targets, but its efficiency depends critically on a user‑specified step size and number of steps: too few steps bring back random walk behavior, and too many waste computation. The authors propose the No‑U‑Turn Sampler (NUTS), an extension of HMC that removes the need to pre‑specify the number of steps. NUTS recursively builds a set of likely candidate points and halts when the trajectory starts to double back on itself, and it adapts the step size on the fly using primal‑dual averaging. Empirical results show that NUTS matches or surpasses well‑tuned HMC in efficiency without user intervention or costly tuning runs, enabling its use as a turnkey sampler in BUGS‑style automatic inference engines.

Abstract

Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metropolis or Gibbs sampling. However, HMC's performance is highly sensitive to two user-specified parameters: a step size ε and a desired number of steps L. In particular, if L is too small then the algorithm exhibits undesirable random walk behavior, while if L is too large the algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an extension to HMC that eliminates the need to set a number of steps L. NUTS uses a recursive algorithm to build a set of likely candidate points that spans a wide swath of the target distribution, stopping automatically when it starts to double back and retrace its steps. Empirically, NUTS performs at least as efficiently as (and sometimes more efficiently than) a well-tuned standard HMC method, without requiring user intervention or costly tuning runs. We also derive a method for adapting the step size parameter ε on the fly based on primal-dual averaging. NUTS can thus be used with no hand-tuning at all, making it suitable for applications such as BUGS-style automatic inference engines that require efficient turnkey samplers.
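
The "double back" stopping idea can be illustrated with a minimal Python sketch. This is not the full NUTS algorithm: it omits the recursive construction of candidate points and the step size adaptation, and it simulates in one direction only, so on its own it would not give a valid sampler. It only shows the kind of check involved, namely taking leapfrog steps of size ε and stopping once the momentum points back toward the starting position. The standard normal target and the names `grad_log_p`, `leapfrog`, and `steps_until_u_turn` are assumptions made for illustration.

```python
import numpy as np

def grad_log_p(theta):
    # Assumed target for illustration: standard normal, log p = -0.5 * theta @ theta.
    return -theta

def leapfrog(theta, r, eps):
    # One leapfrog step: half momentum update, full position update, half momentum update.
    r_half = r + 0.5 * eps * grad_log_p(theta)
    theta_new = theta + eps * r_half
    r_new = r_half + 0.5 * eps * grad_log_p(theta_new)
    return theta_new, r_new

def steps_until_u_turn(theta0, r0, eps, max_steps=1000):
    # Simulate forward until the trajectory starts moving back toward theta0,
    # i.e. until the displacement from the start has a negative dot product
    # with the current momentum.
    theta, r = theta0.copy(), r0.copy()
    for n in range(1, max_steps + 1):
        theta, r = leapfrog(theta, r, eps)
        if np.dot(theta - theta0, r) < 0:
            return n, theta, r
    return max_steps, theta, r

theta0 = np.array([1.0, 0.0])
r0 = np.array([0.0, 1.0])
n_steps, theta_end, r_end = steps_until_u_turn(theta0, r0, eps=0.1)
print(n_steps)  # 32 steps of size 0.1, roughly half an orbit (pi) for this target
```

With a fixed L, the user would have to guess this trajectory length in advance; here it falls out of the geometry of the target. Halving ε roughly doubles the number of steps taken before the check fires, so the simulated path length stays about the same.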
