Approximate Inference in Generalized Linear Mixed Models

TLDR

The generalized linear mixed model (GLMM) provides a unifying framework for overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships. Observations are modeled as conditionally independent given normally distributed random effects, with means that depend on the linear predictor through a link function and conditional variances given by a variance function. Approximate inference applies Laplace's method to the marginal quasi-likelihood, yielding penalized quasilikelihood (PQL) estimating equations for the mean parameters and pseudo-likelihood for the variance components. The procedure is implemented through repeated calls to normal-theory REML routines and is demonstrated on applications such as binomial overdispersion, longitudinal epilepsy data, and spatial cancer incidence. Simulations and worked examples indicate that PQL gives useful approximate inference for parameters and random effects, though it tends to underestimate variance components and the absolute values of fixed effects in clustered binary data, with accuracy improving rapidly once binomial denominators exceed one.

Abstract

Statistical approaches to overdispersion, correlated errors, shrinkage estimation, and smoothing of regression relationships may be encompassed within the framework of the generalized linear mixed model (GLMM). Given an unobserved vector of random effects, observations are assumed to be conditionally independent with means that depend on the linear predictor through a specified link function and conditional variances that are specified by a variance function, known prior weights and a scale factor. The random effects are assumed to be normally distributed with mean zero and dispersion matrix depending on unknown variance components. For problems involving time series, spatial aggregation and smoothing, the dispersion may be specified in terms of a rank deficient inverse covariance matrix. Approximation of the marginal quasi-likelihood using Laplace's method leads eventually to estimating equations based on penalized quasilikelihood or PQL for the mean parameters and pseudo-likelihood for the variances. Implementation involves repeated calls to normal theory procedures for REML estimation in variance components problems. By means of informal mathematical arguments, simulations and a series of worked examples, we conclude that PQL is of practical value for approximate inference on parameters and realizations of random effects in the hierarchical model. The applications cover overdispersion in binomial proportions of seed germination; longitudinal analysis of attack rates in epilepsy patients; smoothing of birth cohort effects in an age-cohort model of breast cancer incidence; evaluation of curvature of birth cohort effects in a case-control study of childhood cancer and obstetric radiation; spatial aggregation of lip cancer rates in Scottish counties; and the success of salamander matings in a complicated experiment involving crossing of male and female effects. PQL tends to underestimate somewhat the variance components and (in absolute value) fixed effects when applied to clustered binary data, but the situation improves rapidly for binomial observations having denominators greater than one.
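
To make the iteration concrete, the following is a minimal sketch of a PQL-style fit for the simplest case: a logistic model with a single random intercept per cluster and scale factor fixed at one. It is an illustration under stated assumptions, not the authors' implementation. The function name pql_random_intercept_logit, the simulated data, and the simple EM-style REML update for the variance component are all choices made here for illustration; the abstract says only that implementation relies on repeated calls to normal-theory REML procedures. Each pass forms a working response and working weights from the current fit, solves the weighted linear mixed-model equations for the fixed and random effects, and then updates the variance component.

```python
import numpy as np

def pql_random_intercept_logit(y, n, X, cluster, n_iter=500, tol=1e-6):
    """PQL-style sketch: logit(p) = X @ beta + b[cluster], b ~ N(0, sigma2)."""
    y = np.asarray(y, dtype=float)
    n = np.asarray(n, dtype=float)
    X = np.asarray(X, dtype=float)
    labels, idx = np.unique(np.asarray(cluster), return_inverse=True)
    q, p = labels.size, X.shape[1]
    Z = np.zeros((y.size, q))
    Z[np.arange(y.size), idx] = 1.0          # random-intercept design matrix

    beta, b, sigma2 = np.zeros(p), np.zeros(q), 1.0   # arbitrary starting values
    for _ in range(n_iter):
        eta = X @ beta + Z @ b
        mu = 1.0 / (1.0 + np.exp(-eta))
        v = mu * (1.0 - mu)                  # binomial variance function
        w = n * v                            # working weights (scale factor = 1)
        ystar = eta + (y / n - mu) / v       # working response

        # Weighted mixed-model equations for (beta, b) given sigma2.
        XtW, ZtW = X.T * w, Z.T * w
        C = np.block([[XtW @ X, XtW @ Z],
                      [ZtW @ X, ZtW @ Z + np.eye(q) / sigma2]])
        rhs = np.concatenate([XtW @ ystar, ZtW @ ystar])
        C_inv = np.linalg.inv(C)
        sol = C_inv @ rhs
        beta_new, b_new = sol[:p], sol[p:]

        # Assumed EM-style REML update of the variance component.
        sigma2_new = (b_new @ b_new + np.trace(C_inv[p:, p:])) / q

        change = max(np.abs(beta_new - beta).max(), abs(sigma2_new - sigma2))
        beta, b, sigma2 = beta_new, b_new, sigma2_new
        if change < tol:
            break
    return beta, b, sigma2

# Simulated data (hypothetical): 40 clusters, 5 binomial observations each.
rng = np.random.default_rng(0)
n_clusters, per_cluster = 40, 5
cluster = np.repeat(np.arange(n_clusters), per_cluster)
x = rng.normal(size=n_clusters * per_cluster)
X = np.column_stack([np.ones_like(x), x])
b_true = rng.normal(scale=0.7, size=n_clusters)
eta_true = -0.5 + 1.0 * x + b_true[cluster]
n = np.full(x.size, 10)                      # binomial denominators
y = rng.binomial(n, 1.0 / (1.0 + np.exp(-eta_true)))

beta_hat, b_hat, sigma2_hat = pql_random_intercept_logit(y, n, X, cluster)
print("fixed effects:", beta_hat)
print("variance component:", sigma2_hat)
```

Setting every denominator to one turns the simulated data into the clustered binary case in which, according to the abstract, PQL tends to underestimate the variance components and the absolute values of the fixed effects.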
