Publication | Open Access
A simple new approach to variable selection in regression, with application to genetic fine-mapping
Citations: 72
References: 70
Year: 2018
Unknown Venue
Bayesian Statistics, Bayesian Decision Theory, Engineering, Feature Selection, Regression Analysis, Bayesian Inference, Variable Selection, Genotype-phenotype Association, Posterior Distribution, Genetic Fine-mapping, Genetic Algorithm, Biostatistics, Bayesian Methods, Public Health, Statistics, Latent Variable Methods, Bayesian Hierarchical Modeling, Statistical Genetics, Statistical Learning Theory, Stepwise Selection Methods, Simple New Approach, Robust Modeling, Statistical Inference, Approximate Bayesian Computation
We introduce a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. The approach is based on a new model, the “Sum of Single Effects” (SuSiE) model, which comes from writing the sparse vector of regression coefficients as a sum of “single-effect” vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure, Iterative Bayesian Stepwise Selection (IBSS), which is a Bayesian analogue of stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We provide a formal justification of this intuitive algorithm by showing that it optimizes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally yields convenient novel summaries of uncertainty in variable selection, providing a Credible Set of variables for each selection. Our methods are particularly well suited to settings where variables are highly correlated and detectable effects are sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate their application to fine-mapping genetic variants influencing alternative splicing in human cell lines. We also discuss the potential and challenges for applying these methods to generic variable selection problems.
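To make the stepwise idea concrete, the following is a minimal sketch of an IBSS-style loop, not the authors' implementation. It rests on assumptions the abstract does not spell out: a normal prior with fixed variance `sigma0_2` on each single effect, a fixed residual variance `sigma2`, a uniform prior over which variable carries each effect, and centered `X` and `y`. The function names (`single_effect_regression`, `ibss`, `credible_set`) are hypothetical and chosen for illustration; estimating the variances is omitted.

```python
import numpy as np

def single_effect_regression(X, r, sigma2, sigma0_2):
    """Fit a regression in which exactly one variable has a non-zero effect.

    Returns the posterior probability that each variable is the one with
    the effect (alpha) and the posterior mean effect given that it is (mu).
    Assumes a N(0, sigma0_2) prior on the effect and a uniform prior over
    variables (illustrative assumptions).
    """
    xtx = np.sum(X ** 2, axis=0)
    beta_hat = X.T @ r / xtx            # per-variable least-squares estimate
    s2 = sigma2 / xtx                   # its sampling variance
    z2 = beta_hat ** 2 / s2
    # Log Bayes factor of "variable j has the effect" against "no effect".
    log_bf = 0.5 * np.log(s2 / (s2 + sigma0_2)) + 0.5 * z2 * sigma0_2 / (s2 + sigma0_2)
    w = np.exp(log_bf - log_bf.max())   # subtract the max for numerical stability
    alpha = w / w.sum()
    post_var = 1.0 / (1.0 / s2 + 1.0 / sigma0_2)
    mu = post_var / s2 * beta_hat
    return alpha, mu

def ibss(X, y, L=5, sigma2=1.0, sigma0_2=1.0, n_iter=50):
    """Iteratively refit each of L single effects to the residuals of the others."""
    n, p = X.shape
    alpha = np.full((L, p), 1.0 / p)
    mu = np.zeros((L, p))
    for _ in range(n_iter):
        for l in range(L):
            # Posterior mean coefficients of all effects except effect l.
            b_other = (alpha * mu).sum(axis=0) - alpha[l] * mu[l]
            r = y - X @ b_other
            alpha[l], mu[l] = single_effect_regression(X, r, sigma2, sigma0_2)
    return alpha, mu

def credible_set(alpha_l, coverage=0.95):
    """Smallest set of variables whose probabilities sum to at least `coverage`."""
    order = np.argsort(alpha_l)[::-1]
    cum = np.cumsum(alpha_l[order])
    k = np.searchsorted(cum, coverage) + 1
    return order[:k]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((300, 30))
    X -= X.mean(axis=0)
    y = 1.5 * X[:, 7] + rng.standard_normal(300)
    y -= y.mean()
    alpha, mu = ibss(X, y, L=3)
    print("credible set for the first effect:", credible_set(alpha[0]))
```

Each row of `alpha` is the "distribution on variables" that replaces the single hard choice made at each step of classical stepwise selection. When two variables are nearly collinear, that distribution spreads across both, and the resulting credible set reports them together rather than picking one arbitrarily.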