Publication | Open Access
Application of high-dimensional feature selection: evaluation for genomic prediction in man
Year: 2015
Citations: 330
References: 45
The study examined how five feature‑selection methods affect the predictive accuracy of G‑BLUP and Bayes C models for complex traits. Prediction was performed for height, HDL cholesterol, and BMI in 2,186 Croatian and 810 UK individuals using genome‑wide SNP data with mixed‑model and Bayesian approaches. When all SNPs were used, Bayes C and G‑BLUP performed similarly for all traits in the Croatian cohort and for height and BMI in the UK cohort, but Bayes C outperformed G‑BLUP for HDL in the UK; supervised feature selection within G‑BLUP offered a flexible, generalizable, and computationally efficient alternative to Bayes C, though its predictive performance requires careful assessment.
Abstract
In this study, we investigated the effect of five feature selection approaches on the performance of a mixed model (G-BLUP) and a Bayesian (Bayes C) prediction method. We predicted height, high-density lipoprotein cholesterol (HDL) and body mass index (BMI) within 2,186 Croatian individuals and into 810 UK individuals using genome-wide SNP data. Using all SNP information, Bayes C and G-BLUP had similar predictive performance across all traits within the Croatian data, and for the highly polygenic traits height and BMI when predicting into the UK data. Bayes C outperformed G-BLUP in the prediction of HDL, which is influenced by loci of moderate size, in the UK data. Supervised feature selection of a SNP subset in the G-BLUP framework provided a flexible, generalisable and computationally efficient alternative to Bayes C, but careful evaluation of predictive performance is required when supervised feature selection has been used.
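To make the idea of supervised feature selection within G-BLUP concrete, the following is a minimal sketch on simulated genotypes. It is not the study's pipeline: the five selection approaches are not described here, so ranking SNPs by marginal association (`select_snps`) is an illustrative assumption, as are the fixed heritability `h2=0.5` (in practice variance components would be estimated, for example by REML), the simulated data, and all function names. The one point it is meant to show is that SNPs are chosen from the training individuals only, so that accuracy measured in the held-out individuals is not inflated by the selection step.

```python
import numpy as np

def standardize(train, test):
    """Centre and scale each SNP using training-set statistics only."""
    mean = train.mean(axis=0)
    sd = train.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic SNPs
    return (train - mean) / sd, (test - mean) / sd

def select_snps(Z_train, y_train, k):
    """Hypothetical selection rule: keep the k SNPs with the largest
    absolute marginal association with the phenotype (training data only)."""
    score = np.abs(Z_train.T @ (y_train - y_train.mean()))
    return np.argsort(score)[-k:]

def gblup_predict(Z_train, y_train, Z_test, h2=0.5):
    """G-BLUP with a genomic relationship matrix G = Z Z' / m.
    h2 is an assumed heritability; lambda = (1 - h2) / h2."""
    m = Z_train.shape[1]
    G_tt = Z_train @ Z_train.T / m   # training x training relationships
    G_st = Z_test @ Z_train.T / m    # test x training relationships
    lam = (1.0 - h2) / h2
    mu = y_train.mean()
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(y_train)), y_train - mu)
    return mu + G_st @ alpha

# Simulated data (assumption): 10 causal SNPs out of 500.
rng = np.random.default_rng(0)
n_train, n_test, m, k = 200, 100, 500, 50
freq = rng.uniform(0.1, 0.5, size=m)
X = rng.binomial(2, freq, size=(n_train + n_test, m)).astype(float)
beta = np.zeros(m)
beta[:10] = rng.normal(size=10)
g = X @ beta
y = g + rng.normal(scale=g.std(), size=n_train + n_test)

Z_train, Z_test = standardize(X[:n_train], X[n_train:])
y_train, y_test = y[:n_train], y[n_train:]

idx = select_snps(Z_train, y_train, k)           # selection uses training data only
pred_all = gblup_predict(Z_train, y_train, Z_test)
pred_sel = gblup_predict(Z_train[:, idx], y_train, Z_test[:, idx])

print("all SNPs     :", np.corrcoef(pred_all, y_test)[0, 1])
print("selected SNPs:", np.corrcoef(pred_sel, y_test)[0, 1])
```

If `select_snps` were instead run on all individuals before splitting, the selected subset would already carry information about the test phenotypes, which is one way the evaluation of a supervised selection step can go wrong.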