Publication | Open Access
Detecting genetic risk factors for Alzheimer's disease in whole genome sequence data via Lasso screening
Citations: 55
References: 17
Year: 2015
Genetics, Genetic Epidemiology, Feature Selection, Disease Gene Identification, Large-scale Lasso Problem, Genome-wide Association Study, Alzheimer's Disease, Biostatistics, Aging-associated Disease, Public Health, Molecular Diagnostics, ADNI WGS SNP, Personal Genomics, Statistical Genetics, Omics, Neuroimaging, Bioinformatics, Imaging Genomics, Neurodegenerative Diseases, Lasso Regression, Dementia, Computational Biology, Neuroscience, Medicine, Genetic Risk Factors
Genetic factors play a key role in Alzheimer's disease (AD). The Alzheimer's Disease Neuroimaging Initiative (ADNI) whole genome sequence (WGS) data offers new power to investigate mechanisms of AD by combining entire genome sequences with neuroimaging and clinical data. Here we explore the ADNI WGS SNP (single nucleotide polymorphism) data in depth and extract approximately six million valid SNP features. We investigate imaging genetics associations using Lasso regression, a widely used sparse learning technique. To solve the large-scale Lasso problem efficiently, we employ a screening rule for Lasso, called dual polytope projections (DPP), that removes irrelevant features from the optimization problem. Experiments demonstrate that DPP effectively identifies irrelevant features and leads to a 400× speedup. This allows us, for the first time, to run the compute-intensive model selection procedure called stability selection to rank SNPs that may affect the brain and AD risk.
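To make the screening idea concrete, the following is a minimal sketch of the basic DPP rule, not the authors' implementation. It assumes the Lasso objective 0.5 * ||y - X beta||^2 + lam * ||beta||_1 and screens relative to lam_max = max_i |x_i^T y|, the smallest penalty at which every coefficient is zero. The abstract does not say which DPP variant (basic, sequential, or enhanced) was used, so the choice of the basic rule is an assumption, and the function name `dpp_screen` and the toy data are hypothetical.

```python
import numpy as np


def dpp_screen(X, y, lam):
    """Return a boolean mask that is True where a feature can be discarded.

    Assumed objective: 0.5 * ||y - X @ beta||^2 + lam * ||beta||_1.
    Feature i is discarded when
        |x_i^T y| / lam_max < 1 - ||x_i|| * ||y|| * (1/lam - 1/lam_max).
    """
    corr = np.abs(X.T @ y)                 # |x_i^T y| for every feature
    lam_max = corr.max()                   # at or above this, beta is all zeros
    if lam >= lam_max:
        return np.ones(X.shape[1], dtype=bool)
    col_norms = np.linalg.norm(X, axis=0)  # ||x_i|| for every feature
    radius = np.linalg.norm(y) * (1.0 / lam - 1.0 / lam_max)
    return corr / lam_max < 1.0 - col_norms * radius


# Toy data (hypothetical): with an identity design the Lasso solution is the
# soft-thresholded response, so at lam = 2.0 only the first coefficient is nonzero.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.5])
discard = dpp_screen(X, y, lam=2.0)
print(discard)
```

On this toy input the rule discards the second and third features and keeps the first, which matches the known solution. In a stability selection loop, a mask like this would presumably be computed for each subsample and penalty value so that the solver only sees the columns in `X[:, ~discard]`. The rule becomes less effective as `lam` moves further below `lam_max`, which is the motivation for the sequential variants.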