Publication | Open Access
Participation bias in the UK Biobank distorts genetic associations and downstream analyses
Citations: 345
References: 40
Year: 2023
Volunteer-based studies like the UK Biobank are central to genetic epidemiology, yet participants are rarely representative of the target population. The study assessed how selective participation affects genetic analyses by deriving participation probabilities from 14 variables harmonized across the UK Biobank and a representative sample. Using these probabilities, weighted genome-wide association analyses were performed on 19 traits. Compared with unweighted analyses (n = 263,464–283,749), the weighted analyses (effective n = 94,643–102,215) altered SNP effect sizes and identified novel SNP associations for 12 traits. Heritability estimates changed by at most 5%, but genetic correlations (up to 0.31) and Mendelian randomization estimates (up to 0.15) shifted substantially for socio-behavioural traits, underscoring the need for greater representativeness in biobank studies.
While volunteer-based studies such as the UK Biobank have become the cornerstone of genetic epidemiology, the participating individuals are rarely representative of their target population. To evaluate the impact of selective participation, here we derived UK Biobank participation probabilities on the basis of 14 variables harmonized across the UK Biobank and a representative sample. We then conducted weighted genome-wide association analyses on 19 traits. Comparing the output from weighted genome-wide association analyses (n_effective = 94,643 to 102,215) with that from standard genome-wide association analyses (n = 263,464 to 283,749), we found that increasing representativeness led to changes in SNP effect sizes and identified novel SNP associations for 12 traits. While heritability estimates were less impacted by weighting (maximum change in h², 5%), we found substantial discrepancies for genetic correlations (maximum change in r_g, 0.31) and Mendelian randomization estimates (maximum change in β_STD, 0.15) for socio-behavioural traits. We urge the field to increase representativeness in biobank samples, especially when studying genetic correlates of behaviour, lifestyles and social outcomes.
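To make the idea of weighting by participation probability more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the study's pipeline: it assumes weights are the inverse of each person's participation probability, that the effective sample size follows the common Kish approximation (Σw)² / Σw², and that a SNP effect is estimated as a weighted least-squares slope. All function names and the toy data are hypothetical.

```python
def inverse_probability_weights(participation_probs):
    """Assumed weighting scheme: weight = 1 / P(participation)."""
    return [1.0 / p for p in participation_probs]


def effective_sample_size(weights):
    """Kish approximation (an assumption here): (sum of w)^2 / sum of w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighted_slope(x, y, weights):
    """Weighted least-squares slope of y on x, with an intercept."""
    total = sum(weights)
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    cov = sum(w * (xi - mean_x) * (yi - mean_y)
              for w, xi, yi in zip(weights, x, y))
    var = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    return cov / var


# Toy data (hypothetical): two over-represented and two
# under-represented participants.
probs = [0.5, 0.5, 0.1, 0.1]
genotype = [0, 1, 2, 1]            # allele counts at one SNP
trait = [1.0, 2.0, 3.0, 2.0]       # phenotype values

weights = inverse_probability_weights(probs)
n_eff = effective_sample_size(weights)
beta = weighted_slope(genotype, trait, weights)
```

Under this approximation, unequal weights always give an effective sample size below the number of participants, which is in line with the weighted analyses reporting a much smaller effective n than the unweighted ones.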