TY - JOUR
T1 - Fine-scale population structure in the UK Biobank: implications for genome-wide association studies
AU - Cook, James P
AU - Mahajan, Anubha
AU - Morris, Andrew P
N1 - © The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected].
PY - 2020
Y1 - 2020
N2 - The UK Biobank is a prospective study of more than 500 000 participants that has aggregated data from questionnaires, physical measures, biomarkers, imaging and follow-up for a wide range of health-related outcomes, together with genome-wide genotyping supplemented with high-density imputation. Previous studies have highlighted fine-scale population structure in the UK on a North-West to South-East cline, but the impact of unmeasured geographical confounding on genome-wide association studies (GWAS) of complex human traits in the UK Biobank has not been investigated. We considered 368 325 white British individuals from the UK Biobank, and performed GWAS of their birth location. We demonstrate that widely used approaches to adjust for population structure, including principal components analysis and mixed modelling with a random effect for a genetic relationship matrix, cannot fully account for the fine-scale geographical confounding in the UK Biobank. We observe significant genetic correlation of birth location with a range of lifestyle-related traits, including body-mass index and fat mass, hypertension, and lung function, even after adjustment for population structure. Variants driving associations with birth location are also strongly associated with many of these lifestyle-related traits after correction for population structure, indicating that there could be environmental factors that are confounded with geography that have not been adequately accounted for. Our findings highlight the need for caution in the interpretation of lifestyle-related trait GWAS in UK Biobank, particularly in loci demonstrating strong residual association with birth location.
U2 - 10.1093/hmg/ddaa157
DO - 10.1093/hmg/ddaa157
M3 - Article
C2 - 32691046
SN - 0964-6906
VL - 29
SP - 2803
EP - 2811
JO - Human Molecular Genetics
JF - Human Molecular Genetics
IS - 16
ER -