WLS Genetic Data

Genetic Data

The WLS has two waves of genetic data. The first wave has ~7,000 cases and information on ~80 SNPs. More information about the first wave of WLS genetic data can be found here. The second wave has ~9,000 cases genotyped on the Illumina HumanOmniExpress BeadChip; these data are available from dbGaP and from WLS. More information about the wave two data can be found here. The wave two data have been imputed to the 1000 Genomes Project Phase 3 reference panel. More information about the imputation can be found here.

Researchers wanting access to either wave of data should follow these instructions.

Polygenic Scores

In addition to the genetic data, WLS makes certain polygenic scores freely available. LINK

What is a polygenic score?

A polygenic score collapses the effects of genetic variants across the entire genome into a single quantitative measure of genetic risk for a chosen phenotype. Polygenic scores use effect sizes from genome-wide association studies (GWAS) for that phenotype as weights. The predictive power of polygenic scores increases with the sample size of the underlying GWAS.
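As a rough illustration of the weighting described above, the sketch below computes a score as a weighted sum of allele counts. The function name `polygenic_score`, the toy genotypes, and the toy weights are all made up for this example; this is not the pipeline used to build the WLS scores, which is described in the documentation for each score.

```python
import numpy as np

def polygenic_score(genotypes, weights):
    """Return one score per individual: the weighted sum of allele counts."""
    genotypes = np.asarray(genotypes, dtype=float)  # shape (n_individuals, n_variants)
    weights = np.asarray(weights, dtype=float)      # shape (n_variants,), GWAS effect sizes
    if genotypes.shape[1] != weights.shape[0]:
        raise ValueError("one weight is required per variant")
    return genotypes @ weights

# Toy data (invented): 3 individuals, 4 variants, allele counts coded 0/1/2.
toy_genotypes = [[0, 1, 2, 0],
                 [1, 1, 0, 2],
                 [2, 0, 1, 1]]
# Toy effect sizes (invented), standing in for GWAS estimates.
toy_weights = [0.5, -0.25, 0.125, 1.0]

scores = polygenic_score(toy_genotypes, toy_weights)
print(scores)
```

In practice a score built this way is usually standardized within the analysis sample before use, so that coefficients are reported per standard deviation of the score.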

Population stratification can bias the estimated association between the outcome of interest and the polygenic score. This happens when the distribution of the score differs across ancestry groups that also differ in the outcome for other reasons. A common way to account for population stratification is to control for the top 5 or top 10 principal components of the covariance matrix of the individuals' genotypic data. For that reason, all scores found on this page are accompanied by principal components. However, because principal components can reveal fine-grained ancestry, they have been randomly shuffled in sets of 5. Users must therefore include either principal components 1-5 or principal components 1-10 in their analysis to control for population stratification.
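The following sketch shows, on simulated data, what controlling for principal components looks like in an ordinary least squares regression. It rests on two assumptions that are not confirmed by the text: that "shuffled in sets of 5" means the order of the components has been permuted within components 1-5 and within components 6-10, and that a plain linear model is appropriate. The helper `pgs_coefficient` and all variable names are hypothetical.

```python
import numpy as np

def pgs_coefficient(outcome, pgs, pcs):
    """OLS coefficient on the score, controlling for the supplied principal components."""
    outcome = np.asarray(outcome, dtype=float)
    pgs = np.asarray(pgs, dtype=float)
    pcs = np.asarray(pcs, dtype=float)            # shape (n, k); k may be 0
    design = np.column_stack([np.ones(len(outcome)), pgs, pcs])
    coefs, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coefs[1]                               # column 0 is the intercept

# Simulated data (not WLS data): the first component shifts both score and outcome.
rng = np.random.default_rng(0)
n = 500
pcs = rng.normal(size=(n, 10))
pgs = 0.8 * pcs[:, 0] + rng.normal(size=n)
outcome = 0.3 * pgs + 1.5 * pcs[:, 0] + rng.normal(size=n)   # true score effect: 0.3

naive = pgs_coefficient(outcome, pgs, np.empty((n, 0)))      # no controls
adjusted = pgs_coefficient(outcome, pgs, pcs)                # controls for all 10 components

# Assumed shuffling scheme: permute column order within 1-5 and within 6-10.
shuffled = pcs.copy()
shuffled[:, :5] = shuffled[:, rng.permutation(5)]
shuffled[:, 5:] = shuffled[:, 5 + rng.permutation(5)]
adjusted_shuffled = pgs_coefficient(outcome, pgs, shuffled)

print(naive, adjusted, adjusted_shuffled)
```

Under these assumptions, the adjusted coefficient is the same whether or not the columns are shuffled, because a regression that includes the whole set of components does not depend on their order. Including only some components from a shuffled set would not give that guarantee, which is consistent with the requirement to use components 1-5 or 1-10 as a complete set.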

Polygenic Index Repository (version 1.1)

Source study: "Resource Profile and User Guide of the Polygenic Index Repository" LINK

Documentation (PDF) -- Polygenic Index Repository User Guide

Documentation (TXT) -- Phenotypes in Repository


Educational attainment, cognitive performance and math-related scores

Source study: "Gene discovery and polygenic prediction from a genome-wide association study of educational attainment in 1.1 million individuals" LINK

Documentation (PDF) -- Lee_et_al_(2018)_PGS_WLS.pdf


Depression, subjective well-being and neuroticism scores

Source study: "Multi-trait analysis of genome-wide association summary statistics using MTAG" LINK

Documentation (PDF) -- Turley_et_al_(2018)_PGS_WLS.pdf