TY - JOUR
T1 - Modeling Linkage Disequilibrium Increases Accuracy of Polygenic Risk Scores
AU - Schizophrenia Working Group of the Psychiatric Genomics Consortium
AU - Psychosis Endophenotypes International Consortium
AU - Wellcome Trust Case Control Consortium
AU - Discovery, Biology, and Risk of Inherited Variants in Breast Cancer (DRIVE) study
AU - Hereditary Breast and Ovarian Cancer Research Group Netherlands (HEBON)
AU - Vilhjálmsson, Bjarni J.
AU - Yang, Jian
AU - Finucane, Hilary K.
AU - Gusev, Alexander
AU - Lindström, Sara
AU - Ripke, Stephan
AU - Genovese, Giulio
AU - Loh, Po Ru
AU - Bhatia, Gaurav
AU - Do, Ron
AU - Hayeck, Tristan
AU - Won, Hong Hee
AU - Neale, Benjamin M.
AU - Corvin, Aiden
AU - Walters, James T.R.
AU - Farh, Kai How
AU - Holmans, Peter A.
AU - Lee, Phil
AU - Bulik-Sullivan, Brendan
AU - Collier, David A.
AU - Huang, Hailiang
AU - Pers, Tune H.
AU - Agartz, Ingrid
AU - Agerbo, Esben
AU - Albus, Margot
AU - Alexander, Madeline
AU - Amin, Farooq
AU - Bacanu, Silviu A.
AU - Begemann, Martin
AU - Belliveau, Richard A.
AU - Bene, Judit
AU - Bergen, Sarah E.
AU - Bevilacqua, Elizabeth
AU - Bigdeli, Tim B.
AU - Black, Donald W.
AU - Bruggeman, Richard
AU - Buccola, Nancy G.
AU - Buckner, Randy L.
AU - Byerley, William
AU - Cahn, Wiepke
AU - Cai, Guiqing
AU - Campion, Dominique
AU - Cantor, Rita M.
AU - Carr, Vaughan J.
AU - Carrera, Noa
AU - Catts, Stanley V.
AU - Chambert, Kimberly D.
AU - Chan, Raymond C.K.
AU - Chen, Ronald Y.L.
AU - Lee, S. Hong
N1 - Publisher Copyright: © 2015 The American Society of Human Genetics. All rights reserved.
PY - 2015/1/1
Y1 - 2015/1/1
AB - Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
UR - http://www.scopus.com/inward/record.url?scp=84952665106&partnerID=8YFLogxK
U2 - 10.1016/j.ajhg.2015.09.001
DO - 10.1016/j.ajhg.2015.09.001
M3 - Article
C2 - 26430803
AN - SCOPUS:84952665106
SN - 0002-9297
VL - 97
SP - 576
EP - 592
JO - American Journal of Human Genetics
JF - American Journal of Human Genetics
IS - 4
ER -