Publication Date

2015

Journal Title

Am J Hum Genet

Abstract

Polygenic risk scores have shown great promise in predicting complex disease risk and will become more accurate as training sample sizes increase. The standard approach for calculating risk scores involves linkage disequilibrium (LD)-based marker pruning and applying a p value threshold to association statistics, but this discards information and can reduce predictive accuracy. We introduce LDpred, a method that infers the posterior mean effect size of each marker by using a prior on effect sizes and LD information from an external reference panel. Theory and simulations show that LDpred outperforms the approach of pruning followed by thresholding, particularly at large sample sizes. Accordingly, predicted R² increased from 20.1% to 25.3% in a large schizophrenia dataset and from 9.8% to 12.0% in a large multiple sclerosis dataset. A similar relative improvement in accuracy was observed for three additional large disease datasets and for non-European schizophrenia samples. The advantage of LDpred over existing methods will grow as sample sizes increase.
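
For readers unfamiliar with the baseline the abstract compares against, the following is a minimal sketch of pruning followed by thresholding, not the authors' code and not LDpred itself. It assumes a greedy, p-value-informed variant of pruning (often called clumping); the function names, the default thresholds, and the toy inputs are all hypothetical and chosen only for illustration.

```python
import numpy as np

def prune_and_threshold(pvalues, ld_r2, p_threshold=0.05, r2_threshold=0.2):
    """Greedy LD pruning plus p-value thresholding (illustrative sketch).

    pvalues : (M,) association p values, one per marker
    ld_r2   : (M, M) pairwise squared correlations between markers
    Returns the sorted indices of the markers that are kept.
    """
    order = np.argsort(pvalues)  # visit the most significant markers first
    kept = []
    for j in order:
        if pvalues[j] > p_threshold:
            break  # every remaining marker is less significant
        # Keep marker j only if it is not in high LD with a marker already kept.
        if all(ld_r2[j, k] < r2_threshold for k in kept):
            kept.append(int(j))
    return sorted(kept)

def risk_score(genotypes, betas, kept):
    """Weighted sum of allele counts over the kept markers, one score per person."""
    return genotypes[:, kept] @ betas[kept]

# Toy data (hypothetical): 4 markers, markers 0 and 1 in strong LD.
betas = np.array([0.5, 0.4, -0.3, 0.1])
pvalues = np.array([1e-8, 1e-6, 1e-4, 0.5])
ld_r2 = np.eye(4)
ld_r2[0, 1] = ld_r2[1, 0] = 0.9
genotypes = np.array([[0, 1, 2, 1],
                      [2, 2, 0, 0]])

kept = prune_and_threshold(pvalues, ld_r2)
scores = risk_score(genotypes, betas, kept)
```

In this toy example marker 1 is dropped for being in LD with marker 0, and marker 3 is dropped by the p value threshold, so their effect estimates contribute nothing to the score. This is the loss of information that the abstract refers to.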

Volume Number

97

Issue Number

4

Pages

576-92

Document Type

Article

EPub Date

2015/10/03

Status

Faculty

Facility

School of Medicine

Primary Department

Psychiatry

Additional Departments

Molecular Medicine

PMID

26430803

DOI

10.1016/j.ajhg.2015.09.001


Included in

Psychiatry Commons
