Commonly used genomic arrays may lose information due to imperfect coverage of discovered variants for autism spectrum disorder.
Autism spectrum disorder (ASD)
Information Loss
Polygenic scores (PGS)
Journal
Journal of neurodevelopmental disorders
ISSN: 1866-1955
Abbreviated title: J Neurodev Disord
Country: England
NLM ID: 101483832
Publication information
Publication date: 12 Sep 2024
History:
received: 26 Sep 2023
accepted: 29 Aug 2024
medline: 13 Sep 2024
pubmed: 13 Sep 2024
entrez: 12 Sep 2024
Status: epublish
Abstract
Abstract sections
BACKGROUND
Common genetic variation has been shown to account for a large proportion of ASD heritability. Polygenic scores generated for autism spectrum disorder (ASD-PGS) using the most recent discovery data, however, explain less variance than expected, despite reporting significant associations with ASD and other ASD-related traits. Here, we investigate the extent to which information loss on the target study genome-wide microarray weakens the predictive power of the ASD-PGS.
METHODS
We studied genotype data from three cohorts of individuals with high familial liability for ASD: the Early Autism Risk Longitudinal Investigation (EARLI), Markers of Autism Risk in Babies-Learning Early Signs (MARBLES), and the Infant Brain Imaging Study (IBIS), and one population-based sample, the Study to Explore Early Development Phase I (SEED I). Individuals were genotyped on different microarrays ranging from 1 to 5 million sites. Coverage of the top 88 genome-wide suggestive variants implicated in the discovery was evaluated in all four studies before quality control (QC), after QC, and after imputation. We then created a novel method to assess coverage on the resulting ASD-PGS by correlating a PGS informed by a comprehensive list of variants with a PGS informed by only the available variants.
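The coverage metric described above can be illustrated with a minimal sketch. It assumes a dosage matrix in which every variant on the comprehensive list is observed (for example, a reference panel; the abstract does not say which sample the authors used), a vector of per-variant effect weights, and a boolean mask marking which variants remain available in the target study. The function names, the simulated data, and the 80% availability rate are all hypothetical and are not taken from the article.

```python
import numpy as np

def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages: one score per individual."""
    return dosages @ weights

def pgs_coverage(dosages, weights, available):
    """Pearson correlation between the full-list PGS and the PGS
    restricted to the variants flagged as available."""
    full = polygenic_score(dosages, weights)
    reduced = polygenic_score(dosages[:, available], weights[available])
    return float(np.corrcoef(full, reduced)[0, 1])

# Simulated stand-in data (assumption: independent variants, no LD).
rng = np.random.default_rng(0)
n_people, n_variants = 200, 500
dosages = rng.binomial(2, 0.3, size=(n_people, n_variants)).astype(float)
weights = rng.normal(0.0, 0.05, size=n_variants)

# Hypothetical availability mask: roughly 80% of listed variants present.
available = rng.random(n_variants) < 0.8

coverage = pgs_coverage(dosages, weights, available)
all_available = pgs_coverage(dosages, weights, np.ones(n_variants, dtype=bool))
print(f"coverage with missing variants: {coverage:.3f}")
print(f"coverage with every variant:    {all_available:.3f}")
```

A value near 1 would indicate that the reduced score ranks individuals almost as the comprehensive score does. Because the correlation depends on the weights of the missing variants and not only on how many are missing, a larger SNP count need not yield a higher value.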
RESULTS
Prior to imputation, none of the four cohorts directly or indirectly covered all 88 variants among the measured genotype data. After imputation, the two cohorts genotyped on 5-million arrays reached full coverage. Analysis of our novel metric showed generally high genome-wide coverage across all four studies, but a greater number of SNPs informing the ASD-PGS did not result in improved coverage according to our metric.
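The variant-level coverage check can be sketched as a simple set comparison. The abstract does not define "indirectly"; this sketch assumes it means that a genotyped proxy variant (for example, one in high linkage disequilibrium) stands in for a top variant that is not itself on the array. The variant IDs, the proxy map, and the function name are hypothetical.

```python
def variant_coverage(top_variants, genotyped, proxies):
    """Classify each top variant as 'direct' (on the array),
    'indirect' (a listed proxy is on the array), or 'missing'."""
    status = {}
    for variant in top_variants:
        if variant in genotyped:
            status[variant] = "direct"
        elif proxies.get(variant, set()) & genotyped:
            status[variant] = "indirect"
        else:
            status[variant] = "missing"
    return status

# Hypothetical identifiers, for illustration only.
top_variants = ["rsA", "rsB", "rsC", "rsD"]
genotyped = {"rsA", "rsX", "rsY"}
proxies = {"rsB": {"rsX"}, "rsC": {"rsZ"}}  # assumed proxy map

status = variant_coverage(top_variants, genotyped, proxies)
covered = sum(s != "missing" for s in status.values())
fraction = covered / len(top_variants)
print(status, fraction)
```

Running the same check on the variant sets before QC, after QC, and after imputation would give the three coverage figures compared in the study design.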
LIMITATIONS
The studies we analyzed had modest sample sizes. Our analyses included only microarrays with more than 1 million sites, so smaller arrays such as Global Diversity and the PsychArray were not included. Our PGS metric for ASD is generalizable only to samples of European ancestries, though the coverage metric can be computed for traits that have sufficiently large discovery findings in other ancestries.
CONCLUSIONS
We show that commonly used genotyping microarrays have incomplete coverage for common ASD variants, and imputation cannot always recover lost information. Our novel metric provides an intuitive approach to reporting information loss in PGS and an alternative to reporting the total number of SNPs included in the PGS. While applied only to ASD here, this metric can easily be used with other traits.
Identifiers
pubmed: 39266988
doi: 10.1186/s11689-024-09571-8
pii: 10.1186/s11689-024-09571-8
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
54
Copyright information
© 2024. The Author(s).