Improving genetic risk modeling of dementia from real-world data in underrepresented populations.
Journal
Communications biology
ISSN: 2399-3642
Abbreviated title: Commun Biol
Country: England
NLM ID: 101719179
Publication information
Publication date:
25 Aug 2024
History:
received: 08 02 2024
accepted: 16 08 2024
medline: 26 8 2024
pubmed: 26 8 2024
entrez: 25 8 2024
Status:
epublish
Abstract
Genetic risk modeling for dementia offers significant benefits, but studies based on real-world data, particularly for underrepresented populations, are limited. We employ an Elastic Net model for dementia risk prediction using single-nucleotide polymorphisms prioritized by functional genomic data from multiple neurodegenerative disease genome-wide association studies. We compare this model with APOE and polygenic risk score models across genetic ancestry groups (Hispanic Latino American sample: 610 patients with 126 cases; African American sample: 440 patients with 84 cases; East Asian American sample: 673 patients with 75 cases), using electronic health records from UCLA Health for discovery and the All of Us cohort for validation. Our model significantly outperforms other models across multiple ancestries, improving the area under the precision-recall curve by 31-84% (Wilcoxon signed-rank test p-value <0.05) and the area under the receiver operating characteristic curve by 11-17% (DeLong test p-value <0.05) compared to the APOE and the polygenic risk score models. We identify shared and ancestry-specific risk genes and biological pathways, reinforcing and adding to existing knowledge. Our study highlights the benefits of integrating functional mapping, multiple neurodegenerative diseases, and machine learning for genetic risk models in diverse populations. Our findings hold potential for refining precision medicine strategies in dementia diagnosis.
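The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the toy scores are all hypothetical, the area under the precision-recall curve is approximated here by average precision, and the percentage gains in the abstract are assumed to be relative improvements over a baseline model.

```python
"""Toy illustration of AUROC, AUPRC (as average precision), and relative gain.

All names and numbers here are hypothetical; nothing is taken from the study.
"""
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random case outscores a random control (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision at each rank where a case is retrieved.

    Assumes no tied scores; ties would need threshold grouping.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    if hits == 0:
        raise ValueError("need at least one case")
    return total / hits


def relative_improvement(new: float, baseline: float) -> float:
    """Fractional gain of `new` over `baseline` (0.31 means 31%)."""
    return (new - baseline) / baseline


if __name__ == "__main__":
    # Hypothetical toy data: 1 = dementia case, 0 = control.
    labels = [1, 1, 0, 1, 0, 0, 0, 0]
    model_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
    baseline_scores = [0.5, 0.9, 0.8, 0.2, 0.6, 0.1, 0.3, 0.4]

    for name, metric in (("AUROC", auroc), ("AUPRC", average_precision)):
        new = metric(labels, model_scores)
        base = metric(labels, baseline_scores)
        gain = relative_improvement(new, base)
        print(f"{name}: model={new:.3f} baseline={base:.3f} gain={gain:.1%}")
```

With few cases relative to controls, as in the samples described above, the precision-recall metric tends to be the more sensitive of the two, which may be why both are reported.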
Identifiers
pubmed: 39183196
doi: 10.1038/s42003-024-06742-0
pii: 10.1038/s42003-024-06742-0
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
1049
Grants
Agency: U.S. Department of Health & Human Services | National Institutes of Health (NIH)
ID: K08AG065519-01A1
Agency: U.S. Department of Health & Human Services | National Institutes of Health (NIH)
ID: UH2AG083254
Agency: California Department of Public Health (CDPH)
ID: U54NS123746
Copyright information
© 2024. The Author(s).