Protein prediction for trait mapping in diverse populations.
Journal
PloS one
ISSN: 1932-6203
Abbreviated title: PLoS One
Country: United States
NLM ID: 101285081
Publication information
Publication date: 2022
History:
received: 17 Aug 2021
accepted: 08 Feb 2022
entrez: 24 Feb 2022
pubmed: 25 Feb 2022
medline: 16 Mar 2022
Status: epublish
Abstract
Genetically regulated gene expression has helped elucidate the biological mechanisms underlying complex traits. Improved high-throughput technology allows similar interrogation of the genetically regulated proteome for understanding complex trait mechanisms. Here, we used the Trans-omics for Precision Medicine (TOPMed) Multi-omics pilot study, which comprises data from the Multi-Ethnic Study of Atherosclerosis (MESA), to optimize genetic predictors of the plasma proteome for genetically regulated proteome-wide association studies (PWAS) in diverse populations. We built predictive models for protein abundances using data collected in TOPMed MESA, in which 1,305 proteins were measured by a SOMAscan assay. We compared predictive models built via elastic net regression to models integrating posterior inclusion probabilities estimated by fine-mapping SNPs prior to elastic net. To investigate the transferability of predictive models across ancestries, we built protein prediction models in each of the four TOPMed MESA populations: African American (n = 183), Chinese (n = 71), European (n = 416), and Hispanic/Latino (n = 301), as well as in all populations combined. As expected, fine-mapping produced more significant protein prediction models, especially in African ancestries populations, potentially increasing the opportunity for discovery. When we tested our TOPMed MESA models in the independent European INTERVAL study, fine-mapping improved cross-ancestries prediction for some proteins. Using GWAS summary statistics from the Population Architecture using Genomics and Epidemiology (PAGE) study, which comprises ∼50,000 Hispanic/Latinos, African Americans, Asians, Native Hawaiians, and Native Americans, we applied S-PrediXcan to perform PWAS for 28 complex traits. The largest number of protein-trait associations were discovered, colocalized, and replicated in large independent GWAS when the proteome prediction models were trained in populations with ancestries similar to PAGE. At current training population sample sizes, performance between baseline and fine-mapped protein prediction models in PWAS was similar, highlighting the utility of elastic net. Our predictive models in diverse populations are publicly available for use in proteome mapping methods at https://doi.org/10.5281/zenodo.4837327.
Identifiers
pubmed: 35202437
doi: 10.1371/journal.pone.0264341
pii: PONE-D-21-26641
pmc: PMC8870552
Chemical substances
Proteins
0
Proteome
0
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
e0264341
Grants
Agency: NIDDK NIH HHS
ID: P30 DK020595
Country: United States
Agency: NHGRI NIH HHS
ID: R15 HG009569
Country: United States
Conflict of interest statement
The authors have declared that no competing interests exist.