Excalibur: A new ensemble method based on an optimal combination of aggregation tests for rare-variant association testing for sequencing data.


Journal

PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922

Publication information

Publication date:
09 2023
History:
received: 30 01 2023
accepted: 04 09 2023
revised: 26 09 2023
medline: 4 10 2023
pubmed: 14 9 2023
entrez: 14 9 2023
Status: epublish

Abstract

The development of high-throughput next-generation sequencing technologies and large-scale genetic association studies has driven numerous advances in biostatistics. Various aggregation tests, i.e., statistical methods that analyze the association of a trait with multiple markers within a genomic region, have led to a variety of novel discoveries. Despite their usefulness, no single test fits all needs, and each suffers from specific drawbacks. Selecting the right aggregation test when the underlying genetic model of the disease is unknown remains an important challenge. Here we propose a new ensemble method, called Excalibur, based on an optimal combination of 36 aggregation tests, created after an in-depth study of the limitations of each test and their impact on the quality of results. Our findings demonstrate that the method controls type I error and offers the best average power across all scenarios. The proposed method enables novel advances in whole-exome and whole-genome sequencing association studies: it can handle a wide range of association models and provides researchers with an optimal aggregation analysis for the genetic regions of interest.
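
As a rough illustration of what an aggregation test and a combination of tests can look like, the sketch below collapses the rare variants of one region into a per-individual burden score, tests it against case/control status by permutation, and combines several p-values with a Bonferroni-corrected minimum. This is not Excalibur's procedure: the abstract does not specify the 36 tests or the combination rule, so the burden statistic, the min-p rule, and all function names here are assumptions chosen for illustration.

```python
import random


def burden_scores(genotypes):
    """Collapse each individual's minor-allele counts (0/1/2) into one score."""
    return [sum(row) for row in genotypes]


def burden_statistic(scores, phenotypes):
    """Absolute difference in mean burden between cases (1) and controls (0)."""
    cases = [s for s, y in zip(scores, phenotypes) if y == 1]
    controls = [s for s, y in zip(scores, phenotypes) if y == 0]
    return abs(sum(cases) / len(cases) - sum(controls) / len(controls))


def permutation_p_value(genotypes, phenotypes, n_perm=2000, seed=0):
    """Permutation p-value for the burden statistic (illustrative test only)."""
    rng = random.Random(seed)
    scores = burden_scores(genotypes)
    observed = burden_statistic(scores, phenotypes)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any genotype/phenotype link
        if burden_statistic(scores, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def min_p_combination(p_values):
    """Assumed combination rule: Bonferroni-corrected minimum p-value."""
    return min(1.0, min(p_values) * len(p_values))


if __name__ == "__main__":
    # Toy region with 3 rare variants; cases carry more minor alleles.
    genotypes = [[1, 1, 0]] * 10 + [[0, 0, 0]] * 10
    phenotypes = [1] * 10 + [0] * 10
    p_burden = permutation_p_value(genotypes, phenotypes)
    # The other two p-values are placeholders standing in for other tests.
    print(min_p_combination([p_burden, 0.20, 0.50]))
```

A minimum-p rule is only one simple way to hedge against an unknown genetic model; it stays valid under arbitrary dependence between the tests, at the cost of some power.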

Identifiers

pubmed: 37708232
doi: 10.1371/journal.pcbi.1011488
pii: PCOMPBIOL-D-23-00145
pmc: PMC10522036

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

e1011488

Copyright information

Copyright: © 2023 Boutry et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

The authors have declared that no competing interests exist.

Authors

Simon Boutry (S)

Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium.
Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.

Raphaël Helaers (R)

Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium.

Tom Lenaerts (T)

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles-Vrije Universiteit Brussel, Brussels, Belgium.
Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium.
Artificial Intelligence laboratory, Vrije Universiteit Brussel, Brussels, Belgium.

Miikka Vikkula (M)

Human Molecular Genetics, de Duve Institute, University of Louvain, Brussels, Belgium.
WELBIO department, WEL Research Institute, Wavre, Belgium.
