Importance of SNP Dependency Correction and Association Integration for Gene Set Analysis in Genome-Wide Association Studies.

Keywords: gene set analysis; genome-wide association study; linkage disequilibrium correction; single-nucleotide polymorphism; statistical integration

Journal

Frontiers in genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621

Publication information

Publication date:
2021
History:
received: 30 08 2021
accepted: 10 11 2021
entrez: 27 12 2021
pubmed: 28 12 2021
medline: 28 12 2021
Status: epublish

Abstract

A typical genome-wide association study (GWAS) analyzes millions of single-nucleotide polymorphisms (SNPs), several of which lie within the region of the same gene. To conduct gene set analysis (GSA), information from SNPs needs to be unified at the gene level. A widely used practice is to use only the most relevant SNP per gene; however, other integration methods could also be applied. In addition, the nonrandom association of alleles at two or more loci (linkage disequilibrium, LD) is often neglected. Here, we tested how different integration methods and LD correction affect the performance of several GSA methods. Matched normal and breast cancer samples from The Cancer Genome Atlas database were used to evaluate the performance of six GSA algorithms: Coincident Extreme Ranks in Numerical Observations (CERNO), Gene Set Enrichment Analysis (GSEA), GSEA-SNP, improved GSEA for GWAS (i-GSEA4GWAS), Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA), and Over-Representation Analysis (ORA). Association of SNPs with phenotype was calculated using a modified McNemar's test. Results for SNPs mapped to the same gene were integrated using the Fisher and Stouffer methods and compared with the minimum
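
The gene-level integration step named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it shows the textbook, unweighted forms of Fisher's and Stouffer's methods next to a per-gene minimum p-value (assumed here to stand for the "most relevant SNP per gene" practice). The function names, SNP identifiers, and gene labels are hypothetical, and p-values are assumed to lie strictly between 0 and 1.

```python
import math
from collections import defaultdict
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def fisher_combine(pvalues):
    """Fisher's method: X = -2 * sum(ln p) follows chi-square with 2k df under H0."""
    k = len(pvalues)
    half_x = -sum(math.log(p) for p in pvalues)  # X / 2
    # Closed-form chi-square survival function, valid for even df (here 2k).
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


def stouffer_combine(pvalues):
    """Unweighted Stouffer's method: sum the z-scores and rescale by sqrt(k)."""
    z = sum(-_STD_NORMAL.inv_cdf(p) for p in pvalues) / math.sqrt(len(pvalues))
    return _STD_NORMAL.cdf(-z)


def integrate_by_gene(snp_pvalues, snp_to_gene, method):
    """Group SNP p-values by gene and reduce each group with `method`."""
    grouped = defaultdict(list)
    for snp, p in snp_pvalues.items():
        grouped[snp_to_gene[snp]].append(p)
    return {gene: method(ps) for gene, ps in grouped.items()}


# Hypothetical toy input: three SNPs mapped to two genes.
snp_pvalues = {"rs1": 0.01, "rs2": 0.20, "rs3": 0.50}
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B"}

for name, method in [("min", min), ("fisher", fisher_combine), ("stouffer", stouffer_combine)]:
    print(name, integrate_by_gene(snp_pvalues, snp_to_gene, method))
```

Both combination formulas, as written, assume that the SNP-level tests are independent. SNPs in the same gene are often in LD, which violates that assumption and is the motivation for the LD correction the abstract refers to; no correction is applied in this sketch.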

Identifiers

pubmed: 34956320
doi: 10.3389/fgene.2021.767358
pii: 767358
pmc: PMC8696167

Publication types

Journal Article

Languages

eng

Pagination

767358

Copyright information

Copyright © 2021 Marczyk, Macioszek, Tobiasz, Polanska and Zyla.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Michal Marczyk (M)

Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland.
Yale Cancer Center, Yale School of Medicine, New Haven, CT, United States.

Agnieszka Macioszek (A)

Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland.

Joanna Tobiasz (J)

Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland.

Joanna Polanska (J)

Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland.

Joanna Zyla (J)

Department of Data Science and Engineering, Silesian University of Technology, Gliwice, Poland.

MeSH classifications