Cluster effect for SNP-SNP interaction pairs for predicting complex traits.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
12 Aug 2024
History:
received: 08 02 2024
accepted: 01 07 2024
medline: 13 8 2024
pubmed: 13 8 2024
entrez: 12 8 2024
Status: epublish

Abstract

Single nucleotide polymorphism (SNP) interactions are the key to improving polygenic risk scores. Previous studies reported several significant SNP-SNP interaction pairs that shared a common SNP to form a cluster, but some identified pairs might be false positives. This study aims to identify factors associated with the cluster effect of false positivity and to develop strategies to enhance the accuracy of SNP-SNP interaction detection. The results showed that the cluster effect is a major cause of false-positive findings of SNP-SNP interactions. This cluster effect is due to high correlations between a causal pair and null pairs in a cluster. Clusters with a hub SNP that had a significant main effect and a large minor allele frequency (MAF) tended to have a higher false-positive rate. In addition, peripheral null SNPs with a small MAF in a cluster tended to enhance false positivity. We also demonstrated that using the modified significance criterion based on the 3 p-value rules and the bootstrap approach (3pRule + bootstrap) can reduce false positivity and maintain high true positivity. Our results also showed that a pair without a significant main effect tends to have weak or no interaction. This study identified the cluster effect and suggested using the 3pRule + bootstrap approach to enhance SNP-SNP interaction detection accuracy.
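
The correlation between a causal pair and a null pair that share a hub SNP can be illustrated with a small simulation. The sketch below is a toy example, not the authors' simulation design: the additive 0/1/2 genotype coding, the product-term form of the interaction, the MAF values (0.4 for the hub, 0.3 for each peripheral SNP), the sample size, and all function and variable names are assumptions made for illustration. It shows only that two interaction terms sharing a hub SNP are correlated even when the peripheral SNPs are independent of each other.

```python
import random


def simulate_genotypes(maf, n, rng):
    """Draw n genotypes coded as minor-allele counts (0/1/2).

    Each genotype is the sum of two independent allele draws, i.e. the
    SNP is assumed to be in Hardy-Weinberg equilibrium.
    """
    return [(rng.random() < maf) + (rng.random() < maf) for _ in range(n)]


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(2024)  # fixed seed so the run is repeatable
n = 5000                   # hypothetical sample size

# Three mutually independent SNPs (assumed MAFs, chosen for illustration).
hub = simulate_genotypes(0.4, n, rng)             # SNP shared by both pairs
causal_partner = simulate_genotypes(0.3, n, rng)  # peripheral SNP of the causal pair
null_partner = simulate_genotypes(0.3, n, rng)    # peripheral SNP of a null pair

# Product-term interaction variables for the two pairs in the cluster.
causal_term = [h * p for h, p in zip(hub, causal_partner)]
null_term = [h * p for h, p in zip(hub, null_partner)]

r_terms = pearson(causal_term, null_term)
r_snps = pearson(causal_partner, null_partner)

print(f"correlation of peripheral SNPs:    {r_snps:.3f}")
print(f"correlation of interaction terms:  {r_terms:.3f}")
```

Under these assumptions the peripheral SNPs are essentially uncorrelated, while the two interaction terms are clearly positively correlated because both carry the hub genotype. A null pair in the cluster can therefore pick up part of the signal of the causal pair, which is consistent with the mechanism described in the abstract.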

Identifiers

pubmed: 39134575
doi: 10.1038/s41598-024-66311-7
pii: 10.1038/s41598-024-66311-7

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

18677

Grants

Agency: U.S. Department of Defense
ID: PC220560

Copyright information

© 2024. The Author(s).

Authors

Hui-Yi Lin (HY)

Biostatistics and Data Science Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA, 70112, USA. hlin1@lsuhsc.edu.

Harun Mazumder (H)

Biostatistics and Data Science Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA, 70112, USA.

Indrani Sarkar (I)

Biostatistics and Data Science Program, School of Public Health, Louisiana State University Health Sciences Center, New Orleans, LA, 70112, USA.

Po-Yu Huang (PY)

Information and Communications Research Laboratories, Industrial Technology Research Institute, Hsinchu, Taiwan.

Rosalind A Eeles (RA)

The Institute of Cancer Research, London, SM2 5NG, UK.
Royal Marsden NHS Foundation Trust, London, SW3 6JJ, UK.

Zsofia Kote-Jarai (Z)

The Institute of Cancer Research, London, SM2 5NG, UK.

Kenneth R Muir (KR)

Division of Population Health, Health Services Research and Primary Care, University of Manchester, Oxford Road, Manchester, M13 9PL, UK.

Johanna Schleutker (J)

Institute of Biomedicine, University of Turku, Turku, Finland.
Department of Medical Genetics, Genomics, Laboratory Division, Turku University Hospital, PO Box 52, 20521, Turku, Finland.

Nora Pashayan (N)

Department of Applied Health Research, University College London, London, WC1E 7HB, UK.
Centre for Cancer Genetic Epidemiology, Department of Oncology, University of Cambridge, Strangeways Laboratory, Worts Causeway, Cambridge, CB1 8RN, UK.

Jyotsna Batra (J)

Australian Prostate Cancer Research Centre-Qld, Institute of Health and Biomedical Innovation and School of Biomedical Science, Queensland University of Technology, Brisbane, QLD, 4059, Australia.
Translational Research Institute, Brisbane, QLD, 4102, Australia.

David E Neal (DE)

Nuffield Department of Surgical Sciences, University of Oxford, John Radcliffe Hospital, Room 6603, Level 6, Headley Way, Headington, Oxford, OX3 9DU, UK.
Department of Oncology, University of Cambridge, Addenbrooke's Hospital, Hills Road, Box 279, Cambridge, CB2 0QQ, UK.
Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Cambridge, CB2 0RE, UK.

Sune F Nielsen (SF)

Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark.
Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, 2200, Copenhagen, Denmark.

Børge G Nordestgaard (BG)

Faculty of Health and Medical Sciences, University of Copenhagen, 2200, Copenhagen, Denmark.
Department of Clinical Biochemistry, Herlev and Gentofte Hospital, Copenhagen University Hospital, Herlev, 2200, Copenhagen, Denmark.

Henrik Grönberg (H)

Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 171 77, Stockholm, Sweden.

Fredrik Wiklund (F)

Department of Medical Epidemiology and Biostatistics, Karolinska Institute, 171 77, Stockholm, Sweden.

Robert J MacInnis (RJ)

Cancer Epidemiology Division, Cancer Council Victoria, 200 Victoria Parade, East Melbourne, 3002, Australia.
Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, The University of Melbourne, Grattan Street, Parkville, VIC, 3010, Australia.

Christopher A Haiman (CA)

Center for Genetic Epidemiology, Department of Preventive Medicine, Keck School of Medicine, University of Southern California/Norris Comprehensive Cancer Center, Los Angeles, CA, 90015, USA.

Ruth C Travis (RC)

Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford, OX3 7LF, UK.

Janet L Stanford (JL)

Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, 98109-1024, USA.
Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA, 98195, USA.

Adam S Kibel (AS)

Division of Urologic Surgery, Brigham and Women's Hospital, 75 Francis Street, Boston, MA, 02115, USA.

Cezary Cybulski (C)

International Hereditary Cancer Center, Department of Genetics and Pathology, Pomeranian Medical University, 70-115, Szczecin, Poland.

Kay-Tee Khaw (KT)

Clinical Gerontology Unit, University of Cambridge, Cambridge, CB2 2QQ, UK.

Christiane Maier (C)

Humangenetik Tuebingen, Paul-Ehrlich-Str 23, 72076, Tuebingen, Germany.

Stephen N Thibodeau (SN)

Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, 55905, USA.

Manuel R Teixeira (MR)

Department of Laboratory Genetics, Portuguese Oncology Institute of Porto (IPO Porto)/Porto Comprehensive Cancer Center, Porto, Portugal.
Cancer Genetics Group, IPO Porto Research Center (CI-IPOP)/RISE@CI-IPOP (Health Research Network), Portuguese Oncology Institute of Porto (IPO Porto)/Porto Comprehensive Cancer Center, Porto, Portugal.
School of Medicine and Biomedical Sciences (ICBAS), University of Porto, Porto, Portugal.

Lisa Cannon-Albright (L)

Division of Epidemiology, Department of Internal Medicine, University of Utah School of Medicine, Salt Lake City, UT, 84132, USA.
George E. Wahlen Department of Veterans Affairs Medical Center, Salt Lake City, UT, 84148, USA.

Hermann Brenner (H)

Division of Clinical Epidemiology and Aging Research, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany.
German Cancer Consortium (DKTK), German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany.
Division of Preventive Oncology, German Cancer Research Center (DKFZ) and National Center for Tumor Diseases (NCT), Im Neuenheimer Feld 460, 69120, Heidelberg, Germany.

Radka Kaneva (R)

Molecular Medicine Center, Department of Medical Chemistry and Biochemistry, Medical University of Sofia, Sofia, 2 Zdrave Str., 1431, Sofia, Bulgaria.

Hardev Pandha (H)

The University of Surrey, Guildford, Surrey, GU2 7XH, UK.

Jong Y Park (JY)

Department of Cancer Epidemiology, Moffitt Cancer Center, 12902 Magnolia Drive, Tampa, FL, 33612, USA.
