Bayesian copy number detection and association in large-scale studies.


Journal

BMC cancer
ISSN: 1471-2407
Abbreviated title: BMC Cancer
Country: England
NLM ID: 100967800

Publication information

Publication date:
07 Sep 2020
History:
received: 21 Feb 2020
accepted: 17 Aug 2020
entrez: 07 Sep 2020
pubmed: 08 Sep 2020
medline: 20 Apr 2021
Status: epublish

Abstract

Germline copy number variants (CNVs) increase risk for many diseases, yet detection of CNVs and quantifying their contribution to disease risk in large-scale studies is challenging due to biological and technical sources of heterogeneity that vary across the genome within and between samples. We developed an approach called CNPBayes to identify latent batch effects in genome-wide association studies involving copy number, to provide probabilistic estimates of integer copy number across the estimated batches, and to fully integrate the copy number uncertainty in the association model for disease. Applying a hidden Markov model (HMM) to identify CNVs in a large multi-site Pancreatic Cancer Case Control study (PanC4) of 7598 participants, we found CNV inference was highly sensitive to technical noise that varied appreciably among participants. Applying CNPBayes to this dataset, we found that the major sources of technical variation were linked to sample processing by the centralized laboratory and not the individual study sites. Modeling the latent batch effects at each CNV region hierarchically, we developed probabilistic estimates of copy number that were directly incorporated in a Bayesian regression model for pancreatic cancer risk. Candidate associations aided by this approach include deletions of 8q24 near regulatory elements of the tumor oncogene MYC and of Tumor Suppressor Candidate 3 (TUSC3). Laboratory effects may not account for the major sources of technical variation in genome-wide association studies. This study provides a robust Bayesian inferential framework for identifying latent batch effects, estimating copy number, and evaluating the role of copy number in heritable diseases.
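The idea of batch-aware probabilistic copy number estimates can be illustrated with a minimal sketch. This is not the CNPBayes implementation or its API. It assumes a simple Gaussian mixture in which each batch has its own component means and standard deviations, and all parameter values, batch labels, and function names below are hypothetical, chosen only for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def copy_number_posterior(log_ratio, batch, means, sds, weights):
    """Posterior probability of each copy number state for one sample.

    means[batch][k] and sds[batch][k] are batch-specific component
    parameters; weights[k] are mixing proportions shared across batches.
    All of these are assumed known here (hypothetical values).
    """
    likelihoods = [
        w * normal_pdf(log_ratio, m, s)
        for w, m, s in zip(weights, means[batch], sds[batch])
    ]
    total = sum(likelihoods)
    return [lik / total for lik in likelihoods]


def expected_copy_number(posterior, states):
    """Posterior mean copy number, a simple summary that keeps uncertainty."""
    return sum(p * c for p, c in zip(posterior, states))


# Hypothetical parameters for a deletion region with copy number 0, 1, or 2.
STATES = [0, 1, 2]
WEIGHTS = [0.01, 0.09, 0.90]
MEANS = {"A": [-2.0, -0.5, 0.0], "B": [-1.8, -0.3, 0.2]}
SDS = {"A": [0.3, 0.15, 0.1], "B": [0.3, 0.15, 0.1]}

# The same observed log ratio is interpreted differently in each batch.
post_a = copy_number_posterior(-0.3, "A", MEANS, SDS, WEIGHTS)
post_b = copy_number_posterior(-0.3, "B", MEANS, SDS, WEIGHTS)
```

In this sketch the same measurement yields a confident hemizygous-deletion call in batch B and a less certain one in batch A, which is why ignoring batch can distort copy number calls. Passing the posterior probabilities (or the posterior mean) to a downstream risk model, rather than a hard call, is one simplified way to carry that uncertainty forward; the abstract describes a fuller integration within a Bayesian regression model.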

Identifiers

pubmed: 32894098
doi: 10.1186/s12885-020-07304-3
pii: 10.1186/s12885-020-07304-3
pmc: PMC7487704

Chemical substances

MYC protein, human 0
Membrane Proteins 0
Proto-Oncogene Proteins c-myc 0
TUSC3 protein, human 0
Tumor Suppressor Proteins 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

856

Grants

Agency: NCI NIH HHS
ID: P30 CA008748
Country: United States
Agency: NCI NIH HHS
ID: P50 CA062924
Country: United States
Agency: NCI NIH HHS
ID: R01 CA154823
Country: United States
Agency: NCATS NIH HHS
ID: UL1 TR001863
Country: United States

Authors

Stephen Cristiano (S)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

David McKean (D)

Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA.

Jacob Carey (J)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Paige Bracci (P)

Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, USA.

Paul Brennan (P)

Genetics Section, International Agency for Research on Cancer, Lyon, France.

Michael Chou (M)

Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Mengmeng Du (M)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.

Steven Gallinger (S)

Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, M5G 1X5, Ontario, Canada.

Michael G Goggins (MG)

Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
Department of Pathology, Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins School of Medicine, Baltimore, MD, USA.

Manal M Hassan (MM)

Department of Epidemiology, Cancer Prevention & Population Sciences, UT MD Anderson Cancer Center, Houston, 77030, TX, USA.

Rayjean J Hung (RJ)

Lunenfeld-Tanenbaum Research Institute of Mount Sinai Hospital, Toronto, M5G 1X5, Ontario, Canada.

Robert C Kurtz (RC)

Department of Gastroenterology, Hepatology, and Nutrition Service, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.

Donghui Li (D)

Department of Gastrointestinal Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, 77030, TX, USA.

Lingeng Lu (L)

Department of Chronic Disease Epidemiology, Yale School of Public Health, Yale Cancer Center, New Haven, CT, USA.

Rachel Neale (R)

Population Health Department, QIMR Berghofer Medical Research Institute, Brisbane, 4029, Australia.

Sara Olson (S)

Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, New York, 10065, NY, USA.

Gloria Petersen (G)

Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, 55905, MN, USA.

Kari G Rabe (KG)

Department of Health Sciences Research, Mayo Clinic College of Medicine, Rochester, 55905, MN, USA.

Jack Fu (J)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Harvey Risch (H)

Department of Chronic Disease Epidemiology, Yale School of Public Health, Yale Cancer Center, New Haven, CT, USA.

Gary L Rosner (GL)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.
Department of Epidemiology, Cancer Prevention & Population Sciences, UT MD Anderson Cancer Center, Houston, 77030, TX, USA.

Ingo Ruczinski (I)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA.

Alison P Klein (AP)

Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA. aklein1@jhmi.edu.
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA. aklein1@jhmi.edu.
Department of Pathology, Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins School of Medicine, Baltimore, MD, USA. aklein1@jhmi.edu.

Robert B Scharpf (RB)

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA. rscharpf@jhu.edu.
Department of Oncology, The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA. rscharpf@jhu.edu.
