CopyDetective: Detection threshold-aware copy number variant calling in whole-exome sequencing data.

Keywords: cell fraction; copy number variant; polymorphism

Journal

GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872

Publication information

Publication date:
02 11 2020
History:
received: 27 05 2020
revised: 17 08 2020
accepted: 02 10 2020
entrez: 2 11 2020
pubmed: 3 11 2020
medline: 26 10 2021
Status: ppublish

Abstract

Copy number variants (CNVs) are known to play an important role in the development and progression of several diseases. However, detecting CNVs in whole-exome sequencing (WES) experiments is challenging, and additional experiments usually have to be performed. We developed a novel algorithm, "CopyDetective", for somatic CNV calling in matched WES data. Unlike other approaches, CNV calling with CopyDetective is a 2-step procedure: first, a quality analysis determines individual detection thresholds for every sample; second, the actual CNV calling is performed on the basis of these thresholds. The algorithm evaluates the change in variant allele frequency of polymorphisms and reports the fraction of affected cells for every CNV. Analyzing 4 WES data sets (n = 100), we observed superior performance of CopyDetective compared with ExomeCNV, VarScan2, ControlFREEC, ExomeDepth, and CNV-seq. Individual detection thresholds reveal that not every WES data set is equally apt for CNV calling. Initial quality analyses that determine individual detection thresholds, as realized by CopyDetective, can and should be performed prior to actual variant calling.
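The link between a shift in variant allele frequency (VAF) and the fraction of affected cells can be illustrated with a simple mixture model. The Python sketch below is not CopyDetective's implementation and is not taken from the paper. It assumes a heterozygous germline polymorphism, a diploid background, single-copy events, and noise-free VAF estimates; the function names and event labels are hypothetical.

```python
"""Toy mixture model linking a VAF shift at a heterozygous SNP to the
fraction of cells carrying a CNV. Illustrative only; not CopyDetective code."""

EVENTS = ("deletion", "duplication", "cn_loh")  # hypothetical event labels


def expected_vaf(cell_fraction: float, event: str) -> float:
    """Major-allele VAF expected when `cell_fraction` of cells carry `event`."""
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("cell_fraction must lie in [0, 1]")
    f = cell_fraction
    if event == "deletion":      # affected cells keep 1 of 2 copies
        return 1.0 / (2.0 - f)
    if event == "duplication":   # affected cells carry 2 + 1 copies
        return (1.0 + f) / (2.0 + f)
    if event == "cn_loh":        # affected cells carry 2 copies of one allele
        return (1.0 + f) / 2.0
    raise ValueError(f"unknown event: {event!r}")


def cell_fraction_from_vaf(vaf: float, event: str) -> float:
    """Invert `expected_vaf`: estimate the affected cell fraction from a VAF."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie in [0, 1]")
    v = max(vaf, 1.0 - vaf)      # fold onto the major allele, so v >= 0.5
    if event == "deletion":
        f = 2.0 - 1.0 / v
    elif event == "duplication":
        # A single-copy gain cannot push the VAF above 2/3 in this model.
        f = 1.0 if v >= 2.0 / 3.0 else (2.0 * v - 1.0) / (1.0 - v)
    elif event == "cn_loh":
        f = 2.0 * v - 1.0
    else:
        raise ValueError(f"unknown event: {event!r}")
    return min(max(f, 0.0), 1.0)  # clamp to a valid fraction


if __name__ == "__main__":
    # A SNP at VAF 0.5 in the control and 0.625 in the tumor sample.
    print(round(cell_fraction_from_vaf(0.625, "deletion"), 3))
```

In this toy model, small cell fractions produce VAF shifts of only a few percentage points, which is one way to see why per-sample detection thresholds matter: a noisy sample cannot resolve shifts that a clean sample can.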

Identifiers

pubmed: 33135740
pii: 5949275
doi: 10.1093/gigascience/giaa118
pmc: PMC7604644

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2020. Published by Oxford University Press GigaScience.

Authors

Sarah Sandmann (S)

Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Building A11, Münster 48149, Germany.

Marius Wöste (M)

Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Building A11, Münster 48149, Germany.

Aniek O de Graaf (AO)

Laboratory Hematology, RadboudUMC, Geert Grooteplein Zuid 10, Nijmegen 6525 GA, Netherlands.

Birgit Burkhardt (B)

Paediatric Hematology & Oncology, University Hospital Münster, Albert-Schweitzer-Campus 1, Building A1, Münster 48149, Germany.

Joop H Jansen (JH)

Laboratory Hematology, RadboudUMC, Geert Grooteplein Zuid 10, Nijmegen 6525 GA, Netherlands.

Martin Dugas (M)

Institute of Medical Informatics, University of Münster, Albert-Schweitzer-Campus 1, Building A11, Münster 48149, Germany.
