CopyDetective: Detection threshold-aware copy number variant calling in whole-exome sequencing data.
cell fraction
copy number variant
polymorphism
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date: 2020-11-02
History:
received: 2020-05-27
revised: 2020-08-17
accepted: 2020-10-02
entrez: 2020-11-02
pubmed: 2020-11-03
medline: 2021-10-26
Status: ppublish
Abstract
BACKGROUND
Copy number variants (CNVs) are known to play an important role in the development and progression of several diseases. However, detecting CNVs from whole-exome sequencing (WES) data is challenging, and additional experiments usually have to be performed.
FINDINGS
We developed CopyDetective, a novel algorithm for somatic CNV calling in matched WES data. Unlike other approaches, CNV calling with CopyDetective is a 2-step procedure. First, a quality analysis determines individual detection thresholds for every sample. Second, CNVs are called on the basis of these previously determined thresholds. The algorithm evaluates the change in variant allele frequency of polymorphisms and reports the fraction of affected cells for every CNV. Analyzing 4 WES data sets (n = 100), we observed superior performance of CopyDetective compared with ExomeCNV, VarScan2, ControlFREEC, ExomeDepth, and CNV-seq.
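To illustrate how a change in variant allele frequency (VAF) can be tied to a fraction of affected cells, the following is a minimal sketch and not CopyDetective's implementation. It assumes a heterozygous germline polymorphism in a diploid region, a single-copy deletion or duplication, and no sequencing noise or allelic bias. The function names `expected_vaf` and `cell_fraction_from_vaf` are hypothetical and introduced here only for illustration.

```python
def expected_vaf(cell_fraction, event, variant_allele_affected):
    """Expected VAF of a heterozygous SNP when a fraction of cells carries a CNV.

    Assumptions (illustrative only): diploid baseline, one copy lost or
    gained, and the remaining cells are unaffected (VAF 0.5).
    """
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("cell_fraction must lie in [0, 1]")
    signs = {"deletion": -1.0, "duplication": 1.0}
    if event not in signs:
        raise ValueError("event must be 'deletion' or 'duplication'")
    delta = signs[event] * cell_fraction  # average copy change per cell
    variant_copies = 1.0 + delta if variant_allele_affected else 1.0
    total_copies = 2.0 + delta
    return variant_copies / total_copies


def cell_fraction_from_vaf(vaf, event):
    """Invert expected_vaf: estimate the affected cell fraction from a VAF.

    Works on the frequency of the more abundant allele, so it does not
    matter whether the variant or the reference allele was affected.
    """
    if not 0.0 < vaf < 1.0:
        raise ValueError("vaf must lie strictly between 0 and 1")
    major = max(vaf, 1.0 - vaf)
    if event == "deletion":
        # retained allele: major = 1 / (2 - f)
        fraction = 2.0 - 1.0 / major
    elif event == "duplication":
        # gained allele: major = (1 + f) / (2 + f)
        fraction = (2.0 * major - 1.0) / (1.0 - major)
    else:
        raise ValueError("event must be 'deletion' or 'duplication'")
    return min(fraction, 1.0)  # cap values pushed above 1 by extreme VAFs


if __name__ == "__main__":
    vaf = expected_vaf(0.5, "deletion", variant_allele_affected=True)
    print(round(vaf, 4), round(cell_fraction_from_vaf(vaf, "deletion"), 4))
```

Under these assumptions, a deletion present in half of the cells shifts a polymorphism from a VAF of 0.5 to about 0.33. Smaller cell fractions produce smaller shifts, which suggests why the data quality of a sample would limit the CNVs that can be detected in it.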
CONCLUSIONS
Individual detection thresholds reveal that not every WES data set is equally suited to CNV calling. Initial quality analyses that determine individual detection thresholds, as realized by CopyDetective, can and should be performed prior to actual variant calling.
Identifiers
pubmed: 33135740
pii: 5949275
doi: 10.1093/gigascience/giaa118
pmc: PMC7604644
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2020. Published by Oxford University Press GigaScience.