Assessing the performance of methods for copy number aberration detection from single-cell DNA sequencing data.
Algorithms
Chromosome Aberrations
Computational Biology
Computer Simulation
DNA Copy Number Variations
Gene Dosage
Genome, Human
Humans
Mutation
Neoplasms / genetics
Ploidies
Poisson Distribution
ROC Curve
Reproducibility of Results
Sequence Analysis, DNA / methods
Single-Cell Analysis / methods
Software
Journal
PLoS computational biology
ISSN: 1553-7358
Abbreviated title: PLoS Comput Biol
Country: United States
NLM ID: 101238922
Publication information
Publication date: 07 2020
History:
received: 2019-11-07
accepted: 2020-06-03
revised: 2020-07-23
pubmed: 2020-07-14
medline: 2020-09-04
entrez: 2020-07-14
Status:
epublish
Abstract
Single-cell DNA sequencing technologies are enabling the study of mutations and their evolutionary trajectories in cancer. Somatic copy number aberrations (CNAs) have been implicated in the development and progression of various types of cancer. A wide array of methods for CNA detection has been either developed specifically for or adapted to single-cell DNA sequencing data. Understanding the strengths and limitations unique to each of these methods is essential for obtaining accurate copy number profiles from single-cell DNA sequencing data. We benchmarked three widely used methods (Ginkgo, HMMcopy, and CopyNumber) on simulated as well as real datasets. To facilitate this, we developed a novel simulator of single-cell genome evolution in the presence of CNAs. Furthermore, to assess performance on empirical data where the ground truth is unknown, we introduce a phylogeny-based measure for identifying potentially erroneous inferences. While single-cell DNA sequencing is very promising for elucidating and understanding CNAs, our findings show that even the best existing method does not exceed 80% accuracy. New methods that significantly improve upon the accuracy of these three methods are needed. Furthermore, given the large datasets being generated, such methods must be computationally efficient.
Identifiers
pubmed: 32658894
doi: 10.1371/journal.pcbi.1008012
pii: PCOMPBIOL-D-19-01949
pmc: PMC7377518
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
e1008012
Conflict of interest statement
The authors have declared that no competing interests exist.