Identification of single nucleotide variants using position-specific error estimation in deep sequencing data.
Cancer genomics
Deep sequencing
Error correction
Ion Torrent
Liquid biopsies
Next generation sequencing (NGS)
Targeted sequencing
Variant calling
Journal
BMC Medical Genomics
ISSN: 1755-8794
Abbreviated title: BMC Med Genomics
Country: England
NLM ID: 101319628
Publication information
Publication date: 02 08 2019
History:
received: 21 11 2018
accepted: 15 07 2019
entrez: 4 8 2019
pubmed: 4 8 2019
medline: 31 1 2020
Status: epublish
Abstract
BACKGROUND
Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs).
METHODS
To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g. with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection.
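The idea of testing a candidate variant against position-specific background noise can be illustrated with a minimal sketch. This is not the AmpliSolve implementation: the function names, the pooled rate estimate, the pseudocount and the significance threshold `alpha` are all assumptions made for illustration. The sketch estimates an error rate from normal samples for one (position, strand, nucleotide) combination and then asks how surprising the observed alternative read count is under a Poisson model with that rate.

```python
import math


def estimate_noise_rate(normal_counts, pseudocount=1.0):
    """Pooled error rate for ONE (position, strand, nucleotide) combination.

    normal_counts: list of (alt_reads, depth) pairs, one per normal sample.
    The pseudocount is an illustrative assumption; it keeps the rate above
    zero when no normal sample shows the alternative base.
    """
    total_alt = sum(alt for alt, _ in normal_counts)
    total_depth = sum(depth for _, depth in normal_counts)
    if total_depth == 0:
        raise ValueError("no coverage in the normal samples")
    return (total_alt + pseudocount) / total_depth


def poisson_upper_tail(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1).

    Simple summation: adequate for a sketch, but it loses precision for
    p-values below about 1e-15 and underflows for very large lam.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def evaluate_candidate(alt_reads, depth, noise_rate, alpha=0.01):
    """Return (p_value, is_called) for one strand of one candidate SNV."""
    expected_noise_reads = noise_rate * depth
    p_value = poisson_upper_tail(alt_reads, expected_noise_reads)
    return p_value, p_value < alpha
```

In this sketch a position would be evaluated separately for each strand and each alternative nucleotide. How the per-strand results are combined into a final call is left open here.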
RESULTS
Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAF below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments.
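The quantities named in these results (variant allele frequency, precision and sensitivity) follow standard definitions, shown below as a small sketch. The helper names and the variant labels in the usage note are hypothetical and do not come from the study.

```python
def variant_allele_frequency(alt_reads, depth):
    """Fraction of reads at a position that support the variant allele."""
    return alt_reads / depth if depth else 0.0


def precision_sensitivity(called, truth):
    """Compare called variants with a reference set (e.g. orthogonal assay).

    precision   = true positives / all calls
    sensitivity = true positives / all true variants
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    sensitivity = true_positives / len(truth) if truth else 0.0
    return precision, sensitivity
```

For example, `variant_allele_frequency(50, 5000)` corresponds to a VAF of 1%, and `precision_sensitivity({"v1", "v2", "v3"}, {"v2", "v3", "v4", "v5"})` counts two true positives out of three calls and four true variants.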
CONCLUSIONS
AmpliSolve is a new tool for in-silico estimation of background noise and for detection of low-frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform, it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve
Identifiers
pubmed: 31375105
doi: 10.1186/s12920-019-0557-9
pii: 10.1186/s12920-019-0557-9
pmc: PMC6679440
Chemical substances
Circulating Tumor DNA
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
115
Grants
Agency: Cancer Research UK
ID: A13239
Country: United Kingdom
Agency: Prostate Cancer UK
ID: PG12-49
Country: United Kingdom
Agency: Wellcome Trust
ID: 105104/Z/14/
Country: United Kingdom