Identification of single nucleotide variants using position-specific error estimation in deep sequencing data.

Cancer genomics; Deep sequencing; Error correction; Ion Torrent; Liquid biopsies; Next generation sequencing (NGS); Targeted sequencing; Variant calling

Journal

BMC medical genomics
ISSN: 1755-8794
Abbreviated title: BMC Med Genomics
Country: England
NLM ID: 101319628

Publication information

Publication date:
2 August 2019
History:
received: 21 November 2018
accepted: 15 July 2019
entrez: 4 August 2019
pubmed: 4 August 2019
medline: 31 January 2020
Status: epublish

Abstract

Targeted deep sequencing is a highly effective technology to identify known and novel single nucleotide variants (SNVs) with many applications in translational medicine, disease monitoring and cancer profiling. However, identification of SNVs using deep sequencing data is a challenging computational problem as different sequencing artifacts limit the analytical sensitivity of SNV detection, especially at low variant allele frequencies (VAFs). To address the problem of relatively high noise levels in amplicon-based deep sequencing data (e.g., with the Ion AmpliSeq technology) in the context of SNV calling, we have developed a new bioinformatics tool called AmpliSolve. AmpliSolve uses a set of normal samples to model position-specific, strand-specific and nucleotide-specific background artifacts (noise), and deploys a Poisson model-based statistical framework for SNV detection. Our tests on both synthetic and real data indicate that AmpliSolve achieves a good trade-off between precision and sensitivity, even at VAFs below 5% and as low as 1%. We further validate AmpliSolve by applying it to the detection of SNVs in 96 circulating tumor DNA samples at three clinically relevant genomic positions and compare the results to digital droplet PCR experiments. AmpliSolve is a new tool for in silico estimation of background noise and for detection of low-frequency SNVs in targeted deep sequencing data. Although AmpliSolve has been specifically designed for and tested on amplicon-based libraries sequenced with the Ion Torrent platform, it can, in principle, be applied to other sequencing platforms as well. AmpliSolve is freely available at https://github.com/dkleftogi/AmpliSolve.
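
The abstract describes the approach only at a high level: background noise is estimated from normal samples per position, strand and nucleotide, and a Poisson model is then used to decide whether an observed variant count exceeds that noise. The following is a minimal sketch of that general idea, not AmpliSolve's actual implementation. The function names, the input tuple layout, the `pseudo_rate` floor and the `alpha` threshold are all illustrative assumptions that do not come from the record above.

```python
import math
from collections import defaultdict


def estimate_noise(normal_counts, pseudo_rate=1e-5):
    """Estimate a background error rate per (position, strand, alt_base).

    normal_counts: iterable of (position, strand, alt_base, alt_count, depth)
    tuples pooled over normal samples (hypothetical input layout).
    pseudo_rate: assumed floor so that sites with zero observed errors
    do not get a rate of exactly zero.
    """
    alt_total = defaultdict(int)
    depth_total = defaultdict(int)
    for position, strand, alt_base, alt_count, depth in normal_counts:
        key = (position, strand, alt_base)
        alt_total[key] += alt_count
        depth_total[key] += depth
    return {
        key: max(alt_total[key] / depth_total[key], pseudo_rate)
        for key in depth_total
        if depth_total[key] > 0
    }


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), by summing the lower tail.

    Adequate for the modest expected counts of this sketch; exp(-lam)
    underflows for very large lam, where a library routine would be safer.
    """
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i  # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def call_snv(alt_count, depth, error_rate, alpha=1e-3):
    """Test whether alt_count exceeds the noise expected at this depth.

    Under the null hypothesis the variant reads are errors only, so their
    expected number is error_rate * depth. Returns (p_value, is_called).
    """
    expected_errors = error_rate * depth
    p_value = poisson_sf(alt_count, expected_errors)
    return p_value, p_value < alpha


# Usage: three normal samples at one site, then two tumor observations.
normals = [
    (100, "+", "T", 2, 2000),
    (100, "+", "T", 1, 1000),
    (100, "+", "T", 3, 3000),
]
noise = estimate_noise(normals)
rate = noise[(100, "+", "T")]
strong = call_snv(alt_count=20, depth=2000, error_rate=rate)  # VAF of 1%
weak = call_snv(alt_count=3, depth=2000, error_rate=rate)
```

In this toy example the pooled error rate is 6/6000, so 2 error reads are expected at a depth of 2000. Twenty variant reads are far above that expectation and pass the threshold, while three are consistent with noise and do not. A real caller would also have to combine evidence across strands and correct for testing many positions, which this sketch leaves out.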

Identifiers

pubmed: 31375105
doi: 10.1186/s12920-019-0557-9
pii: 10.1186/s12920-019-0557-9
pmc: PMC6679440

Chemical substances

Circulating Tumor DNA 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

115

Grants

Agency: Cancer Research UK
ID: A13239
Country: United Kingdom
Agency: Prostate Cancer UK
ID: PG12-49
Country: United Kingdom
Agency: Wellcome Trust
ID: 105104/Z/14/
Country: United Kingdom

Authors

Dimitrios Kleftogiannis (D)

Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK.
Present address: Genome Institute of Singapore (GIS), Agency of Science Research and Technology (A*STAR), Singapore, 138672, Singapore.

Marco Punta (M)

Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK.

Anuradha Jayaram (A)

UCL Cancer Institute, University College London, London, UK.

Shahneen Sandhu (S)

Peter MacCallum Cancer Centre and University of Melbourne, Melbourne, Victoria, Australia.

Stephen Q Wong (SQ)

Peter MacCallum Cancer Centre and University of Melbourne, Melbourne, Victoria, Australia.

Delila Gasi Tandefelt (D)

Department of Urology, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden.

Vincenza Conteduca (V)

Department of Medical Oncology, Istituto Scientifico Romagnolo per lo Studio e la Cura dei Tumori (IRST) IRCCS, 47014, Meldola, Italy.

Daniel Wetterskog (D)

UCL Cancer Institute, University College London, London, UK.

Gerhardt Attard (G)

UCL Cancer Institute, University College London, London, UK. g.attard@ucl.ac.uk.

Stefano Lise (S)

Centre for Evolution and Cancer, The Institute of Cancer Research, London, UK. Stefano.Lise@icr.ac.uk.
