Needlestack: an ultra-sensitive variant caller for multi-sample next generation sequencing data.

Journal

NAR genomics and bioinformatics

ISSN: 2631-9268

Abbreviated title: NAR Genom Bioinform

Country: England

NLM ID: 101756213

Publication information

Publication date:
Jun 2020

History:

received: 06 06 2019

revised: 28 01 2020

accepted: 16 04 2020

entrez: 5 5 2020

pubmed: 5 5 2020

medline: 5 5 2020

Status: ppublish

Abstract

The emergence of next-generation sequencing (NGS) has revolutionized how genome sequences are obtained, with the promise of a comprehensive characterization of DNA variations. Nevertheless, detecting somatic mutations remains difficult, in particular when trying to identify low-abundance mutations such as subclonal mutations, tumour-derived alterations in body fluids or somatic mutations from histologically normal tissue. The main challenge is to distinguish precisely between sequencing artefacts and true mutations, particularly when the latter are so rare that they reach abundance levels similar to those of artefacts. Here, we present needlestack, a highly sensitive variant caller that learns the level of systematic sequencing errors directly from the data in order to call mutations accurately. Needlestack is based on the idea that the sequencing error rate can be dynamically estimated by analysing multiple samples together. We show that the sequencing error rate varies across alterations, illustrating the need to estimate it precisely. We evaluate the performance of needlestack for various types of variations, and we show that needlestack is robust across positions and outperforms existing state-of-the-art methods for low-abundance mutations. Needlestack, along with its source code, is freely available on the GitHub platform: https://github.com/IARCbioinfo/needlestack.
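
The idea stated in the abstract, estimating the sequencing error rate from many samples at once and then looking for samples that exceed it, can be illustrated with a minimal sketch. This is not needlestack's actual statistical model, which the abstract does not detail. The sketch swaps in a much simpler technique: the median alternate-allele fraction across samples serves as a robust error-rate estimate, and an exact binomial upper-tail test flags outlier samples. The function names, the `alpha` threshold and the `floor` clamp are hypothetical choices made for illustration, and all depths are assumed to be positive.

```python
from math import exp, lgamma, log, log1p
from statistics import median


def binom_upper_tail(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summing the pmf in log space.

    Assumes 0 < p < 1 and 0 <= k <= n.
    """
    total = 0.0
    for i in range(k, n + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_pmf)
    return min(1.0, total)


def flag_outlier_samples(alt_counts, depths, alpha=1e-3, floor=1e-6):
    """Flag samples whose alternate read count exceeds the shared error rate.

    Illustrative stand-in, NOT the published needlestack model:
    the error rate at one position and one alteration is taken as the
    median alternate fraction across samples (assuming most samples do
    not carry the mutation), clamped to (floor, 1 - floor).
    """
    fractions = [a / d for a, d in zip(alt_counts, depths)]
    error_rate = min(max(median(fractions), floor), 1.0 - floor)
    flagged = []
    for idx, (a, d) in enumerate(zip(alt_counts, depths)):
        # Small tail probability: count is unlikely under errors alone.
        if binom_upper_tail(a, d, error_rate) < alpha:
            flagged.append(idx)
    return error_rate, flagged


# Hypothetical data: eight samples at one position, each at depth 1000.
alt_counts = [1, 2, 0, 1, 3, 2, 1, 12]
depths = [1000] * 8
error_rate, flagged = flag_outlier_samples(alt_counts, depths)
print(error_rate, flagged)
```

Using the median keeps a single mutated sample from inflating the error estimate. A real caller would also need to handle overdispersion and multiple testing across positions, which this sketch ignores.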

Identifiers

DOI: 10.1093/nargab/lqaa021

PMID: 32363341

PMC: PMC7182099

PII: lqaa021

Publication types

Journal Article

Languages

eng

Pagination

lqaa021

Authors

Tiffany M Delhomme (TM)

Patrice H Avogbe (PH)

Aurélie A G Gabriel (AAG)

Nicolas Alcala (N)

Noemie Leblay (N)

Catherine Voegele (C)

Maxime Vallée (M)

Priscilia Chopard (P)

Amélie Chabrier (A)

Behnoush Abedi-Ardekani (B)

Valérie Gaborieau (V)

Ivana Holcatova (I)

Vladimir Janout (V)

Lenka Foretová (L)

Sasa Milosavljevic (S)

David Zaridze (D)

Anush Mukeriya (A)

Elisabeth Brambilla (E)

Paul Brennan (P)

Ghislaine Scelo (G)

Lynnette Fernandez-Cuesta (L)

Graham Byrnes (G)

Florence L Calvez-Kelm (FL)

James D McKay (JD)

Matthieu Foll (M)

MeSH classifications