Best practices for benchmarking germline small-variant calls in human genomes.
Journal
Nature biotechnology
ISSN: 1546-1696
Abbreviated title: Nat Biotechnol
Country: United States
NLM ID: 9604648
Publication information
Publication date: 05 2019
History:
received: 23 May 2018
accepted: 10 January 2019
pubmed: 13 March 2019
medline: 4 July 2019
entrez: 13 March 2019
Status: ppublish
Abstract
Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling. We provide guidance on how to match variant calls with different representations, define standard performance metrics, and stratify performance by variant type and genome context. We describe limitations of high-confidence calls and regions that can be used as truth sets (for example, single-nucleotide variant concordance of two methods is 99.7% inside versus 76.5% outside high-confidence regions). Our web-based app enables comparison of variant calls against truth sets to obtain a standardized performance report. Our approach has been piloted in the PrecisionFDA variant-calling challenges to identify the best-in-class variant-calling methods within high-confidence regions. Finally, we recommend a set of best practices for using our tools and evaluating the results.
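As an illustration of what standard performance metrics stratified by variant type and genome context can look like, here is a minimal Python sketch. The abstract does not list the metrics or their formulas; the sketch assumes the conventional definitions of precision, recall, and F1 computed from true-positive, false-positive, and false-negative counts. The `Counts` class, the `metrics` function, the stratum keys, and all counts are hypothetical and made up for illustration; they are not taken from the paper or its tools.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical container for comparison counts in one stratum."""
    tp: int  # truth variants matched by a query call
    fp: int  # query calls with no matching truth variant
    fn: int  # truth variants missed by the query


def metrics(c: Counts) -> dict:
    """Return precision, recall and F1; None where a denominator is zero."""
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else None
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else None
    if precision is None or recall is None or (precision + recall) == 0:
        f1 = None
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Illustrative, made-up counts keyed by (variant type, genome context).
strata = {
    ("SNV", "all"): Counts(tp=90, fp=10, fn=10),
    ("INDEL", "homopolymer"): Counts(tp=40, fp=10, fn=40),
}

# One metrics dictionary per stratum.
report = {key: metrics(c) for key, c in strata.items()}
```

Reporting per stratum rather than as a single genome-wide figure keeps a strong result in easy contexts from masking a weaker one in difficult contexts. The sketch uses a single true-positive count per stratum, which is a simplification; a real comparison may count matches in the truth set and in the query set separately.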
Identifiers
pubmed: 30858580
doi: 10.1038/s41587-019-0054-x
pii: 10.1038/s41587-019-0054-x
pmc: PMC6699627
mid: NIHMS1533783
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
555-560
Grants
Agency: Intramural NIST DOC
ID: 9999-NIST
Country: United States
Comments and corrections
Type: ErratumIn