Performance assessment of DNA sequencing platforms in the ABRF Next-Generation Sequencing Study.

Base Pair Mismatch; Benchmarking; DNA / genetics; DNA, Bacterial / genetics; Genome, Bacterial; Genome, Human; High-Throughput Nucleotide Sequencing / methods; Humans; Sequence Analysis, DNA / methods

Journal

Nature biotechnology

ISSN: 1546-1696

Abbreviated title: Nat Biotechnol

Country: United States

NLM ID: 9604648

Publication information

Publication date:
09 2021

History:

received: 31 07 2020

accepted: 05 08 2021

entrez: 10 9 2021

pubmed: 11 9 2021

medline: 23 9 2021

Status: ppublish

Abstract

Assessing the reproducibility, accuracy and utility of massively parallel DNA sequencing platforms remains an ongoing challenge. Here the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Study benchmarks the performance of a set of sequencing instruments (HiSeq/NovaSeq/paired-end 2 × 250-bp chemistry, Ion S5/Proton, PacBio circular consensus sequencing (CCS), Oxford Nanopore Technologies PromethION/MinION, BGISEQ-500/MGISEQ-2000 and GS111) on human and bacterial reference DNA samples. Among short-read instruments, HiSeq 4000 and X10 provided the most consistent, highest genome coverage, while BGI/MGISEQ provided the lowest sequencing error rates. The long-read instrument PacBio CCS had the highest reference-based mapping rate and lowest non-mapping rate. The two long-read platforms PacBio CCS and PromethION/MinION showed the best sequence mapping in repeat-rich areas and across homopolymers. NovaSeq 6000 using 2 × 250-bp read chemistry was the most robust instrument for capturing known insertion/deletion events. This study serves as a benchmark for current genomics technologies, as well as a resource to inform experimental design and next-generation sequencing variant calling.

Identifiers

DOI: 10.1038/s41587-021-01049-5 PMID: 34504351 PMC: PMC8985210

pubmed: 34504351

doi: 10.1038/s41587-021-01049-5

pii: 10.1038/s41587-021-01049-5

pmc: PMC8985210

mid: NIHMS1782172

Chemical substances

DNA, Bacterial 0

DNA 9007-49-2

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.; Validation Study

Languages

eng

Pagination

1129-1140

Grants

Agency: NIAID NIH HHS

ID: R01 AI125416

Country: United States

Agency: NIEHS NIH HHS

ID: R01 ES021006

Country: United States

Agency: NIAID NIH HHS

ID: R21 AI129851

Country: United States

Agency: NIDA NIH HHS

ID: U01 DA053941

Country: United States

Agency: NIMH NIH HHS

ID: R01 MH117406

Country: United States

Agency: NIBIB NIH HHS

ID: R25 EB020393

Country: United States

Agency: NIAID NIH HHS

ID: R01 AI151059

Country: United States

Agency: NINDS NIH HHS

ID: R01 NS076465

Country: United States

Agency: NHGRI NIH HHS

ID: UM1 HG008898

Country: United States

Comments and corrections

Type: ErratumIn
