Performance assessment of DNA sequencing platforms in the ABRF Next-Generation Sequencing Study.


Journal

Nature biotechnology
ISSN: 1546-1696
Abbreviated title: Nat Biotechnol
Country: United States
NLM ID: 9604648

Publication information

Publication date:
09 2021
History:
received: 31 07 2020
accepted: 05 08 2021
entrez: 10 9 2021
pubmed: 11 9 2021
medline: 23 9 2021
Status: ppublish

Abstract

Assessing the reproducibility, accuracy and utility of massively parallel DNA sequencing platforms remains an ongoing challenge. Here the Association of Biomolecular Resource Facilities (ABRF) Next-Generation Sequencing Study benchmarks the performance of a set of sequencing instruments (HiSeq/NovaSeq/paired-end 2 × 250-bp chemistry, Ion S5/Proton, PacBio circular consensus sequencing (CCS), Oxford Nanopore Technologies PromethION/MinION, BGISEQ-500/MGISEQ-2000 and GS111) on human and bacterial reference DNA samples. Among short-read instruments, HiSeq 4000 and X10 provided the most consistent, highest genome coverage, while BGI/MGISEQ provided the lowest sequencing error rates. The long-read instrument PacBio CCS had the highest reference-based mapping rate and lowest non-mapping rate. The two long-read platforms PacBio CCS and PromethION/MinION showed the best sequence mapping in repeat-rich areas and across homopolymers. NovaSeq 6000 using 2 × 250-bp read chemistry was the most robust instrument for capturing known insertion/deletion events. This study serves as a benchmark for current genomics technologies, as well as a resource to inform experimental design and next-generation sequencing variant calling.

Identifiers

pubmed: 34504351
doi: 10.1038/s41587-021-01049-5
pii: 10.1038/s41587-021-01049-5
pmc: PMC8985210
mid: NIHMS1782172

Chemical substances

DNA, Bacterial 0
DNA 9007-49-2

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Validation Study

Languages

eng

Citation subsets

IM

Pagination

1129-1140

Grants

Agency: NIAID NIH HHS
ID: R01 AI125416
Country: United States
Agency: NIEHS NIH HHS
ID: R01 ES021006
Country: United States
Agency: NIAID NIH HHS
ID: R21 AI129851
Country: United States
Agency: NIDA NIH HHS
ID: U01 DA053941
Country: United States
Agency: NIMH NIH HHS
ID: R01 MH117406
Country: United States
Agency: NIBIB NIH HHS
ID: R25 EB020393
Country: United States
Agency: NIAID NIH HHS
ID: R01 AI151059
Country: United States
Agency: NINDS NIH HHS
ID: R01 NS076465
Country: United States
Agency: NHGRI NIH HHS
ID: UM1 HG008898
Country: United States

Comments and corrections

Type: ErratumIn

Copyright information

© 2021. The Author(s), under exclusive licence to Springer Nature America, Inc.

Authors

Jonathan Foox (J)

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA.
The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA.

Scott W Tighe (SW)

University of Vermont Cancer Center, Vermont Integrative Genomics Resource, University of Vermont, Burlington, VT, USA.

Charles M Nicolet (CM)

Keck School of Medicine, University of Southern California, Los Angeles, CA, USA.

Justin M Zook (JM)

Biosystems and Biomaterials Division, National Institute of Standards and Technology, Gaithersburg, MD, USA.

Marta Byrska-Bishop (M)

New York Genome Center, New York, NY, USA.

Wayne E Clarke (WE)

New York Genome Center, New York, NY, USA.

Michael M Khayat (MM)

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.

Medhat Mahmoud (M)

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.

Phoebe K Laaguiby (PK)

University of Vermont Cancer Center, Vermont Integrative Genomics Resource, University of Vermont, Burlington, VT, USA.

Zachary T Herbert (ZT)

Molecular Biology Core Facilities, Dana-Farber Cancer Institute, Boston, MA, USA.

Derek Warner (D)

DNA Sequencing Core, University of Utah, Salt Lake City, UT, USA.

George S Grills (GS)

Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, USA.

Jin Jen (J)

Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, USA.

Shawn Levy (S)

HudsonAlpha Institute for Biotechnology, Huntsville, AL, USA.

Jenny Xiang (J)

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA.

Alicia Alonso (A)

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA.

Xia Zhao (X)

BGI-Shenzhen, Shenzhen, China.
MGI, BGI-Shenzhen, Shenzhen, China.

Wenwei Zhang (W)

BGI-Shenzhen, Shenzhen, China.

Fei Teng (F)

BGI-Shenzhen, Shenzhen, China.

Yonggang Zhao (Y)

BGI-Shenzhen, Shenzhen, China.
Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark.

Haorong Lu (H)

BGI-Shenzhen, Shenzhen, China.
Guangdong Provincial Key Laboratory of Genome Read and Write, Shenzhen, China.

Gary P Schroth (GP)

Illumina, Inc., San Diego, CA, USA.

Giuseppe Narzisi (G)

New York Genome Center, New York, NY, USA.

William Farmerie (W)

Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL, USA.

Fritz J Sedlazeck (FJ)

Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.
Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA. fritz.sedlazeck@bcm.edu.

Don A Baldwin (DA)

Department of Pathology, Fox Chase Cancer Center, Philadelphia, PA, USA. donald.baldwin@fccc.edu.

Christopher E Mason (CE)

Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, USA. chm2042@med.cornell.edu.
The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA. chm2042@med.cornell.edu.
The Feil Family Brain and Mind Research Institute, New York, NY, USA. chm2042@med.cornell.edu.
The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA. chm2042@med.cornell.edu.
