Systematic dissection of biases in whole-exome and whole-genome sequencing reveals major determinants of coding sequence coverage.
Base Sequence / genetics
Data Interpretation, Statistical
Exome / genetics
Genome, Human / genetics
High-Throughput Nucleotide Sequencing / statistics & numerical data
Humans
Machine Learning
Models, Genetic
Open Reading Frames / genetics
Regression Analysis
Exome Sequencing / statistics & numerical data
Whole Genome Sequencing / statistics & numerical data
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
06 02 2020
History:
received: 21 06 2019
accepted: 22 01 2020
entrez: 8 2 2020
pubmed: 8 2 2020
medline: 20 11 2020
Status:
epublish
Abstract
Advantages and diagnostic effectiveness of the two most widely used resequencing approaches, whole exome (WES) and whole genome (WGS) sequencing, are often debated. WES dominated large-scale resequencing projects because of lower cost and easier data storage and processing. Rapid development of 3
Identifiers
pubmed: 32029882
doi: 10.1038/s41598-020-59026-y
pii: 10.1038/s41598-020-59026-y
pmc: PMC7005158
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
2057