Shedding light on dark genes: enhanced targeted resequencing by optimizing the combination of enrichment technology and DNA fragment length.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date: 10 06 2020
History:
received: 13 12 2019
accepted: 23 04 2020
entrez: 12 06 2020
pubmed: 12 06 2020
medline: 24 11 2020
Status: epublish
Abstract
The exome contains many obscure regions that are difficult to explore with current short-read sequencing methods. Repetitive genomic regions prevent the unique alignment of reads, which is essential for the identification of clinically relevant genetic variants. Long-read technologies attempt to resolve multiple-mapping regions, but they still produce many sequencing errors. Thus, a new approach is required to illuminate the obscure regions of the genome and rescue variants that would otherwise be neglected. This work aims to improve the alignment of multiple-mapping reads by extending the standard DNA fragment size. As Illumina platforms can sequence fragments of up to 550 bp, we tested different DNA fragment lengths using four major commercial WES platforms and found that longer DNA fragments achieved higher genotypability. This metric, which indicates base calling calculated by combining depth of coverage with the confidence of read alignment, improved for hundreds to thousands of genes, including several associated with clinical phenotypes. While depth of coverage has been considered crucial for the assessment of WES performance, we demonstrated that genotypability has a greater impact in revealing obscure regions, with a ~1% increase in variant calling relative to shorter DNA fragments. The results confirmed that this approach illuminated many previously unexplored regions.
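As an illustration only, the genotypability metric described in the abstract can be read as the fraction of target positions covered by enough confidently aligned reads. The sketch below is a minimal version of that idea, not the authors' pipeline: the function name `genotypable_fraction`, the input layout (one list of mapping qualities per target position), and the thresholds `min_mapq=10` and `min_depth=10` are all assumptions chosen for the example.

```python
from typing import Sequence


def genotypable_fraction(
    mapqs_per_position: Sequence[Sequence[int]],
    min_mapq: int = 10,   # assumed confidence threshold for a read alignment
    min_depth: int = 10,  # assumed minimum number of confident reads
) -> float:
    """Fraction of target positions with enough confidently aligned reads."""
    if not mapqs_per_position:
        return 0.0
    callable_positions = 0
    for mapqs in mapqs_per_position:
        # Count only reads whose mapping quality passes the threshold.
        confident_depth = sum(1 for q in mapqs if q >= min_mapq)
        if confident_depth >= min_depth:
            callable_positions += 1
    return callable_positions / len(mapqs_per_position)


# Toy targets of three positions, each overlapped by 12 reads.
# In the repeat region, two positions carry only multi-mapping reads (MAPQ 0).
unique_region = [[60] * 12, [60] * 12, [60] * 12]
repeat_region = [[0] * 12, [0] * 12, [60] * 12]

print(genotypable_fraction(unique_region))
print(genotypable_fraction(repeat_region))
```

Both toy regions have the same raw depth of coverage, yet only the uniquely mapped one is fully genotypable under these assumed thresholds, which is the distinction between depth and genotypability that the abstract draws.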
Identifiers
pubmed: 32523024
doi: 10.1038/s41598-020-66331-z
pii: 10.1038/s41598-020-66331-z
pmc: PMC7287100
Chemical substances
DNA
9007-49-2
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
9424