Performance of somatic structural variant calling in lung cancer using Oxford Nanopore sequencing technology.
Benchmarking long read approaches
Long read sequencing
Small cell lung cancer
Somatic structural variants detection
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
30 Sep 2024
History:
received: 1 Mar 2024
accepted: 11 Sep 2024
medline: 1 Oct 2024
pubmed: 1 Oct 2024
entrez: 30 Sep 2024
Status:
epublish
Abstract
BACKGROUND
Lung cancer is a heterogeneous disease and the primary cause of cancer-related mortality worldwide. Somatic mutations, including large structural variants, are important biomarkers in lung cancer for selecting targeted therapy. Genomic studies in lung cancer have been conducted using short-read sequencing. Emerging long-read sequencing technologies are a promising alternative for studying somatic structural variants; however, there is currently no consensus on how to process the data and call somatic events. In this study, we performed whole genome sequencing of lung cancer and matched non-tumour samples using long- and short-read sequencing to comprehensively benchmark three sequence aligners and seven structural variant callers, comprising generic callers (SVIM, Sniffles2, DELLY in generic mode and cuteSV) and somatic callers (Severus, SAVANA, nanomonsv and DELLY in somatic mode).
RESULTS
Different combinations of aligners and variant callers influenced somatic structural variant detection. The choice of caller had a significant influence on somatic structural variant detection in terms of variant type, size, sensitivity, and accuracy. The performance of each variant caller was assessed against the somatic structural variants identified by short-read sequencing. Long-read sequencing detected more events than short-read sequencing. The mean recall of somatic variant events identified by long-read sequencing was higher for the somatic callers (72%) than for the generic callers (53%). Among the somatic callers, when using the minimap2 aligner, SAVANA and Severus achieved the highest recall at 79.5% and 79.25%, respectively, followed by nanomonsv with a recall of 72.5%.
CONCLUSIONS
Long-read sequencing can identify somatic structural variants in clinical samples. The longer reads have the potential to improve our understanding of cancer development and inform personalized cancer treatment.
Identifiers
pubmed: 39350042
doi: 10.1186/s12864-024-10792-3
pii: 10.1186/s12864-024-10792-3
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
898
Copyright information
© 2024. The Author(s).