Performance of somatic structural variant calling in lung cancer using Oxford Nanopore sequencing technology.

Benchmarking long read approaches; Long read sequencing; Small cell lung cancer; Somatic structural variants detection

Journal

BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
30 Sep 2024
History:
received: 1 Mar 2024
accepted: 11 Sep 2024
medline: 1 Oct 2024
pubmed: 1 Oct 2024
entrez: 30 Sep 2024
Status: epublish

Abstract

BACKGROUND
Lung cancer is a heterogeneous disease and the primary cause of cancer-related mortality worldwide. Somatic mutations, including large structural variants, are important biomarkers in lung cancer for selecting targeted therapy. Genomic studies in lung cancer have been conducted using short-read sequencing. Emerging long-read sequencing technologies are a promising alternative for studying somatic structural variants; however, there is currently no consensus on how to process the data and call somatic events. In this study, we performed whole genome sequencing of lung cancer and matched non-tumour samples using long- and short-read sequencing to comprehensively benchmark three sequence aligners and seven structural variant callers, comprising generic callers (SVIM, Sniffles2, DELLY in generic mode and cuteSV) and somatic callers (Severus, SAVANA, nanomonsv and DELLY in somatic mode).
RESULTS
Different combinations of aligners and variant callers influenced somatic structural variant detection. The choice of caller had a significant influence on somatic structural variant detection in terms of variant type, size, sensitivity, and accuracy. The performance of each variant caller was assessed by comparison with somatic structural variants identified by short-read sequencing. Long-read sequencing detected more events than short-read sequencing. The mean recall of somatic variant events identified by long-read sequencing was higher for the somatic callers (72%) than for the generic callers (53%). Among the somatic callers, when using the minimap2 aligner, SAVANA and Severus achieved the highest recall at 79.5% and 79.25% respectively, followed by nanomonsv with a recall of 72.5%.
CONCLUSIONS
Long-read sequencing can identify somatic structural variants in clinical samples. The longer reads have the potential to improve our understanding of cancer development and inform personalized cancer treatment.
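The recall figures reported in the abstract depend on how a long-read call is matched to a short-read event, which the abstract does not specify. As a minimal sketch of that kind of comparison, assume that two events match when they share an SV type and chromosomes and both breakpoints lie within a fixed window. The `SV` record, the `same_event` and `recall` helpers, the 500 bp window, and the toy coordinates below are all hypothetical and purely illustrative; they are not the authors' pipeline or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    """One structural variant described by its two breakpoints (hypothetical record)."""
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    svtype: str


def same_event(a: SV, b: SV, window: int = 500) -> bool:
    """Assumed matching rule: same type, same chromosomes, both breakpoints within `window` bp."""
    return (
        a.svtype == b.svtype
        and a.chrom1 == b.chrom1
        and a.chrom2 == b.chrom2
        and abs(a.pos1 - b.pos1) <= window
        and abs(a.pos2 - b.pos2) <= window
    )


def recall(truth: list[SV], calls: list[SV], window: int = 500) -> float:
    """Fraction of truth-set events recovered by at least one call."""
    if not truth:
        raise ValueError("truth set is empty; recall is undefined")
    found = sum(1 for t in truth if any(same_event(t, c, window) for c in calls))
    return found / len(truth)


# Toy data only: a stand-in for short-read somatic SVs (truth) and long-read calls.
truth = [
    SV("chr1", 1000, "chr1", 5000, "DEL"),
    SV("chr2", 20000, "chr7", 30000, "BND"),
    SV("chr3", 100, "chr3", 900, "DUP"),
    SV("chr5", 4000, "chr5", 9000, "INV"),
]
calls = [
    SV("chr1", 1010, "chr1", 4990, "DEL"),    # matches the first truth event
    SV("chr2", 20100, "chr7", 29950, "BND"),  # matches the second truth event
    SV("chr3", 100, "chr3", 900, "DEL"),      # same coordinates, different type: no match
    SV("chr9", 1, "chr9", 2, "INS"),          # extra call, does not affect recall
]

print(f"recall = {recall(truth, calls):.2f}")  # prints "recall = 0.50"
```

Under this rule, calls that are absent from the truth set lower precision but not recall, so a caller that reports more events than short-read sequencing can still score a high recall.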

Identifiers

pubmed: 39350042
doi: 10.1186/s12864-024-10792-3
pii: 10.1186/s12864-024-10792-3

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

898

Copyright information

© 2024. The Author(s).

Authors

Lingchen Liu (L)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.
Faculty of Medicine, The University of Queensland, Brisbane, Australia.

Jia Zhang (J)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.
Faculty of Medicine, The University of Queensland, Brisbane, Australia.

Scott Wood (S)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.

Felicity Newell (F)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.

Conrad Leonard (C)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.

Lambros T Koufariotis (LT)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.

Katia Nones (K)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.

Andrew J Dalley (AJ)

Faculty of Medicine, The University of Queensland, Brisbane, Australia.

Haarika Chittoory (H)

Faculty of Medicine, The University of Queensland, Brisbane, Australia.

Farzad Bashirzadeh (F)

Department of Thoracic Medicine, The Royal Brisbane & Women's Hospital, Brisbane, Australia.

Jung Hwa Son (JH)

Department of Thoracic Medicine, The Royal Brisbane & Women's Hospital, Brisbane, Australia.

Daniel Steinfort (D)

Department of Thoracic Medicine, Royal Melbourne Hospital, Melbourne, Australia.

Jonathan P Williamson (JP)

Department of Thoracic Medicine, Liverpool Hospital Sydney, Sydney, Australia.

Michael Bint (M)

Department of Thoracic Medicine, Sunshine Coast University Hospital, Birtinya, Australia.

Carl Pahoff (C)

Department of Thoracic Medicine, Gold Coast University Hospital, Southport, Australia.

Phan T Nguyen (PT)

Department of Thoracic Medicine, Royal Adelaide Hospital, Adelaide, Australia.

Scott Twaddell (S)

Department of Respiratory and Sleep Medicine, John Hunter Hospital, Newcastle, Australia.

David Arnold (D)

Department of Respiratory and Sleep Medicine, John Hunter Hospital, Newcastle, Australia.

Christopher Grainge (C)

Department of Respiratory and Sleep Medicine, John Hunter Hospital, Newcastle, Australia.

Peter T Simpson (PT)

Faculty of Medicine, The University of Queensland, Brisbane, Australia.

David Fielding (D)

Faculty of Medicine, The University of Queensland, Brisbane, Australia.
Department of Thoracic Medicine, The Royal Brisbane & Women's Hospital, Brisbane, Australia.

Nicola Waddell (N)

QIMR Berghofer Medical Research Institute, Brisbane, Australia. nic.waddell@qimrberghofer.edu.au.
Faculty of Medicine, The University of Queensland, Brisbane, Australia. nic.waddell@qimrberghofer.edu.au.

John V Pearson (JV)

QIMR Berghofer Medical Research Institute, Brisbane, Australia.
Faculty of Medicine, The University of Queensland, Brisbane, Australia.
