Fusion Gene Detection Using Whole-Exome Sequencing Data in Cancer Patients.

acute myeloid leukemia; discordant read; fusion gene; prostate cancer; split read; whole exome sequencing

Journal

Frontiers in Genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621

Publication information

Publication date:
2022
History:
received: 2021-11-23
accepted: 2022-01-31
entrez: 2022-03-07
pubmed: 2022-03-08
medline: 2022-03-08
Status: epublish

Abstract

Several fusion genes are directly involved in the initiation and progression of cancers. Numerous bioinformatics tools have been developed to detect fusion events, but they are mainly based on RNA-seq data. Whole-exome sequencing (WES) is a powerful technology that is widely used for disease-related DNA variant detection. In this study, we build a novel analysis pipeline called Fuseq-WES to detect fusion genes at the DNA level from WES data. The same method also applies to targeted panel sequencing data. We assess the method on real datasets from acute myeloid leukemia (AML) and prostate cancer patients. The results show that two of the main AML fusion genes discovered in RNA-seq data, PML-RARA and CBFB-MYH11, are detected in the WES data in 36% and 63% of the available samples, respectively. For the targeted deep sequencing of prostate cancer patients, detection of the TMPRSS2-ERG fusion, which is the most frequent chimeric alteration in prostate cancer, is 91% concordant with a manually curated procedure based on four other methods. In summary, the overall results indicate that it is challenging to detect fusion genes in WES data with a standard coverage of ∼15-30x: fusion candidates discovered in the RNA-seq data are often not detected in the WES data, and vice versa. A subsampling study of the prostate data suggests that a coverage of at least 75x is necessary to achieve high accuracy.
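
The coverage figures in the abstract can be made concrete with some simple arithmetic. The sketch below is illustrative only and is not part of Fuseq-WES: the function names, the starting coverage of 300x, and the fusion allele fraction of 0.25 are all hypothetical. It assumes reads are subsampled uniformly at random, so that mean coverage scales linearly with the fraction of reads kept, and it treats the expected number of fusion-derived reads overlapping a breakpoint as roughly coverage times the fusion allele fraction.

```python
# Illustrative sketch only; names and numbers are assumptions, not Fuseq-WES code.

def subsample_fraction(current_coverage: float, target_coverage: float) -> float:
    """Fraction of reads to keep so that mean coverage drops to the target.

    Assumes uniform random subsampling, so mean coverage is proportional
    to the fraction of reads retained. Never returns more than 1.0.
    """
    if current_coverage <= 0 or target_coverage <= 0:
        raise ValueError("coverage values must be positive")
    return min(1.0, target_coverage / current_coverage)


def expected_fusion_reads(coverage: float, fusion_allele_fraction: float) -> float:
    """Rough expected count of breakpoint-overlapping reads from the fusion allele.

    Simplification: ignores capture bias, mappability, and read-length effects.
    """
    if not 0.0 <= fusion_allele_fraction <= 1.0:
        raise ValueError("fusion_allele_fraction must be between 0 and 1")
    return coverage * fusion_allele_fraction


if __name__ == "__main__":
    # Hypothetical deep-sequenced sample at 300x, subsampled to 75x.
    print("keep fraction:", subsample_fraction(300, 75))
    # Compare the coverage levels mentioned in the abstract.
    for cov in (15, 30, 75):
        print(cov, "x ->", expected_fusion_reads(cov, 0.25), "expected fusion reads")
```

Under these assumptions, a sample at 15-30x leaves only a handful of reads that could support a fusion breakpoint, which is consistent with the abstract's observation that standard WES coverage makes detection difficult.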

Identifiers

pubmed: 35251131
doi: 10.3389/fgene.2022.820493
pii: 820493
pmc: PMC8888970

Publication types

Journal Article

Languages

eng

Pagination

820493

Copyright information

Copyright © 2022 Deng, Murugan, Lindberg, Chellappa, Shen, Pawitan and Vu.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Authors

Wenjiang Deng (W)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Sarath Murugan (S)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Johan Lindberg (J)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Venkatesh Chellappa (V)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Xia Shen (X)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
Biostatistics Group, Greater Bay Area Institute of Precision Medicine, Fudan University, Guangzhou, China.
Centre for Global Health Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom.

Yudi Pawitan (Y)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.

Trung Nghia Vu (TN)

Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
