DeepSSV: detecting somatic small variants in paired tumor and normal sequencing data with convolutional neural network.
convolutional neural network
deep learning
mapping information
paired tumor/normal sequencing data
somatic small variants
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Titre abrégé: Brief Bioinform
Pays: England
ID NLM: 100912837
Informations de publication
Date de publication:
20 07 2021
20 07 2021
Historique:
received:
24
07
2020
revised:
05
09
2020
accepted:
19
09
2020
pubmed:
10
11
2020
medline:
20
11
2021
entrez:
9
11
2020
Statut:
ppublish
Résumé
It is of considerable interest to detect somatic mutations in paired tumor and normal sequencing data. A number of callers that are based on statistical or machine learning approaches have been developed to detect somatic small variants. However, they take into consideration only limited information about the reference and potential variant allele in both tumor and normal samples at a candidate somatic site. Also, they differ in how biological and technological noises are addressed. Hence, they are expected to produce divergent outputs. To overcome the drawbacks of existing somatic callers, we develop a deep learning-based tool called DeepSSV, which employs a convolutional neural network (CNN) model to learn increasingly abstract feature representations from the raw data in higher feature layers. DeepSSV creates a spatially oriented representation of read alignments around the candidate somatic sites adapted for the convolutional architecture, which enables it to expand to effectively gather scattered evidence. Moreover, DeepSSV incorporates the mapping information of both reference allele-supporting and variant allele-supporting reads in the tumor and normal samples at a genomic site that are readily available in the pileup format file. Together, the CNN model can process the whole alignment information. Such representational richness allows the model to capture the dependencies in the sequence and identify context-based sequencing artifacts. We fitted the model on ground truth somatic mutations and did benchmarking experiments on simulated and real tumors. The benchmarking results demonstrate that DeepSSV outperforms its state-of-the-art competitors in overall F1 score.
Identifiants
pubmed: 33164053
pii: 5960414
doi: 10.1093/bib/bbaa272
pii:
doi:
Types de publication
Journal Article
Research Support, Non-U.S. Gov't
Langues
eng
Sous-ensembles de citation
IM
Informations de copyright
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.