A simple convolutional neural network for prediction of enhancer-promoter interactions with DNA sequence data.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date: 2019-09-01
History:
received: 2018-08-27
revised: 2018-12-04
accepted: 2019-01-10
pubmed: 2019-01-17
medline: 2020-06-09
entrez: 2019-01-17
Status: ppublish
Abstract
Enhancer-promoter interactions (EPIs) in the genome play an important role in transcriptional regulation. EPIs can help boost statistical power and enhance mechanistic interpretation of disease- or trait-associated genetic variants in genome-wide association studies. Computational prediction of EPIs from DNA sequence and other genomic data is a fast and viable alternative to expensive and time-consuming biological experiments. In particular, deep learning and other machine learning methods have shown promising performance. First, using a published human cell line dataset, we demonstrate that a simple convolutional neural network (CNN) performs as well as, if not better than, a more complicated state-of-the-art architecture, a hybrid of a CNN and a recurrent neural network. More importantly, although EPIs (and the corresponding gene expression) are well known to be cell line-specific, and in contrast to the standard practice of training and predicting for each cell line separately, we propose two transfer learning approaches that train a model using all cell lines to various extents, leading to substantially improved predictive performance. Computer code is available at https://github.com/zzUMN/Combine-CNN-Enhancer-and-Promoters. Supplementary data are available at Bioinformatics online.
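As a rough illustration of how a CNN can read DNA sequence, the sketch below one-hot encodes two toy sequences and scans each with a single convolutional filter followed by ReLU and global max pooling. It is not the authors' model or code: the function names, the "TATA" filter, and the toy sequences are all assumptions made for illustration, and a real network would learn many filters and feed the pooled features into further layers.

```python
# Illustrative sketch only: not the authors' architecture or code.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T).

    Unknown bases such as N become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_max(encoded, kernel):
    """Slide one filter along the sequence, apply ReLU, return the global max."""
    width = len(kernel)
    best = 0.0
    for start in range(len(encoded) - width + 1):
        score = sum(
            encoded[start + i][j] * kernel[i][j]
            for i in range(width)
            for j in range(4)
        )
        best = max(best, max(score, 0.0))  # ReLU, then running max pool
    return best

# Hypothetical hand-written filter that responds to the motif "TATA";
# in a trained CNN the filter weights would be learned from data.
tata_filter = one_hot("TATA")

promoter = one_hot("GGCTATAAAGG")  # toy sequence, not real data
enhancer = one_hot("ACGTNACGT")    # toy sequence, not real data

# One pooled feature per sequence, concatenated as a pair-level input.
features = [conv1d_max(enhancer, tata_filter), conv1d_max(promoter, tata_filter)]
```

Here the promoter toy sequence contains an exact "TATA" match, so its pooled feature is larger than the enhancer's. Concatenating per-sequence features like this is one simple way to represent an enhancer-promoter pair for a downstream classifier.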
Identifiers
pubmed: 30649185
pii: 5289332
doi: 10.1093/bioinformatics/bty1050
pmc: PMC6735851
Chemical substances
DNA
9007-49-2
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
2899-2906
Grants
Agency: NIGMS NIH HHS
ID: R01 GM113250
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM126002
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL105397
Country: United States
Agency: NHLBI NIH HHS
ID: R01 HL116720
Country: United States
Copyright information
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.