FactorNet: A deep learning framework for predicting cell type specific transcription factor binding from nucleotide-resolution sequential data.
DREAM
Deep learning
ENCODE
Transcription factors
Journal
Methods (San Diego, Calif.)
ISSN: 1095-9130
Abbreviated title: Methods
Country: United States
NLM ID: 9426302
Publication information
Publication date: 2019-08-15
History:
received: 2018-11-01
revised: 2019-03-05
accepted: 2019-03-20
pubmed: 2019-03-30
medline: 2020-06-18
entrez: 2019-03-30
Status: ppublish
Abstract
Due to the large numbers of transcription factors (TFs) and cell types, querying binding profiles of all valid TF/cell type pairs is not experimentally feasible. To address this issue, we developed a convolutional-recurrent neural network model, called FactorNet, to computationally impute the missing binding data. FactorNet trains on binding data from reference cell types to make predictions on testing cell types by leveraging a variety of features, including genomic sequences, genome annotations, gene expression, and signal data, such as DNase I cleavage. FactorNet implements several convenient strategies to reduce runtime and memory consumption. By visualizing the neural network models, we can interpret how the model predicts binding. We also investigate the variables that affect cross-cell type accuracy, and offer suggestions to improve upon this field. Our method ranked among the top teams in the ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge, achieving first place on six of the 13 final round evaluation TF/cell type pairs, the most of any competing team. The FactorNet source code is publicly available, allowing users to reproduce our methodology from the ENCODE-DREAM Challenge.
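The abstract mentions combining genomic sequence with per-position signal data such as DNase I cleavage. As a minimal sketch of what such an input could look like, the example below one-hot encodes a DNA string and appends the signal as an extra channel. The channel order, the all-zero treatment of unknown bases, and the helper names `one_hot` and `build_input` are assumptions made for illustration; they are not taken from the FactorNet source code.

```python
from typing import List, Sequence

# Channel order for the one-hot encoding (an assumption for this sketch).
BASES = "ACGT"


def one_hot(sequence: str) -> List[List[float]]:
    """One-hot encode a DNA string; unknown bases such as N become all zeros."""
    rows = []
    for base in sequence.upper():
        row = [0.0] * len(BASES)
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1.0
        rows.append(row)
    return rows


def build_input(sequence: str, signal: Sequence[float]) -> List[List[float]]:
    """Append a per-nucleotide signal value as a fifth channel at each position."""
    if len(sequence) != len(signal):
        raise ValueError("sequence and signal must have the same length")
    return [row + [float(s)] for row, s in zip(one_hot(sequence), signal)]


# Hypothetical 5-bp window with made-up signal values, for illustration only.
features = build_input("ACGTN", [0.0, 2.0, 5.0, 1.0, 0.0])
```

Each row of `features` describes one nucleotide position with four sequence channels and one signal channel, which is the kind of position-by-channel matrix a convolutional layer could scan.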
Identifiers
pubmed: 30922998
pii: S1046-2023(18)30329-3
doi: 10.1016/j.ymeth.2019.03.020
pmc: PMC6708499
mid: NIHMS1525354
Chemical substances
Chromatin (registry number: 0)
Nucleotides (registry number: 0)
Transcription Factors (registry number: 0)
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM
Pagination
40-47
Grants
Agency: NIBIB NIH HHS
ID: T32 EB009418
Country: United States
Copyright information
Copyright © 2019 Elsevier Inc. All rights reserved.