A Pretraining-Retraining Strategy of Deep Learning Improves Cell-Specific Enhancer Predictions.

deep learning; prediction; pretraining; retraining; tissue-specific enhancers

Journal

Frontiers in genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621

Publication information

Publication date:
2019
History:
received: 6 June 2019
accepted: 26 November 2019
entrez: 24 January 2020
pubmed: 24 January 2020
medline: 24 January 2020
Status: epublish

Abstract

Deciphering the code of cis-regulatory elements (CREs) is one of the core issues of modern biology. Enhancers are distal CREs that play significant roles in gene transcriptional regulation. Although identifying enhancer locations across the whole genome [discriminative enhancer predictions (DEP)] is necessary, it is more important to predict the specific cell or tissue types in which they are active and functional [tissue-specific enhancer predictions (TSEP)]. Although existing deep learning models have achieved great success in DEP, they cannot be directly applied to TSEP because any specific cell or tissue type has only a limited number of enhancer samples available for training. Here, we first adopted a previously reported deep learning architecture and then developed a novel training strategy, named the "pretraining-retraining strategy" (PRS), for TSEP by decomposing the whole training process into two successive stages: a pretraining stage trains on the whole enhancer data to perform DEP, and a retraining stage then trains on tissue-specific enhancer samples, starting from the pretrained model, to make TSEP. As a result, PRS is found to be valid for DEP, with an AUC of 0.922 and a GM (geometric mean) of 0.696 when tested on a larger-scale FANTOM5 enhancer dataset
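
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the paper uses a deep learning architecture, whereas the code below swaps in a plain logistic regression trained by gradient descent so that the example stays short and self-contained. The function names (`train_logreg`, `pretrain_then_retrain`, `g_mean`), the learning rates, and the epoch counts are all hypothetical. The GM is assumed here to be the square root of sensitivity times specificity, which is a common definition for imbalanced classification; the abstract itself only expands GM as "geometric mean".

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logreg(X, y, w=None, lr=0.1, epochs=200):
    """Fit logistic regression by batch gradient descent.

    Passing `w` warm-starts training from existing weights, which is the
    mechanism the retraining stage relies on in this sketch.
    """
    w = np.zeros(X.shape[1]) if w is None else np.array(w, dtype=float)
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w


def pretrain_then_retrain(X_all, y_all, X_tissue, y_tissue):
    """Stage 1: train on all enhancer data (DEP).
    Stage 2: continue from those weights on the small tissue-specific set (TSEP).
    """
    w_pre = train_logreg(X_all, y_all)
    # Assumption: a smaller learning rate and fewer epochs for the small set.
    w_re = train_logreg(X_tissue, y_tissue, w=w_pre, lr=0.05, epochs=50)
    return w_pre, w_re


def g_mean(y_true, y_pred):
    """Assumed definition: sqrt(sensitivity * specificity)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return float(np.sqrt(sens * spec))


if __name__ == "__main__":
    # Synthetic stand-in data; real inputs would be encoded DNA sequences.
    rng = np.random.default_rng(0)
    X_all = rng.normal(size=(1000, 5))
    y_all = (X_all[:, 0] + X_all[:, 1] > 0).astype(float)
    X_tissue = rng.normal(size=(40, 5))
    y_tissue = (X_tissue[:, 0] + 0.5 * X_tissue[:, 2] > 0).astype(float)

    w_pre, w_re = pretrain_then_retrain(X_all, y_all, X_tissue, y_tissue)
    y_hat = (sigmoid(X_tissue @ w_re) >= 0.5).astype(int)
    print("G-mean on tissue set:", g_mean(y_tissue.astype(int), y_hat))
```

The design point the sketch tries to capture is that the second stage does not start from scratch: the small tissue-specific set only adjusts weights that were already learned from the full enhancer data.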

Identifiers

pubmed: 31969903
doi: 10.3389/fgene.2019.01305
pmc: PMC6960260

Publication types

Journal Article

Languages

eng

Pagination

1305

Copyright information

Copyright © 2020 Niu, Yang, Zhang, Yang and Hu.

Authors

Xiaohui Niu (X)

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China.

Kun Yang (K)

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China.

Ge Zhang (G)

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China.

Zhiquan Yang (Z)

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China.

Xuehai Hu (X)

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, China.

MeSH classifications