Prediction of single-cell gene expression for transcription factor analysis.
Journal
GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872
Publication information
Publication date:
30 10 2020
History:
received: 11 03 2020
revised: 20 08 2020
entrez: 30 10 2020
pubmed: 31 10 2020
medline: 26 10 2021
Status:
ppublish
Abstract
Single-cell RNA sequencing is a powerful technology to discover new cell types and study biological processes in complex biological samples. A current challenge is to predict transcription factor (TF) regulation from single-cell RNA data. Here, we propose a novel approach for predicting gene expression at the single-cell level using cis-regulatory motifs, as well as epigenetic features. We designed a tree-guided multi-task learning framework that considers each cell as a task. Through this framework we were able to explain the single-cell gene expression values using either TF binding affinities or TF ChIP-seq data measured at specific genomic regions. TFs identified using these models could be validated by the literature. Our proposed method allows us to identify distinct TFs that show cell type-specific regulation. This approach is not limited to TFs but can use any type of data that can potentially be used in explaining gene expression at the single-cell level to study factors that drive differentiation or show abnormal regulation in disease. The implementation of our workflow can be accessed under an MIT license via https://github.com/SchulzLab/Triangulate.
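The abstract does not spell out the regularizer behind the "tree-guided multi-task learning framework". As a minimal sketch, assuming the general tree-guided group lasso form, the penalty sums, for every feature (for example a TF) and every node of a tree over cells, a weighted L2 norm of that feature's coefficients across the cells under the node. The function name, the toy tree, and the node weights below are hypothetical and are not taken from the Triangulate code.

```python
import math


def tree_guided_penalty(coef, groups, weights):
    """Tree-guided group lasso penalty (illustrative sketch only).

    coef[j][c] : coefficient of feature j (e.g. a TF) for cell c (one task per cell)
    groups[v]  : indices of the cells under tree node v
    weights[v] : non-negative weight of tree node v
    """
    total = 0.0
    for row in coef:
        for cells, w in zip(groups, weights):
            # L2 norm of this feature's coefficients over the cells in the node
            total += w * math.sqrt(sum(row[c] ** 2 for c in cells))
    return total


# Toy example: 2 TF features, 3 cells (all values are made up).
# Hypothetical tree: cells 0 and 1 share an internal node; the root covers all cells.
coef = [
    [3.0, 4.0, 0.0],  # TF A: non-zero in cells 0 and 1 only
    [0.0, 0.0, 2.0],  # TF B: non-zero in cell 2 only
]
groups = [[0], [1], [2], [0, 1], [0, 1, 2]]
weights = [1.0, 1.0, 1.0, 0.5, 0.25]

penalty = tree_guided_penalty(coef, groups, weights)
print(penalty)
```

Because internal nodes group related cells, a penalty of this form tends to select or discard a feature jointly for cells that sit close together in the tree, which is one way a model could surface cell type-specific TFs. Fitting such a model additionally needs a loss term and an optimizer, which this sketch leaves out.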
Identifiers
pubmed: 33124660
pii: 5943496
doi: 10.1093/gigascience/giaa113
pmc: PMC7596801
Chemical substances
Transcription Factors
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Copyright information
© The Author(s) 2020. Published by Oxford University Press GigaScience.