Identifying transcription factors with cell-type specific DNA binding signatures.


Journal

BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
14 Oct 2024
History:
received: 10 07 2024
accepted: 02 10 2024
medline: 15 10 2024
pubmed: 15 10 2024
entrez: 14 10 2024
Status: epublish

Abstract

Transcription factors (TFs) bind to different parts of the genome in different types of cells, but it is usually assumed that the inherent DNA-binding preferences of a TF are invariant to cell type. Yet, there are several known examples of TFs that switch their DNA-binding preferences in different cell types, and yet more examples of other mechanisms, such as steric hindrance or cooperative binding, that may result in a "DNA signature" of differential binding. To survey this phenomenon systematically, we developed a deep learning method we call SigTFB (Signatures of TF Binding) to detect and quantify cell-type specificity in a TF's known genomic binding sites. We used ENCODE ChIP-seq data to conduct a wide-scale investigation of 169 distinct TFs in up to 14 distinct cell types. SigTFB detected statistically significant DNA binding signatures in approximately two-thirds of TFs, far more than might have been expected from the relatively sparse evidence in prior literature. We found that the presence or absence of a cell-type specific DNA binding signature is distinct from, and indeed largely uncorrelated with, the degree of overlap between ChIP-seq peaks in different cell types, and that such signatures tended to arise by two mechanisms: use of established motifs at different frequencies, and selective inclusion of motifs for distinct TFs. While recent results have highlighted cell state features such as chromatin accessibility and gene expression in predicting TF binding, our results emphasize that, for some TFs, the DNA sequences of the binding sites contain substantial cell-type specific motifs.
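
The abstract contrasts DNA binding signatures with the degree of overlap between ChIP-seq peaks in different cell types, but it does not say how that overlap was measured. As an illustration only, the sketch below assumes a base-pair Jaccard index between two peak sets on a single chromosome. The function names and the interval representation are hypothetical and are not taken from SigTFB.

```python
from typing import List, Tuple

# Assumption: a peak is a half-open interval [start, end) on one chromosome.
Interval = Tuple[int, int]


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval when the new one overlaps or touches it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered(intervals: List[Interval]) -> int:
    """Number of base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def jaccard(peaks_a: List[Interval], peaks_b: List[Interval]) -> float:
    """Base-pair Jaccard index: |A intersect B| / |A union B| (0.0 if both are empty)."""
    union = covered(peaks_a + peaks_b)
    # Inclusion-exclusion gives the intersection without a second sweep.
    intersection = covered(peaks_a) + covered(peaks_b) - union
    return intersection / union if union else 0.0


# Hypothetical peak sets for one TF in two cell types.
cell_type_1 = [(100, 200), (500, 600)]
cell_type_2 = [(150, 250), (900, 1000)]
overlap = jaccard(cell_type_1, cell_type_2)
```

A score near 1 means the TF occupies largely the same regions in both cell types, and a score near 0 means mostly disjoint regions. According to the abstract, this kind of overlap was largely uncorrelated with whether a sequence-level signature was detected.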

Identifiers

pubmed: 39402535
doi: 10.1186/s12864-024-10859-1
pii: 10.1186/s12864-024-10859-1

Chemical substances

Transcription Factors 0
DNA 9007-49-2

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

957

Grants

Agency: Queen Elizabeth Scholars
ID: QEII-GST
Agency: Natural Sciences and Engineering Research Council of Canada
ID: RGPIN-2019-06604
Agency: Alliance de recherche numérique du Canada
ID: RRG-TPERKINS

Copyright information

© 2024. The Author(s).

Authors

Aseel Awdeh (A)

School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Ave., Ottawa, K1N 6N5, Ontario, Canada.
Regenerative Medicine Program, Ottawa Hospital Research Institute, 501 Smyth Rd., Ottawa, K1H 8L6, Ontario, Canada.

Marcel Turcotte (M)

School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Ave., Ottawa, K1N 6N5, Ontario, Canada.

Theodore J Perkins (TJ)

School of Electrical Engineering and Computer Science, University of Ottawa, 800 King Edward Ave., Ottawa, K1N 6N5, Ontario, Canada. tperkins@ohri.ca.
Regenerative Medicine Program, Ottawa Hospital Research Institute, 501 Smyth Rd., Ottawa, K1H 8L6, Ontario, Canada. tperkins@ohri.ca.
Ottawa Institute of Systems Biology, Department of Biochemistry, Microbiology and Immunology, University of Ottawa, 451 Smyth Rd., Ottawa, K1H 8M5, Ontario, Canada. tperkins@ohri.ca.
