TFscope: systematic analysis of the sequence features involved in the binding preferences of transcription factors.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
10 Jul 2024
History:
received: 09 09 2022
accepted: 24 06 2024
medline: 11 7 2024
pubmed: 11 7 2024
entrez: 10 7 2024
Status: epublish

Abstract

Characterizing the binding preferences of transcription factors (TFs) in different cell types and conditions is key to understanding how they orchestrate gene expression. Here, we develop TFscope, a machine learning approach that identifies the sequence features explaining the binding differences observed between two ChIP-seq experiments targeting either the same TF in two conditions or two TFs with similar motifs (paralogous TFs). TFscope systematically investigates differences in the core motif, nucleotide environment, and co-factor motifs, and provides the contribution of each key feature in the two experiments. TFscope was applied to > 305 ChIP-seq pairs, and several examples are discussed.

Identifiers

pubmed: 38987807
doi: 10.1186/s13059-024-03321-8
pii: 10.1186/s13059-024-03321-8

Chemical substances

Transcription Factors 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

187

Grants

Agency: Labex NUMEV (FR)
ID: MOTION project
Agency: SIRIC Montpellier
ID: MOTION project
Agency: Agence Nationale de la Recherche
ID: ANR-22-CE45-0031-01
Agency: Laboratoire d'Excellence EpiGenMed
ID: R-loops project

Copyright information

© 2024. The Author(s).

Authors

Raphaël Romero (R)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.
IMAG, Univ Montpellier, CNRS, Montpellier, France.

Christophe Menichelli (C)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.

Christophe Vroland (C)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.
Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.

Jean-Michel Marin (JM)

IMAG, Univ Montpellier, CNRS, Montpellier, France.

Sophie Lèbre (S)

IMAG, Univ Montpellier, CNRS, Montpellier, France. sophie.lebre@umontpellier.fr.
AMIS, Université Paul-Valéry-Montpellier 3, Montpellier, France. sophie.lebre@umontpellier.fr.

Charles-Henri Lecellier (CH)

LIRMM, Univ Montpellier, CNRS, Montpellier, France. charles.lecellier@igmm.cnrs.fr.
Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France. charles.lecellier@igmm.cnrs.fr.

Laurent Bréhélin (L)

LIRMM, Univ Montpellier, CNRS, Montpellier, France. brehelin@lirmm.fr.
