A universal model of RNA.DNA:DNA triplex formation accurately predicts genome-wide RNA-DNA interactions.
DNA
RNA
RNA–DNA interaction
Triplex
machine learning
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837
Publication information
Publication date:
19 11 2022
History:
received: 20 06 2022
revised: 16 08 2022
accepted: 17 09 2022
pubmed: 15 10 2022
medline: 24 11 2022
entrez: 14 10 2022
Status:
ppublish
Abstract
RNA.DNA:DNA triple helix (triplex) formation is a form of RNA-DNA interaction that regulates gene expression but is difficult to study experimentally in vivo. This makes accurate computational prediction of such interactions highly important in the field of RNA research. Current predictive methods use canonical Hoogsteen base pairing rules, which, whilst biophysically valid, may not reflect the plastic nature of cell biology. Here, we present the first optimization approach to learn a probabilistic model describing RNA-DNA interactions directly from motifs derived from triplex sequencing data. We find that there are several stable interaction codes, including Hoogsteen base pairing and novel RNA-DNA base pairings, which agree with in vitro measurements. We implemented these findings in TriplexAligner, a program that uses the determined interaction codes to predict triplex binding. TriplexAligner predicts RNA-DNA interactions identified in all-to-all sequencing data more accurately than all previously published tools in human and mouse, and also predicts previously studied triplex interactions with known regulatory functions. We further validated a novel triplex interaction using biophysical experiments. Our work is an important step towards a better understanding of triplex formation and allows genome-wide analyses of RNA-DNA interactions.
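To make concrete the kind of rule-based prediction the abstract contrasts with its learned model, here is a minimal sketch of scoring an RNA against a DNA strand using only canonical Hoogsteen pairings for the pyrimidine (parallel) motif, U with A and C with G. This is not TriplexAligner and does not reproduce its learned interaction codes. The function names, the ungapped sliding window, and the simple match count are all assumptions chosen for illustration.

```python
# Canonical Hoogsteen pairings for the pyrimidine (parallel) triplex motif:
# RNA base -> base on the purine strand of the DNA duplex that it contacts.
# Illustrative only; this is NOT the interaction code learned by TriplexAligner.
PYRIMIDINE_MOTIF = {"U": "A", "C": "G"}


def count_hoogsteen_matches(rna: str, dna_purine_strand: str) -> int:
    """Count positions where the RNA base pairs canonically with the DNA base."""
    if len(rna) != len(dna_purine_strand):
        raise ValueError("sequences must have equal length")
    return sum(
        PYRIMIDINE_MOTIF.get(r) == d
        for r, d in zip(rna.upper(), dna_purine_strand.upper())
    )


def best_ungapped_window(rna: str, dna_purine_strand: str) -> tuple[int, int]:
    """Slide the RNA along the DNA and return (best_offset, best_match_count).

    Hypothetical helper: a real aligner would also allow gaps and would
    weight pairings by probability instead of counting exact matches.
    """
    n = len(rna)
    if n == 0 or n > len(dna_purine_strand):
        raise ValueError("RNA must be non-empty and no longer than the DNA")
    best = (0, -1)
    for offset in range(len(dna_purine_strand) - n + 1):
        window = dna_purine_strand[offset:offset + n]
        score = count_hoogsteen_matches(rna, window)
        if score > best[1]:
            best = (offset, score)
    return best
```

A learned probabilistic code, as described in the abstract, would replace the fixed lookup table with weights estimated from triplex sequencing data. The values of any such weights are not given in this record and are not assumed here.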
Identifiers
pubmed: 36239395
pii: 6760135
doi: 10.1093/bib/bbac445
pmc: PMC9677506
Chemical substances
RNA
63231-63-0
DNA
9007-49-2
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Grants
Agency: Goethe University Frankfurt am Main
ID: EXS2026
Agency: Deutsche Forschungsgemeinschaft
ID: 403584255 - TRR 267
Copyright information
© The Author(s) 2022. Published by Oxford University Press.