OGRE: calculate, visualize, and analyze overlap between genomic input regions and public annotations.
Annotation
Genomic association
Genomic regions
Omics
Overlap
Regulatory elements
Shiny
Visualization
Journal
BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194
Publication information
Publication date:
26 Jul 2023
History:
received: 19 Apr 2022
accepted: 18 Jul 2023
medline: 28 Jul 2023
pubmed: 27 Jul 2023
entrez: 26 Jul 2023
Status:
epublish
Abstract
BACKGROUND
Modern genome sequencing leads to an ever-growing collection of genomic annotations. Combining these elements with a set of input regions (e.g. genes) would yield new insights into genomic associations, such as those involved in gene regulation. The required data are scattered across different databases, making a manual approach tiresome, impractical, and prone to error. Semi-automatic approaches require programming skills in data parsing, processing, overlap calculation, and visualization, which most biomedical researchers lack. Our aim was to develop an automated tool providing all necessary algorithms, benefiting both bioinformaticians and researchers without bioinformatics training.
RESULTS
We developed overlapping annotated genomic regions (OGRE) as a comprehensive tool to associate and visualize input regions with genomic annotations. It does so by parsing regions of interest, mining publicly available annotations, and calculating possible overlaps between them. The user can thus identify the location, type, and number of associated regulatory elements. Results are presented as easy-to-understand visualizations and result tables. We applied OGRE to recent studies and showed high reproducibility and potential new insights. To demonstrate OGRE's performance in terms of running time and output, we conducted a benchmark and compared its features with similar tools.
CONCLUSIONS
OGRE's functions and built-in annotations can be applied as a downstream overlap association step, which is compatible with most genomic sequencing outputs and can thus enrich pre-existing analysis pipelines. Compared to similar tools, OGRE shows competitive performance, offers additional features, and has been successfully applied to two recent studies. Overall, OGRE addresses the lack of tools for automatic analysis, local genomic overlap calculation, and visualization by providing an easy-to-use, end-to-end solution for both biologists and computational scientists.
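The overlap calculation mentioned in the abstract can be illustrated with a minimal sketch. This is not OGRE's implementation or API. The `Region` type, the `find_overlaps` function, the half-open coordinate convention, and the toy coordinates are all assumptions made for this illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical genomic interval: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Region, b: Region) -> bool:
    # Two half-open intervals on the same chromosome overlap when
    # each one starts before the other ends.
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def find_overlaps(queries, annotations):
    """Map each query region name to the names of overlapping annotations."""
    # Group annotations by chromosome so each query only scans
    # candidates on its own chromosome.
    by_chrom = defaultdict(list)
    for ann in annotations:
        by_chrom[ann.chrom].append(ann)

    hits = {}
    for q in queries:
        candidates = by_chrom.get(q.chrom, [])
        hits[q.name] = [a.name for a in candidates if overlaps(q, a)]
    return hits


# Toy data (invented for illustration only).
genes = [
    Region("chr1", 100, 500, "geneA"),
    Region("chr2", 0, 50, "geneB"),
]
cpg_islands = [
    Region("chr1", 450, 600, "cpg1"),  # overlaps geneA
    Region("chr1", 500, 700, "cpg2"),  # only touches geneA's end: no overlap
    Region("chr2", 10, 20, "cpg3"),    # lies inside geneB
]

hits = find_overlaps(genes, cpg_islands)
counts = {name: len(found) for name, found in hits.items()}
```

Here `hits` lists the associated annotations per input region and `counts` gives their number, which corresponds to the kind of per-region summary the abstract describes. The linear scan per chromosome keeps the sketch short; for large annotation sets an interval tree or a sorted sweep would be the usual choice.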
Identifiers
pubmed: 37496002
doi: 10.1186/s12859-023-05422-w
pii: 10.1186/s12859-023-05422-w
pmc: PMC10369718
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
300
Grants
Agency: Deutsche Forschungsgemeinschaft
ID: CRU326
Agency: Deutsche Forschungsgemeinschaft
ID: LA4064/4-1
Copyright information
© 2023. The Author(s).