dipwmsearch: a Python package for searching di-PWM motifs.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
03 04 2023
History:
received: 15 11 2022
revised: 23 02 2023
accepted: 12 03 2023
medline: 11 4 2023
pubmed: 4 4 2023
entrez: 3 4 2023
Status: ppublish

Abstract

Searching for probabilistic motifs in a sequence is a common task when annotating putative transcription factor binding sites or other RNA/DNA binding sites. Useful motif representations include position weight matrices (PWMs), dinucleotide PWMs (di-PWMs), and hidden Markov models (HMMs). Dinucleotide PWMs not only retain the simplicity of PWMs (a matrix form and a cumulative scoring function) but also incorporate dependency between adjacent positions in the motif, which PWMs disregard. For instance, the HOCOMOCO database provides di-PWM motifs, derived from experimental data, to represent binding sites. Currently, two programs, SPRy-SARUS and MOODS, can search for occurrences of di-PWMs in sequences. We propose a Python package called dipwmsearch, which provides an original and efficient algorithm for this task: it first enumerates the words that match the di-PWM and then searches for all of them at once in the sequence, even if the sequence contains IUPAC codes. Users benefit from easy installation via PyPI or conda, comprehensive documentation, and executable scripts that facilitate the use of di-PWMs. dipwmsearch is available at https://pypi.org/project/dipwmsearch/ and https://gite.lirmm.fr/rivals/dipwmsearch/ under the CeCILL license.
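
The two-step idea described in the abstract (enumerate the words that reach a score threshold, then look for all of them in the sequence) can be illustrated with a minimal pure-Python sketch. It does not use the dipwmsearch API and is not the authors' implementation: the function names (score, enumerate_words, search), the toy matrix, and the threshold are all hypothetical. A di-PWM is assumed here to be a list of columns, each mapping the 16 dinucleotides to a weight, so a motif with m columns matches words of length m + 1. The simultaneous search is approximated with a set lookup over sliding windows, and IUPAC codes in the sequence are not handled.

```python
from itertools import product

ALPHABET = "ACGT"


def score(dipwm, word):
    """Cumulative di-PWM score: one weight per pair of adjacent letters."""
    return sum(col[word[i:i + 2]] for i, col in enumerate(dipwm))


def enumerate_words(dipwm, threshold):
    """List every word of length len(dipwm) + 1 whose score is >= threshold.

    Uses branch and bound: a prefix is dropped as soon as its score plus
    an upper bound on the remaining columns cannot reach the threshold.
    """
    m = len(dipwm)
    # best_suffix[i] = upper bound on the score contributed by columns i..m-1
    best_suffix = [0] * (m + 1)
    for i in range(m - 1, -1, -1):
        best_suffix[i] = best_suffix[i + 1] + max(dipwm[i].values())

    words = []

    def extend(prefix, current):
        i = len(prefix) - 1  # index of the next column to consume
        if i == m:
            words.append(prefix)
            return
        for c in ALPHABET:
            s = current + dipwm[i][prefix[-1] + c]
            if s + best_suffix[i + 1] >= threshold:
                extend(prefix + c, s)

    for c in ALPHABET:
        extend(c, 0)
    return words


def search(dipwm, sequence, threshold):
    """Report (position, word) for every window that is a matching word."""
    words = set(enumerate_words(dipwm, threshold))
    k = len(dipwm) + 1
    return [(pos, sequence[pos:pos + k])
            for pos in range(len(sequence) - k + 1)
            if sequence[pos:pos + k] in words]


def make_column(favoured):
    """Hypothetical helper: weight -1 everywhere except the given dinucleotides."""
    col = {a + b: -1 for a, b in product(ALPHABET, repeat=2)}
    col.update(favoured)
    return col


# Toy di-PWM with two columns, i.e. a motif of length 3 (values are made up).
toy = [make_column({"AC": 2, "GC": 1}), make_column({"CG": 2, "CT": 1})]

print(enumerate_words(toy, 3))
print(search(toy, "TTACGGCGACT", 3))
```

Enumerating words first pays off when the threshold is selective, because the sequence is then scanned against a small word set instead of being rescored window by window. A production implementation would typically replace the set lookup with a multi-pattern automaton.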

Identifiers

pubmed: 37010504
pii: 7100340
doi: 10.1093/bioinformatics/btad141
pmc: PMC10081870

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2023. Published by Oxford University Press.

Authors

Marie Mille (M)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.

Julie Ripoll (J)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.

Bastien Cazaux (B)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.

Eric Rivals (E)

LIRMM, Univ Montpellier, CNRS, Montpellier, France.
Institut Français de Bioinformatique, CNRS UAR 3601, Évry, France.

