RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics.


Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
28 03 2022
History:
received: 30 08 2021
revised: 06 12 2021
accepted: 07 01 2022
pubmed: 13 1 2022
medline: 3 2 2023
entrez: 12 1 2022
Status: ppublish

Abstract

Density Peaks is a widely used clustering algorithm that has previously been applied to Molecular Dynamics (MD) simulations. Its conception of cluster centers as elements displaying both a high density of neighbors and a large distance to other elements of high density particularly fits the nature of a geometrically converged MD simulation. Despite its theoretical convenience, implementations of Density Peaks carry a quadratic memory complexity that only permits the analysis of relatively short trajectories. Here, we describe DP+, a novel exact implementation of Density Peaks that drastically reduces RAM consumption compared with the few available alternatives designed for MD. Based on DP+, we developed RCDPeaks, a refined variant of the original Density Peaks algorithm. Through the use of DP+, RCDPeaks was able to cluster a one-million-frame trajectory using less than 4.5 GB of RAM, a task that would have taken more than 2 TB and about 3× more time with the fastest and least memory-hungry alternative currently available. Other key features of RCDPeaks include the automatic selection of parameters, the screening of center candidates and the geometrical refining of returned clusters. The source code and documentation of RCDPeaks are free and publicly available on GitHub (https://github.com/LQCT/RCDPeaks.git). Supplementary data are available at Bioinformatics online.
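
The idea of cluster centers summarized in the abstract, and the quadratic memory cost it mentions, can be illustrated with a minimal sketch. This is a naive textbook-style version of Density Peaks, not DP+ or RCDPeaks: it deliberately stores the full pairwise distance matrix that DP+ is said to avoid. The function names `density_peaks` and `condensed_matrix_bytes`, the parameters `cutoff` and `n_centers`, the use of Euclidean distances between points (rather than a structural distance between MD frames), and the choice of centers by the product of density and distance are all assumptions made for illustration.

```python
import numpy as np


def density_peaks(points, cutoff, n_centers):
    """Naive Density Peaks sketch that keeps the full N x N distance matrix."""
    n = len(points)
    # Full pairwise Euclidean distances: this is the O(N^2) memory bottleneck.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Local density rho: neighbors closer than the cutoff, excluding the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1

    # Visit points from densest to sparsest (stable sort makes ties deterministic).
    order = np.argsort(-rho, kind="stable")

    # delta: distance to the nearest point ranked denser; the densest point
    # conventionally receives the largest distance in the data set.
    delta = np.empty(n)
    nearest_denser = np.full(n, -1)
    delta[order[0]] = dist.max()
    for rank in range(1, n):
        i = order[rank]
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j

    # Centers combine high rho and high delta; rho * delta is one simple heuristic.
    gamma = rho * delta
    centers = order[np.argsort(-gamma[order], kind="stable")[:n_centers]]

    # Every other point inherits the label of its nearest denser neighbor.
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_centers)
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return centers, labels, rho, delta


def condensed_matrix_bytes(n_frames, bytes_per_value=4):
    """Bytes needed to store the upper triangle of an N x N distance matrix."""
    return n_frames * (n_frames - 1) // 2 * bytes_per_value


# Two small, well-separated groups of points standing in for trajectory frames.
square = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [0.05, 0.05]])
points = np.vstack([square, square + 5.0])
centers, labels, rho, delta = density_peaks(points, cutoff=0.08, n_centers=2)
```

Under the assumption of single-precision storage, `condensed_matrix_bytes(1_000_000)` gives roughly 2 × 10^12 bytes, which is the same order of magnitude as the memory figure quoted in the abstract for a one-million-frame trajectory.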

Identifiers

pubmed: 35020783
pii: 6502276
doi: 10.1093/bioinformatics/btac021

Publication types

Journal Article
Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

1863-1869

Grants

Agency: Eiffel Scholarship Program of Excellence of Campus France
ID: P744468L
Agency: Project Hubert Curien-Carlos J. Finlay
ID: 41814TM
Agency: Fondo Nacional de Desarrollo Científico y Tecnológico
ID: 3170107

Copyright information

© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Daniel Platero-Rochart (D)

Departamento de Química-Física, Laboratorio de Química Computacional y Teórica (LQCT), Facultad de Química, Universidad de La Habana, La Habana 10400, Cuba.

Roy González-Alemán (R)

Departamento de Química-Física, Laboratorio de Química Computacional y Teórica (LQCT), Facultad de Química, Universidad de La Habana, La Habana 10400, Cuba.
Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Saclay, Gif-sur-Yvette F-91198, France.

Erix W Hernández-Rodríguez (EW)

Laboratorio de Bioinformática y Química Computacional (LBQC), Facultad de Medicina, Universidad Católica del Maule, Talca 3460000, Chile.
Escuela de Química y Farmacia, Facultad de Medicina, Universidad Católica del Maule, Talca 3460000, Chile.

Fabrice Leclerc (F)

Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris Saclay, Gif-sur-Yvette F-91198, France.

Julio Caballero (J)

Departamento de Bioinformática, Facultad de Ingeniería, Centro de Bioinformática, Simulación y Modelado (CBSM), Universidad de Talca, Talca 3460000, Chile.

Luis Montero-Cabrera (L)

Departamento de Química-Física, Laboratorio de Química Computacional y Teórica (LQCT), Facultad de Química, Universidad de La Habana, La Habana 10400, Cuba.
