RCDPeaks: memory-efficient density peaks clustering of long molecular dynamics.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date:
28 03 2022
History:
received: 30 08 2021
revised: 06 12 2021
accepted: 07 01 2022
pubmed: 13 1 2022
medline: 3 2 2023
entrez: 12 1 2022
Status:
ppublish
Abstract
Density Peaks is a widely used clustering algorithm that has previously been applied to Molecular Dynamics (MD) simulations. Its conception of cluster centers as elements displaying both a high density of neighbors and a large distance to other elements of high density is particularly well suited to the nature of a geometrically converged MD simulation. Despite its theoretical convenience, implementations of Density Peaks carry a quadratic memory complexity that only permits the analysis of relatively short trajectories. Here, we describe DP+, a novel exact implementation of Density Peaks that drastically reduces RAM consumption in comparison to the few available alternatives designed for MD. Based on DP+, we developed RCDPeaks, a refined variant of the original Density Peaks algorithm. Through the use of DP+, RCDPeaks was able to cluster a one-million-frame trajectory using less than 4.5 GB of RAM, a task that would have taken more than 2 TB and about 3× more time with the fastest and least memory-hungry alternative currently available. Other key features of RCDPeaks include the automatic selection of parameters, the screening of center candidates and the geometrical refinement of returned clusters. The source code and documentation of RCDPeaks are free and publicly available on GitHub (https://github.com/LQCT/RCDPeaks.git). Supplementary data are available at Bioinformatics online.
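The cluster-center criterion described in the abstract (high neighbor density combined with a large distance to any element of higher density) can be illustrated with a minimal sketch of the classic Density Peaks quantities. This is not DP+ or RCDPeaks: it deliberately builds the full pairwise distance matrix, which is the quadratic-memory step the abstract refers to. The function names, the Euclidean metric (MD clustering would typically use RMSD between frames), the cutoff value, and the float32 condensed-storage assumption in the memory estimate are all illustrative choices, not taken from the paper.

```python
import numpy as np


def density_peaks(points, cutoff):
    """Return (rho, delta) per point from a full pairwise distance matrix.

    Illustrative only: Euclidean distance stands in for the RMSD metric
    that an MD clustering tool would normally use.
    """
    # Full n x n distance matrix: the quadratic-memory step.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # rho: number of neighbors closer than the cutoff, excluding the point itself.
    rho = (dist < cutoff).sum(axis=1) - 1

    # delta: distance to the nearest point of strictly higher density.
    delta = np.empty(len(points))
    for i in range(len(points)):
        higher = rho > rho[i]
        if higher.any():
            delta[i] = dist[i, higher].min()
        else:
            # Usual convention for the densest point(s): largest distance in the row.
            delta[i] = dist[i].max()
    return rho, delta


def condensed_matrix_bytes(n_frames, itemsize=4):
    """Bytes needed to store the n(n-1)/2 unique pairwise distances.

    Assumes float32 values (itemsize=4) in condensed form; real tools may
    store distances differently.
    """
    return n_frames * (n_frames - 1) // 2 * itemsize


# Toy data: two tight groups far apart (hypothetical coordinates).
points = np.array([
    [0.0, 0.0], [0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1],
    [10.0, 10.0], [10.1, 10.0], [9.9, 10.0], [10.0, 10.1],
])
rho, delta = density_peaks(points, cutoff=0.12)

# Center candidates are the points where both rho and delta are large;
# ranking by their product is one simple way to pick them.
centers = np.argsort(rho * delta)[-2:]
```

Under the stated storage assumption, `condensed_matrix_bytes(1_000_000)` evaluates to about 2 × 10^12 bytes, which is of the same order as the 2 TB figure quoted in the abstract; how the compared tools actually store distances is not specified here.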
Identifiers
pubmed: 35020783
pii: 6502276
doi: 10.1093/bioinformatics/btac021
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1863-1869
Grants
Agency: Eiffel Scholarship Program of Excellence of Campus France
ID: P744468L
Agency: Project Hubert Curien-Carlos J. Finlay
ID: 41814TM
Agency: Fondo Nacional de Desarrollo Científico y Tecnológico
ID: 3170107
Copyright information
© The Author(s) 2022. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.