PySFD: comprehensive molecular insights from significant feature differences detected among many simulated ensembles.
Journal
Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944
Publication information
Publication date: 01 05 2019
History:
received: 18 04 2018
revised: 09 07 2018
accepted: 20 09 2018
pubmed: 25 09 2018
medline: 19 05 2020
entrez: 25 09 2018
Status: ppublish
Abstract
Many modeling analyses of molecular dynamics (MD) simulations are based on a definition of states that can be (groups of) clusters of simulation frames in a feature space composed of molecular coordinates. As the dimension of this feature space grows (due to the increasing size or complexity of a simulated molecule), it becomes very difficult to cluster the underlying MD data and estimate a statistically robust model. To mitigate this "curse of dimensionality", one can reduce the feature space, e.g., with principal component or time-lagged independent component analysis transformations, focusing the analysis on the most important modes of transitions. In practice, however, all these reduction strategies may neglect important molecular details that are amenable to experimental verification. To recover such molecular details, I have developed PySFD (Significant Feature Differences analyzer for Python), a multi-processing software package that efficiently selects significantly different features of any user-defined feature type among potentially many different simulated state ensembles, such as meta-stable states of a Markov State Model (MSM). Applying PySFD to MSMs of an aggregate of 300 microseconds of MD simulations recently performed on the major histocompatibility complex class II (MHCII) protein, I demonstrate how this toolkit can extract and visualize valuable mechanistic information from big MD simulation data, e.g., in the form of networks of dynamic interaction changes connecting functionally relevant sites of a protein complex. PySFD is freely available under the L-GPL license at https://github.com/markovmodel/PySFD. Supplementary data are available at Bioinformatics online.
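The idea of selecting significantly different features between simulated ensembles can be illustrated with a minimal NumPy sketch. It is not PySFD's API or its actual statistical procedure, neither of which is described in this record. The function name `significant_feature_differences`, the input layout (one row per independent replica, one column per feature), and the `num_sigma` threshold are all assumptions made for illustration.

```python
import numpy as np


def significant_feature_differences(ens_a, ens_b, num_sigma=2.0):
    """Flag features whose means differ between two ensembles.

    Hypothetical helper, not part of PySFD. Each input has shape
    (n_replicas, n_features): one row per independent simulation
    replica, one column per feature (e.g. a contact frequency).
    """
    ens_a = np.asarray(ens_a, dtype=float)
    ens_b = np.asarray(ens_b, dtype=float)

    # Per-feature mean over replicas.
    mean_a = ens_a.mean(axis=0)
    mean_b = ens_b.mean(axis=0)

    # Per-feature standard error of the mean (sample std, ddof=1).
    sem_a = ens_a.std(axis=0, ddof=1) / np.sqrt(ens_a.shape[0])
    sem_b = ens_b.std(axis=0, ddof=1) / np.sqrt(ens_b.shape[0])

    # A feature is flagged when the difference of means exceeds
    # num_sigma times the combined standard error (assumed criterion).
    diff = mean_a - mean_b
    combined_sem = np.sqrt(sem_a**2 + sem_b**2)
    significant = np.abs(diff) > num_sigma * combined_sem
    return diff, significant


# Toy data: 3 replicas, 2 features; only the second feature is shifted.
ens_a = np.array([[0.0, 1.0], [0.2, 1.1], [0.1, 0.9]])
ens_b = np.array([[0.1, 2.0], [0.0, 2.1], [0.2, 1.9]])
diff, significant = significant_feature_differences(ens_a, ens_b)
```

With many ensembles, the same comparison could be repeated for every pair; a real analysis would also need to account for multiple testing, which this sketch ignores.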
Identifiers
pubmed: 30247628
pii: 5104940
doi: 10.1093/bioinformatics/bty818
pmc: PMC6499238
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1588-1590
Copyright information
© The Author(s) 2018. Published by Oxford University Press.