Top-down machine learning approach for high-throughput single-molecule analysis.

HCN channels; cooperativity; divisive segmentation; human; molecular biophysics; structural biology; unsupervised analysis; zero mode waveguides

Journal

eLife
ISSN: 2050-084X
Abbreviated title: Elife
Country: England
NLM ID: 101579614

Publication information

Publication date:
8 April 2020
History:
received: 6 November 2019
accepted: 8 April 2020
pubmed: 9 April 2020
medline: 23 February 2021
entrez: 9 April 2020
Status: epublish

Abstract

Single-molecule approaches provide enormous insight into the dynamics of biomolecules, but adequately sampling distributions of states and events often requires extensive sampling. Although emerging experimental techniques can generate such large datasets, existing analysis tools are not suitable to process the large volume of data obtained in high-throughput paradigms. Here, we present a new analysis platform (DISC) that accelerates unsupervised analysis of single-molecule trajectories. By merging model-free statistical learning with the Viterbi algorithm, DISC idealizes single-molecule trajectories up to three orders of magnitude faster with improved accuracy compared to other commonly used algorithms. Further, we demonstrate the utility of the DISC algorithm to probe cooperativity between multiple binding events in the cyclic nucleotide binding domains of HCN pacemaker channels. Given the flexible and efficient nature of DISC, we anticipate it will be a powerful tool for unsupervised processing of high-throughput data across a range of single-molecule experiments.
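
The abstract names the Viterbi algorithm as one ingredient of DISC. As a rough illustration only, and not the DISC implementation, the sketch below idealizes a noisy one-dimensional trace with a textbook Viterbi pass over a Gaussian-emission hidden Markov model. It assumes the state levels (`means`), a shared noise level (`sigma`), and a self-transition probability (`p_stay`) are already known; the function name, these parameter names, and the default value are hypothetical choices made for this example.

```python
import math
from typing import List, Sequence


def viterbi_idealize(
    trace: Sequence[float],
    means: Sequence[float],
    sigma: float,
    p_stay: float = 0.95,  # assumed self-transition probability, must lie in (0, 1)
) -> List[int]:
    """Return the most likely state index for every sample of the trace."""
    if not trace:
        return []
    n_states = len(means)
    if n_states == 1:
        return [0] * len(trace)

    log_stay = math.log(p_stay)
    # Remaining probability is shared equally among the other states.
    log_switch = math.log((1.0 - p_stay) / (n_states - 1))

    def log_emit(x: float, mu: float) -> float:
        # Gaussian log-likelihood, dropping a constant common to all states.
        return -0.5 * ((x - mu) / sigma) ** 2

    # Initialization with a uniform prior over states.
    score = [log_emit(trace[0], mu) for mu in means]
    backpointers: List[List[int]] = []

    # Forward pass: keep the best predecessor of each state at each sample.
    for x in trace[1:]:
        new_score: List[float] = []
        pointers: List[int] = []
        for j, mu in enumerate(means):
            candidates = [
                score[i] + (log_stay if i == j else log_switch)
                for i in range(n_states)
            ]
            best_i = max(range(n_states), key=candidates.__getitem__)
            new_score.append(candidates[best_i] + log_emit(x, mu))
            pointers.append(best_i)
        backpointers.append(pointers)
        score = new_score

    # Backtracking from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    noisy = [0.0, 0.1, -0.1, 1.0, 0.9, 1.1, 0.0, 0.1]
    print(viterbi_idealize(noisy, means=[0.0, 1.0], sigma=0.2))
    # prints [0, 0, 0, 1, 1, 1, 0, 0]
```

Because leaving a state is penalized through `p_stay`, a single moderately noisy sample does not trigger a state change, which is the behavior one wants when idealizing a trajectory.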

Other abstracts

Type: plain-language-summary (eng)
During a chemical or biological process, a molecule may transition through a series of states, many of which are rare or short-lived. Advances in technology have made it easier to detect these states by gathering large amounts of data on individual molecules. However, the increasing size of these datasets has put a strain on the algorithms and software used to identify different molecular states. Now, White et al. have developed a new algorithm called DISC which overcomes this technical limitation. Unlike most other algorithms, DISC requires minimal input from the user and uses a new method to group the data into categories that represent distinct molecular states. Although this new approach produces a similar end-result, it reaches this conclusion much faster than more commonly used algorithms. To test the effectiveness of the algorithm, White et al. studied how individual molecules of a chemical known as cAMP bind to parts of proteins called cyclic nucleotide binding domains (or CNBDs for short). A fluorescent tag was attached to single molecules of cAMP and data were collected on the behavior of each molecule. Previous evidence suggested that when four CNBDs join together to form a so-called tetramer complex, this affects the binding of cAMP. Using the DISC system, White et al. showed that individual cAMP molecules interact with all four domains in a similar way, suggesting that the binding of cAMP is not impacted by the formation of a tetramer complex. Analyzing these data took DISC less than 20 minutes, whereas existing algorithms took anywhere between four hours and two weeks to complete. The enhanced speed of the DISC algorithm could make it easier to analyze much larger datasets from other techniques in addition to fluorescence. This means that a greater number of states can be sampled, providing a deeper insight into the inner workings of biological and chemical processes.
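
The summary describes grouping data into categories that represent distinct states, and the record's keywords mention divisive segmentation. The sketch below shows one generic way a top-down (divisive) grouping can work: start with all samples in one group, split it in two with 1-D 2-means, and keep the split only if a Bayesian information criterion (BIC) computed from a Gaussian mixture improves. This is a simplified illustration under those assumptions, not the published DISC procedure; the function names and the `min_size` parameter are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def mixture_bic(groups: Sequence[Sequence[float]]) -> float:
    """BIC of a Gaussian mixture with one component per group (lower is better)."""
    n = sum(len(g) for g in groups)
    # Weight, mean and standard deviation per component (floored to avoid zero width).
    comps = [(len(g) / n, mean(g), max(pstdev(g), 1e-6)) for g in groups]
    log_lik = 0.0
    for g in groups:
        for x in g:
            density = sum(
                w * math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))
                for w, mu, sd in comps
            )
            log_lik += math.log(max(density, 1e-300))
    n_params = 3 * len(groups) - 1  # means, standard deviations, free weights
    return -2.0 * log_lik + n_params * math.log(n)


def two_means(values: Sequence[float], n_iter: int = 50) -> Tuple[List[float], List[float]]:
    """Split 1-D values into a low and a high group with 2-means."""
    c_low, c_high = min(values), max(values)
    low, high = list(values), []
    for _ in range(n_iter):
        midpoint = 0.5 * (c_low + c_high)
        low = [v for v in values if v <= midpoint]
        high = [v for v in values if v > midpoint]
        if not low or not high:
            break
        new_low, new_high = mean(low), mean(high)
        if (new_low, new_high) == (c_low, c_high):
            break  # centers stopped moving
        c_low, c_high = new_low, new_high
    return low, high


def divisive_levels(values: Sequence[float], min_size: int = 5) -> List[float]:
    """Recursively split non-empty data top-down; return the mean of each final group."""
    values = list(values)
    if len(values) < 2 * min_size:
        return [mean(values)]
    low, high = two_means(values)
    if len(low) < min_size or len(high) < min_size:
        return [mean(values)]
    # Keep the split only when the two-component model has the lower BIC.
    if mixture_bic([low, high]) >= mixture_bic([values]):
        return [mean(values)]
    return divisive_levels(low, min_size) + divisive_levels(high, min_size)


if __name__ == "__main__":
    low_state = [0.0, 0.1, -0.1, 0.05, -0.05, 0.02, -0.02, 0.0]
    high_state = [1.0, 1.1, 0.9, 1.05, 0.95, 1.02, 0.98, 1.0]
    print(divisive_levels(low_state + high_state))
```

The levels returned by a step like this could then serve as the state means for a sequence-aware pass such as the Viterbi algorithm mentioned in the abstract.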

Identifiers

pubmed: 32267232
doi: 10.7554/eLife.53357
pii: 53357
pmc: PMC7205464

Chemical substances

Fluorescent Dyes 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Grants

Agency: National Science Foundation
ID: CHE-1856518
Country: International
Agency: NINDS NIH HHS
ID: NS-081293
Country: United States
Agency: NINDS NIH HHS
ID: R35 NS116850
Country: United States
Agency: NINDS NIH HHS
ID: NS-101723
Country: United States
Agency: NIGMS NIH HHS
ID: R21 GM127957
Country: United States
Agency: NIGMS NIH HHS
ID: GM007507
Country: United States
Agency: NIGMS NIH HHS
ID: GM127957
Country: United States
Agency: NINDS NIH HHS
ID: NS-081320
Country: United States

Copyright information

© 2020, White et al.

Conflict of interest statement

DW, MG, RG: No competing interests declared. BC: Reviewing editor, eLife.

Authors

David S White (DS)

Department of Neuroscience, University of Wisconsin-Madison, Madison, United States.
Department of Chemistry, University of Wisconsin-Madison, Madison, United States.

Marcel P Goldschen-Ohm (MP)

Department of Neuroscience, University of Texas at Austin, Austin, United States.

Randall H Goldsmith (RH)

Department of Chemistry, University of Wisconsin-Madison, Madison, United States.

Baron Chanda (B)

Department of Neuroscience, University of Wisconsin-Madison, Madison, United States.
Department of Biomolecular Chemistry, University of Wisconsin-Madison, Madison, United States.
