High-throughput cryo-ET structural pattern mining by unsupervised deep iterative subtomogram clustering.


Journal

Proceedings of the National Academy of Sciences of the United States of America
ISSN: 1091-6490
Abbreviated title: Proc Natl Acad Sci U S A
Country: United States
NLM ID: 7505876

Publication information

Publication date:
2023-04-11
History:
medline: 2023-04-11
entrez: 2023-04-07
pubmed: 2023-04-08
Status: ppublish

Abstract

Cryoelectron tomography directly visualizes heterogeneous macromolecular structures in their native and complex cellular environments. However, existing computer-assisted structure sorting approaches are low throughput or inherently limited due to their dependency on available templates and manual labels. Here, we introduce a high-throughput template-and-label-free deep learning approach, Deep Iterative Subtomogram Clustering Approach (DISCA), that automatically detects subsets of homogeneous structures by learning and modeling 3D structural features and their distributions. Evaluation on five experimental cryo-ET datasets shows that an unsupervised deep learning-based method can detect diverse structures with a wide range of molecular sizes. This unsupervised detection paves the way for systematic unbiased recognition of macromolecular complexes in situ.

Identifiers

pubmed: 37027429
doi: 10.1073/pnas.2213149120
pmc: PMC10104553

Chemical substances

Macromolecular Substances 0

Publication types

Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

e2213149120

Grants

Agency: NIGMS NIH HHS
ID: P41 GM103712
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM134020
Country: United States

Authors

Xiangrui Zeng (X)

Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213.

Anson Kahng (A)

Computer Science Department, University of Rochester, Rochester, NY 14620.

Liang Xue (L)

Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany.
Faculty of Biosciences, Collaboration for joint PhD degree between European Molecular Biology Laboratory and Heidelberg University, Heidelberg 69117, Germany.

Julia Mahamid (J)

Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany.

Yi-Wei Chang (YW)

Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104.

Min Xu (M)

Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213.
