ModelCIF: An Extension of PDBx/mmCIF Data Representation for Computed Structure Models.

Computed Structure Models; Data Standard; ModelCIF; PDBx/mmCIF; Protein Structure Prediction

Journal

Journal of molecular biology
ISSN: 1089-8638
Abbreviated title: J Mol Biol
Country: Netherlands
NLM ID: 2985088R

Publication information

Publication date:
15 07 2023
History:
received: 29 11 2022
revised: 15 02 2023
accepted: 16 02 2023
pmc-release: 15 07 2024
medline: 27 6 2023
pubmed: 25 2 2023
entrez: 24 2 2023
Status: ppublish

Abstract

ModelCIF (github.com/ihmwg/ModelCIF) is a data information framework developed for and by computational structural biologists to enable delivery of Findable, Accessible, Interoperable, and Reusable (FAIR) data to users worldwide. ModelCIF describes the specific set of attributes and metadata associated with macromolecular structures modeled by solely computational methods and provides an extensible data representation for deposition, archiving, and public dissemination of predicted three-dimensional (3D) models of macromolecules. It is an extension of the Protein Data Bank Exchange / macromolecular Crystallographic Information Framework (PDBx/mmCIF), which is the global data standard for representing experimentally-determined 3D structures of macromolecules and associated metadata. The PDBx/mmCIF framework and its extensions (e.g., ModelCIF) are managed by the Worldwide Protein Data Bank partnership (wwPDB, wwpdb.org) in collaboration with relevant community stakeholders such as the wwPDB ModelCIF Working Group (wwpdb.org/task/modelcif). This semantically rich and extensible data framework for representing computed structure models (CSMs) accelerates the pace of scientific discovery. Herein, we describe the architecture, contents, and governance of ModelCIF, and tools and processes for maintaining and extending the data standard. Community tools and software libraries that support ModelCIF are also described.
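To make the idea of an extension of PDBx/mmCIF more concrete, here is a minimal sketch of how the `_category.item value` pairs used by mmCIF-style files can be read. The fragment, its values, and the helper `parse_simple_cif` are hypothetical and written only for illustration. The sketch assumes single-line items and ignores `loop_` tables and multi-line strings, so it is not a substitute for a dedicated mmCIF or ModelCIF library.

```python
from collections import defaultdict

# Illustrative fragment only: the block name and all values are made up.
SAMPLE = """\
data_example_model
_entry.id example_model
_struct.title 'Hypothetical computed structure model'
_ma_qa_metric_global.metric_value 92.5
"""


def parse_simple_cif(text):
    """Parse single-line `_category.item value` pairs into nested dicts.

    Handles only the simplest mmCIF syntax: no loop_ tables, no
    semicolon-delimited multi-line strings, no trailing comments.
    """
    block = None
    data = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("data_"):
            # Name of the data block, e.g. "example_model".
            block = line[len("data_"):]
            continue
        if line.startswith("_"):
            parts = line.split(None, 1)
            if len(parts) != 2:
                continue  # value on a following line: out of scope here
            tag, value = parts
            # "_struct.title" -> category "struct", item "title".
            category, _, item = tag[1:].partition(".")
            value = value.strip()
            # Drop one pair of matching surrounding quotes, if present.
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]
            data[category][item] = value
    return block, dict(data)


block, data = parse_simple_cif(SAMPLE)
print(block)
print(data["struct"]["title"])
print(float(data["ma_qa_metric_global"]["metric_value"]))
```

In this sketch, categories from the base dictionary (`entry`, `struct`) and a category with the `ma_` prefix sit side by side in the same data block, which is the sense in which an extension dictionary adds to, rather than replaces, the base representation.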

Identifiers

pubmed: 36828268
pii: S0022-2836(23)00077-3
doi: 10.1016/j.jmb.2023.168021
pmc: PMC10293049
mid: NIHMS1877877

Chemical substances

Macromolecular Substances 0

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, Non-P.H.S.

Languages

eng

Citation subsets

IM

Pagination

168021

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BB/S020071/1
Country: United Kingdom
Agency: NIGMS NIH HHS
ID: R01 GM083960
Country: United States
Agency: Howard Hughes Medical Institute
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM109046
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM133198
Country: United States
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/W017970/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/S020144/1
Country: United Kingdom
Agency: Biotechnology and Biological Sciences Research Council
ID: BB/V004247/1
Country: United Kingdom
Agency: NIGMS NIH HHS
ID: U01 GM093324
Country: United States
Agency: NIGMS NIH HHS
ID: P41 GM109824
Country: United States

Copyright information

Copyright © 2023 The Author(s). Published by Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of Competing Interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Brinda Vallat (B)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA. Electronic address: brinda.vallat@rcsb.org.

Gerardo Tauriello (G)

Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Stefan Bienert (S)

Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Juergen Haas (J)

Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Benjamin M Webb (BM)

Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA.

Augustin Žídek (A)

DeepMind, London, UK.

Wei Zheng (W)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

Ezra Peisach (E)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.

Dennis W Piehl (DW)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.

Ivan Anischanka (I)

Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA.

Ian Sillitoe (I)

Department of Structural and Molecular Biology, UCL, London, UK.

James Tolchard (J)

AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.

Mihaly Varadi (M)

AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.

David Baker (D)

Department of Biochemistry, and Institute for Protein Design, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.

Christine Orengo (C)

Department of Structural and Molecular Biology, UCL, London, UK.

Yang Zhang (Y)

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.

Jeffrey C Hoch (JC)

Biological Magnetic Resonance Data Bank, Department of Molecular Biology and Biophysics, University of Connecticut, Farmington, CT 06030, USA.

Genji Kurisu (G)

Protein Data Bank Japan, Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan.

Ardan Patwardhan (A)

Electron Microscopy Data Bank, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, UK.

Sameer Velankar (S)

AlphaFold Protein Structure Database, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK; Protein Data Bank in Europe, European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.

Stephen K Burley (SK)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA; Research Collaboratory for Structural Bioinformatics Protein Data Bank, San Diego Supercomputer Center, University of California, La Jolla, CA 92093, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.

Andrej Sali (A)

Department of Bioengineering and Therapeutic Sciences, the Quantitative Biosciences Institute (QBI), and the Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94157, USA. Electronic address: https://twitter.com/salilab_ucsf.

Torsten Schwede (T)

Biozentrum, University of Basel, Basel, Switzerland; Computational Structural Biology, SIB Swiss Institute of Bioinformatics, Basel, Switzerland.

Helen M Berman (HM)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.

John D Westbrook (JD)

Research Collaboratory for Structural Bioinformatics Protein Data Bank, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
