Exploring structural diversity across the protein universe with The Encyclopedia of Domains.


Journal

Science (New York, N.Y.)
ISSN: 1095-9203
Abbreviated title: Science
Country: United States
NLM ID: 0404511

Publication information

Publication date:
Nov 2024
History:
medline: 2024-11-01
pubmed: 2024-11-01
entrez: 2024-10-31
Status: ppublish

Abstract

The AlphaFold Protein Structure Database (AFDB) contains more than 214 million predicted protein structures composed of domains, which are independently folding units found in multiple structural and functional contexts. Identifying domains can enable many functional and evolutionary analyses but has remained challenging because of the sheer scale of the data. Using deep learning methods, we have detected and classified every domain in the AFDB, producing The Encyclopedia of Domains. We detected nearly 365 million domains, over 100 million more than can be found by sequence methods, covering more than 1 million taxa. Reassuringly, 77% of the nonredundant domains are similar to known superfamilies, greatly expanding representation of their domain space. We uncovered more than 10,000 new structural interactions between superfamilies and thousands of new folds across the fold space continuum.

Identifiers

pubmed: 39480926
doi: 10.1126/science.adq4946

Chemical substances

Proteins 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

eadq4946

Authors

Andy M Lau (AM)

Department of Computer Science, University College London, London WC1E 6BT, UK.

Nicola Bordin (N)

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

Shaun M Kandathil (SM)

Department of Computer Science, University College London, London WC1E 6BT, UK.

Ian Sillitoe (I)

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

Vaishali P Waman (VP)

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

Jude Wells (J)

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.
Centre for Artificial Intelligence, University College London, London WC1V 6BH, UK.

Christine A Orengo (CA)

Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

David T Jones (DT)

Department of Computer Science, University College London, London WC1E 6BT, UK.
Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.

MeSH classifications