Chainsaw: protein domain segmentation with fully convolutional neural networks.

Keywords: convolutional neural networks; domain segmentation; protein domains; structure prediction

Journal

Bioinformatics (Oxford, England)
ISSN: 1367-4811
Abbreviated title: Bioinformatics
Country: England
NLM ID: 9808944

Publication information

Publication date:
08 May 2024
History:
received: 02 01 2024
revised: 23 03 2024
accepted: 07 05 2024
medline: 8 5 2024
pubmed: 8 5 2024
entrez: 8 5 2024
Status: aheadofprint

Abstract

Protein domains are fundamental units of protein structure and play a pivotal role in understanding folding, function, evolution, and design. The advent of accurate structure prediction techniques has produced an influx of new structural data, making the partitioning of these structures into domains essential for inferring evolutionary relationships and for functional classification. This paper presents Chainsaw, a supervised learning approach to domain parsing that surpasses the accuracy of current state-of-the-art methods. Chainsaw uses a fully convolutional neural network trained to predict the probability that each pair of residues is in the same domain. Domain predictions are then derived from these pairwise predictions using an algorithm that searches for the most likely assignment of residues to domains given the set of pairwise co-membership probabilities. Chainsaw matches CATH domain annotations in 78% of protein domains versus 72% for the next closest method. When predicting on AlphaFold models, expert human evaluators were twice as likely to prefer Chainsaw's predictions versus the next best method. Code is available at github.com/JudeWells/chainsaw.
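
The step from pairwise co-membership probabilities to a domain assignment can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified greedy search, written under the assumption that an assignment is scored by the log-likelihood of its implied "same domain / different domain" pairs. The function names (assignment_log_likelihood, greedy_assign), the max_domains cap, and the toy probability matrix are all hypothetical and chosen only for illustration.

```python
import math


def assignment_log_likelihood(probs, labels, eps=1e-9):
    """Log-likelihood of a labelling given pairwise co-membership probabilities.

    probs[i][j]: assumed probability that residues i and j share a domain.
    labels[i]:   hypothetical domain index assigned to residue i.
    """
    total = 0.0
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            p = min(max(probs[i][j], eps), 1.0 - eps)  # clamp to avoid log(0)
            total += math.log(p) if labels[i] == labels[j] else math.log(1.0 - p)
    return total


def greedy_assign(probs, max_domains=4, max_sweeps=20):
    """Greedy local search: move one residue at a time to its best label.

    Illustrative stand-in for a likelihood-maximising search; it can stop
    in a local optimum and does not model non-domain residues.
    """
    n = len(probs)
    labels = [0] * n  # start with every residue in a single domain
    best = assignment_log_likelihood(probs, labels)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            current = labels[i]
            for candidate in range(max_domains):
                if candidate == current:
                    continue
                labels[i] = candidate
                score = assignment_log_likelihood(probs, labels)
                if score > best + 1e-12:  # accept strict improvements only
                    best, current, improved = score, candidate, True
            labels[i] = current  # keep the best label found for residue i
        if not improved:
            break
    return labels, best


# Toy input (invented): six residues forming two blocks of three.
toy_probs = [[0.9 if (i < 3) == (j < 3) else 0.1 for j in range(6)]
             for i in range(6)]
labels, score = greedy_assign(toy_probs)
print(labels, round(score, 3))
```

On this toy matrix the search separates the first three residues from the last three. A real parser would also need to handle residues that belong to no domain and would use a more careful search than single-residue moves.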

Identifiers

pubmed: 38718225
pii: 7667299
doi: 10.1093/bioinformatics/btae296

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2024. Published by Oxford University Press.

Authors

Jude Wells (J)

Centre for Artificial Intelligence, University College London, UK.

Alex Hawkins-Hooker (A)

Centre for Artificial Intelligence, University College London, UK.

Nicola Bordin (N)

Institute of Structural and Molecular Biology, University College London, UK.

Ian Sillitoe (I)

Institute of Structural and Molecular Biology, University College London, UK.

Brooks Paige (B)

Centre for Artificial Intelligence, University College London, UK.

Christine Orengo (C)

Institute of Structural and Molecular Biology, University College London, UK.

MeSH classifications