ETLD: an encoder-transformation layer-decoder architecture for protein contact and mutation effects prediction.

Keywords: contact prediction; encoder-transformation layer-decoder (ETLD) model; multiple sequence alignments (MSAs); mutation effects prediction; transformation matrix

Journal

Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837

Publication information

Publication date:
2023-09-20
History:
received: 2023-04-12
revised: 2023-06-21
accepted: 2023-07-26
medline: 2023-09-25
pubmed: 2023-08-20
entrez: 2023-08-20
Status: ppublish

Abstract

Latent features extracted from the multiple sequence alignments (MSAs) of homologous protein families are useful for identifying residue-residue contacts, predicting mutation effects, and studying protein evolution. Over the past three decades, a growing body of supervised and unsupervised machine learning methods has been applied to this field, with fruitful results. Here, we propose a novel self-supervised model, the encoder-transformation layer-decoder (ETLD) architecture, which captures latent features of protein sequences directly from MSAs. Compared with a typical autoencoder, ETLD introduces a transformation layer that learns inter-site couplings; these couplings can be parsed into a two-dimensional residue-residue contact map, either through a simple mathematical derivation or with an additional supervised neural network. ETLD retains the encoding and decoding of sequences, so the predicted amino acid probabilities at each site can be used to construct mutation landscapes for mutation effect prediction, where ETLD generally outperforms advanced models such as GEMME, DeepSequence and EVmutation. Overall, ETLD is a highly interpretable unsupervised model with great potential for improvement, and it can be combined with supervised methods for more extensive and accurate predictions.
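
The two downstream uses named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the "simple mathematical derivation" or the scoring rule, so the sketch substitutes two techniques that are common in the coevolution literature, namely the Frobenius norm with average product correction (APC) for contacts and a log-odds ratio of per-site probabilities for mutation effects. The alphabet ordering, the array shapes, and the function names `one_hot_msa`, `contact_scores` and `mutation_score` are all assumptions made for illustration.

```python
import numpy as np

# 20 amino acids plus the gap symbol; this ordering is an assumption.
ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"
AA_INDEX = {aa: i for i, aa in enumerate(ALPHABET)}


def one_hot_msa(seqs):
    """Encode N aligned sequences of length L as an (N, L, q) one-hot array."""
    n, length, q = len(seqs), len(seqs[0]), len(ALPHABET)
    x = np.zeros((n, length, q))
    for s, seq in enumerate(seqs):
        for i, aa in enumerate(seq):
            x[s, i, AA_INDEX[aa]] = 1.0
    return x


def contact_scores(couplings):
    """Turn an (L, L, q, q) coupling tensor into an (L, L) contact score map.

    `couplings` is a hypothetical stand-in for whatever inter-site couplings
    a trained model exposes. Uses Frobenius norm + APC, a standard choice in
    direct coupling analysis, not necessarily the derivation used by ETLD.
    """
    f = np.sqrt((couplings ** 2).sum(axis=(2, 3)))  # Frobenius norm per site pair
    f = 0.5 * (f + f.T)                             # enforce symmetry
    np.fill_diagonal(f, 0.0)                        # ignore self-couplings
    n_sites = f.shape[0]
    row_mean = f.sum(axis=1) / (n_sites - 1)
    total_mean = f.sum() / (n_sites * (n_sites - 1))
    if total_mean == 0:                             # all-zero couplings: nothing to correct
        return f
    scores = f - np.outer(row_mean, row_mean) / total_mean  # APC
    np.fill_diagonal(scores, 0.0)
    return scores


def mutation_score(site_probs, wild_type, pos, mutant, eps=1e-9):
    """Log-odds of the mutant versus the wild-type residue at one position.

    `site_probs` is an (L, q) array of predicted amino acid probabilities,
    e.g. a decoder output for the wild-type sequence (an assumption).
    Negative values suggest the mutation is less favourable than wild type.
    """
    p = site_probs[pos]
    return float(np.log(p[AA_INDEX[mutant]] + eps)
                 - np.log(p[AA_INDEX[wild_type[pos]]] + eps))
```

In this sketch a full mutation landscape would be obtained by calling `mutation_score` for every position and every substitution, and the top-ranked off-diagonal entries of `contact_scores` would be read as predicted contacts.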

Identifiers

pubmed: 37598423
pii: 7246470
doi: 10.1093/bib/bbad290

Chemical substances

Proteins 0
Amino Acids 0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Copyright information

© The Author(s) 2023. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

Authors

He Wang (H)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.

Yongjian Zang (Y)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.

Ying Kang (Y)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.

Jianwen Zhang (J)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.

Lei Zhang (L)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.

Shengli Zhang (S)

MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, China.
