GLTM: A Global-Local Attention LSTM Model to Locate Dimer Motif of Single-Pass Membrane Proteins.
Bi-LSTM network
dimer motif
motif localization model
self-attention mechanism
single-pass membrane protein
Journal
Frontiers in genetics
ISSN: 1664-8021
Abbreviated title: Front Genet
Country: Switzerland
NLM ID: 101560621
Publication information
Publication date:
2022
History:
received: 2022-01-14
accepted: 2022-02-14
entrez: 2022-04-04
pubmed: 2022-04-05
medline: 2022-04-05
Status:
epublish
Abstract
Single-pass membrane proteins, which constitute up to 50% of all transmembrane proteins, typically undergo significant conformational changes, such as forming dimers or other oligomers, and characterizing these changes is essential for understanding the function of transmembrane proteins. Identifying the key motifs of oligomers through experimental observation is a routine method in the field for inferring the potential conformations of other members of a transmembrane protein family. However, experimental approaches are costly in time and labor, and they struggle to reveal potential motifs. An alternative is to build an accurate and efficient transmembrane protein oligomer prediction model to screen for key motifs. In this paper, an attention-based global-local LSTM model named GLTM is proposed to predict dimers and screen potential dimer motifs. Unlike traditional motif screening, which searches for highly conserved sequences, GLTM employs a self-attention mechanism to locate the subsequence fragments with the highest dimerization scores, and it is shown to locate most known dimer motifs well. GLTM reaches 97.5% accuracy on a benchmark dataset collected from Membranome2.0. Its three characteristics can be summarized as follows. First, the original sequence fragment is converted into a set of subsequences similar in length to known motifs, and this additional step greatly enhances the capability of capturing motif patterns. Second, to address sample imbalance, a novel data augmentation approach that combines improved one-hot encoding with random subsequence windows is proposed to improve the generalization capability of GLTM. Third, position penalization is taken into account, which makes the self-attention mechanism focus on specific TM fragments.
The experimental results demonstrate that GLTM has broad application prospects for locating potential oligomer motifs and is helpful for preliminary, rapid research on the conformational changes of mutants.
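The three ideas named in the abstract (motif-length subsequence windows, one-hot encoding with random windows, and position-penalized attention) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The sequence is a toy example, the encoding is plain one-hot rather than the paper's "improved" variant (whose details the abstract does not give), and the linear penalty around a chosen center is a hypothetical stand-in for the paper's position penalization. All function and parameter names are assumptions made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def subsequence_windows(seq, width):
    """Return (start, fragment) pairs for every window of `width` residues."""
    return [(i, seq[i:i + width]) for i in range(len(seq) - width + 1)]

def one_hot(fragment):
    """Plain one-hot encoding (assumption: the paper's improved variant differs)."""
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    rows = []
    for aa in fragment:
        row = [0] * len(AMINO_ACIDS)
        if aa in index:  # unknown residues stay all-zero
            row[index[aa]] = 1
        rows.append(row)
    return rows

def random_windows(seq, width, n, rng):
    """Sample n windows at random start positions as a simple augmentation."""
    starts = [rng.randrange(len(seq) - width + 1) for _ in range(n)]
    return [(s, seq[s:s + width]) for s in starts]

def attention_weights(scores, positions, center, penalty):
    """Softmax over scores after a hypothetical linear position penalty."""
    adjusted = [s - penalty * abs(p - center) for s, p in zip(scores, positions)]
    top = max(adjusted)  # subtract the maximum for numerical stability
    exps = [math.exp(a - top) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]

# Toy transmembrane-like sequence, used only for illustration.
SEQ = "ITLIIFGVMAGVIGTILLISYGI"
windows = subsequence_windows(SEQ, 5)
encoded = [one_hot(fragment) for _, fragment in windows]
augmented = random_windows(SEQ, 5, 4, random.Random(0))

# Equal raw scores: the weights then reflect only the position penalty.
positions = [start for start, _ in windows]
weights = attention_weights([0.0] * len(windows), positions, center=9, penalty=0.5)
```

In a trained model the raw scores would come from the network instead of being constant. Here the penalty alone shifts attention toward windows near the chosen center.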
Identifiers
pubmed: 35368690
doi: 10.3389/fgene.2022.854571
pii: 854571
pmc: PMC8965067
Publication types
Journal Article
Languages
eng
Pagination
854571
Copyright information
Copyright © 2022 Ma, Zou, Zhang and Yang.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.