Construct a variable-length fragment library for de novo protein structure prediction.

Algorithms; Cluster Analysis; Databases, Protein; Protein Folding; Protein Structure, Secondary; Proteins / chemistry

de novo protein structure prediction; fragment library; hidden Markov model; secondary structure

Journal

Briefings in bioinformatics

ISSN: 1477-4054

Abbreviated title: Brief Bioinform

Country: England

NLM ID: 100912837

Publication information

Publication date:
13 05 2022

History:

received: 03 01 2022

revised: 10 02 2022

accepted: 20 02 2022

pubmed: 15 3 2022

medline: 24 5 2022

entrez: 14 3 2022

Status: ppublish

Abstract

Although end-to-end structure prediction has seen remarkable achievements, such as AlphaFold2, fragment libraries remain essential for de novo protein structure prediction, which can help explore and understand the protein-folding mechanism. In this work, we developed a variable-length fragment library (VFlib). In VFlib, a master structure database was first constructed from the Protein Data Bank through sequence clustering. The hidden Markov model (HMM) profile of each protein in the master structure database was generated with HHsuite, and the secondary structure of each protein was calculated with DSSP. For a query sequence, the HMM profile was constructed first. Variable-length fragments were then retrieved from the master structure database through dynamic variable-length profile-profile comparison. A complete method for chopping the query HMM profile during this process was proposed to obtain fragments with increased diversity. Finally, secondary structure information was used to further screen the retrieved fragments and generate the final fragment library for the specific query sequence. Experimental results on a set of 120 nonredundant proteins show that the global precision and coverage of the fragment library generated by VFlib were 55.04% and 94.95%, respectively, at an RMSD cutoff of 1.5 Å. Compared with the benchmark method NNMake, the global precision of our fragment library increased by 62.89% with equivalent coverage. Furthermore, the fragments generated by VFlib and NNMake were used to predict structure models through fragment assembly. Controlled experiments demonstrate that the average TM-score of VFlib was 16.00% higher than that of NNMake.
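
The precision and coverage figures quoted in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used for fragment libraries, which the abstract itself does not spell out: precision is the share of fragments whose RMSD to the native structure is below the cutoff, and coverage is the share of query residues spanned by at least one such fragment. The `Fragment` record and the function name are hypothetical and are not taken from VFlib.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Hypothetical record for one library fragment (not VFlib's own format)."""
    start: int    # 0-based index of the first query residue the fragment spans
    length: int   # number of residues; may differ from fragment to fragment
    rmsd: float   # RMSD to the native structure, in angstroms


def precision_and_coverage(fragments, query_length, cutoff=1.5):
    """Return (precision, coverage) under the assumed definitions.

    precision: fraction of fragments with RMSD below the cutoff
    coverage:  fraction of query residues spanned by at least one such fragment
    """
    good = [f for f in fragments if f.rmsd < cutoff]
    precision = len(good) / len(fragments) if fragments else 0.0

    covered = set()
    for f in good:
        # Clip to the query length so malformed fragments cannot inflate coverage.
        covered.update(range(f.start, min(f.start + f.length, query_length)))
    coverage = len(covered) / query_length if query_length else 0.0
    return precision, coverage


# Toy input: four variable-length fragments on a 10-residue query.
toy_library = [
    Fragment(start=0, length=4, rmsd=1.0),
    Fragment(start=2, length=5, rmsd=2.0),
    Fragment(start=6, length=4, rmsd=0.5),
    Fragment(start=5, length=3, rmsd=3.0),
]
precision, coverage = precision_and_coverage(toy_library, query_length=10)
```

For this toy input, two of the four fragments fall below the 1.5 Å cutoff and together span residues 0 to 3 and 6 to 9, so the precision is 0.5 and the coverage is 0.8.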

Identifiers

DOI: 10.1093/bib/bbac086; PMID: 35284936

pubmed: 35284936

pii: 6547572

doi: 10.1093/bib/bbac086

Chemical substances

Proteins 0

Publication types

Journal Article Research Support, Non-U.S. Gov't

Languages

eng

Authors

Qiongqiong Feng

Minghua Hou

Jun Liu

Kailong Zhao

Guijun Zhang
