Repeatability in protein sequences.
Amino acid short tandem repeats
Computational detection of sequence repeats
Homorepeats
Low complexity regions
Repeatability
Web tool
Journal
Journal of structural biology
ISSN: 1095-8657
Abbreviated title: J Struct Biol
Country: United States
NLM ID: 9011206
Publication information
Publication date:
01 11 2019
History:
received: 03 04 2019
revised: 06 08 2019
accepted: 08 08 2019
pubmed: 14 8 2019
medline: 23 6 2020
entrez: 14 8 2019
Status:
ppublish
Abstract
Low complexity regions (LCRs) in protein sequences have special properties that are very different from those of globular proteins. The rules that define secondary structure elements do not apply when the distribution of amino acids becomes biased. While there is a tendency towards structural disorder in LCRs, various examples, and particularly homorepeats of single amino acids, suggest that very short repeats could adopt structures that are very difficult to predict. These structures are possibly variable and dependent on the context of intra- or inter-molecular interactions. In general, short repeats in LCRs can induce structure. This could explain the observation that very short (non-perfect) repeats are widespread and that many define regions with a function in protein interactions. For these reasons, we have developed an algorithm to quickly analyze local repeatability along protein sequences, that is, how close a protein fragment is to a perfect repeat. Using this algorithm we found that the proteins of the yeast Saccharomyces cerevisiae are depleted in short repeats (approximate or not) of odd length, while human proteins are not; that the fish Danio rerio has many proteins with repeats of length two; and that the plant Arabidopsis thaliana has an unusually large number of repeats of length seven. Our method (REpeatability Scanner, RES, accessible at http://cbdm-01.zdv.uni-mainz.de/~munoz/res/) makes it possible to find regions with approximate short repeats in protein sequences, and helps to characterize the variable use of LCRs and compositional bias in different organisms.
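The abstract describes repeatability only informally, as how close a protein fragment is to a perfect repeat. The sketch below shows one simple way such a score could be computed; it is not the RES algorithm, whose details are not given in this record. It assumes that closeness is measured as the fraction of residues identical to the residue one repeat unit further along. The function names `repeatability` and `scan`, the `period` parameter, and the default window width of 20 are hypothetical choices made for illustration.

```python
from __future__ import annotations


def repeatability(fragment: str, period: int) -> float:
    """Fraction of positions i where fragment[i] == fragment[i + period].

    Assumed definition (not taken from the paper): a perfect repeat
    with unit length `period` scores 1.0.
    """
    if period < 1 or period >= len(fragment):
        raise ValueError("period must be between 1 and len(fragment) - 1")
    comparisons = len(fragment) - period
    matches = sum(
        fragment[i] == fragment[i + period] for i in range(comparisons)
    )
    return matches / comparisons


def scan(sequence: str, period: int, window: int = 20) -> list[float]:
    """Score every window of width `window` along the sequence.

    Returns one score per window start position; an empty list if the
    sequence is shorter than the window. `window` must exceed `period`.
    """
    if window > len(sequence):
        return []
    return [
        repeatability(sequence[start:start + window], period)
        for start in range(len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # A homorepeat is a perfect repeat of unit length one.
    print(repeatability("QQQQQQ", 1))
    # An approximate repeat of unit length two scores below 1.
    print(repeatability("SPSPSASP", 2))
```

Under this assumed definition, scanning the same sequence with several values of `period` and comparing the profiles would indicate which repeat unit length best explains a region, which is the kind of per-length comparison the abstract reports across organisms.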
Identifiers
pubmed: 31408700
pii: S1047-8477(19)30173-X
doi: 10.1016/j.jsb.2019.08.003
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
86-91
Copyright information
Copyright © 2019 The Author(s). Published by Elsevier Inc. All rights reserved.