Disentangling the complexity of low complexity proteins.
composition bias
disorder
low complexity regions
structure
Journal
Briefings in bioinformatics
ISSN: 1477-4054
Abbreviated title: Brief Bioinform
Country: England
NLM ID: 100912837
Publication information
Publication date:
23 03 2020
History:
received: 12 11 2018
revised: 19 12 2018
accepted: 07 01 2019
pubmed: 31 01 2019
medline: 18 08 2021
entrez: 31 01 2019
Status:
ppublish
Abstract
There are multiple definitions for low complexity regions (LCRs) in protein sequences, all of which broadly consider LCRs as regions with fewer amino acid types than in an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined use of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable improved predictions and a better understanding of the evolution of LCRs and of the connection between their structure and function. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
Identifiers
pubmed: 30698641
pii: 5299744
doi: 10.1093/bib/bbz007
pmc: PMC7299295
Chemical substances
Proteins
0
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Review
Languages
eng
Citation subsets
IM
Pagination
458-472
Copyright information
© The Author(s) 2019. Published by Oxford University Press.