Differential amino acid usage leads to ubiquitous edge effect in proteomes across domains of life that can be explained by amino acid secondary structure propensities.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
26 Oct 2024
History:
received: 12 Jul 2024
accepted: 21 Oct 2024
medline: 27 Oct 2024
pubmed: 27 Oct 2024
entrez: 27 Oct 2024
Status: epublish

Abstract

Amino acids are the building blocks of proteins and enzymes which are essential for life. Understanding amino acid usage offers insights into protein function and molecular mechanisms underlying life histories. However, genome-wide patterns of amino acid usage across domains of life remain poorly understood. Here, we analysed the proteomes of 5590 species across four domains and found that only a few amino acids are consistently the most and least used. This differential usage results in lower amino acid usage diversity at the most and least frequent ranks, creating a ubiquitous inverted U-shape pattern of amino acid diversity and rank which we call an 'edge effect' across proteomes and domains of life. This effect likely stems from protein secondary structural constraints, not the evolutionary chronology of amino acid incorporation into the genetic code, highlighting the functional rather than evolutionary influences on amino acid usage. We also tested other contemporary hypotheses regarding amino acid usage in proteomes and found that amino acid usage varies across life's domains and is only weakly influenced by growth temperature. Our findings reveal a novel and pervasive amino acid usage pattern across genomes with the potential to help us probe deep evolutionary relationships and advance synthetic biology.
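The rank-based diversity idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`rank_amino_acids`, `shannon_diversity`, `diversity_by_rank`), the choice of Shannon entropy as the diversity index, and the toy sequences are all assumptions made for illustration. For each proteome, the 20 standard amino acids are ordered from most to least used; for each rank position, the diversity of the amino acids occupying that rank across proteomes is then computed. A rank that is filled by the same residue in almost every proteome has diversity close to zero, which is the behaviour the abstract reports at the most and least frequent ranks.

```python
from collections import Counter
from math import log

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def rank_amino_acids(proteome):
    """Order the 20 amino acids from most to least used in one proteome.

    `proteome` is an iterable of protein sequences (strings).
    Non-standard symbols are ignored. Ties are broken alphabetically,
    which is an arbitrary choice made only to keep the output deterministic.
    """
    counts = Counter(res for seq in proteome for res in seq if res in AMINO_ACIDS)
    return sorted(AMINO_ACIDS, key=lambda aa: (-counts[aa], aa))


def shannon_diversity(items):
    """Shannon entropy (natural log) of the categories in `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((n / total) * log(n / total) for n in counts.values())


def diversity_by_rank(proteomes):
    """For each rank (0 = most used), diversity of residues holding that rank."""
    rankings = [rank_amino_acids(p) for p in proteomes]
    return [
        shannon_diversity([ranking[i] for ranking in rankings])
        for i in range(len(AMINO_ACIDS))
    ]


if __name__ == "__main__":
    # Hypothetical toy "proteomes"; real input would be full proteome FASTA files.
    toy = [["LLLAAG"], ["LLLAAS"], ["LLLAAT"]]
    for rank, h in enumerate(diversity_by_rank(toy)[:3]):
        print(rank, round(h, 3))
```

In the toy data, residues that never occur are tied at zero and are ordered alphabetically, so the lower ranks carry no information; with real proteomes every standard residue has a non-zero count and the full profile across the 20 ranks can be inspected for the inverted U-shape.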

Identifiers

pubmed: 39462053
doi: 10.1038/s41598-024-77319-4
pii: 10.1038/s41598-024-77319-4

Chemical substances

Proteome 0
Amino Acids 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

25544

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BB/V015249/1
Country: United Kingdom

Copyright information

© 2024. The Author(s).

Authors

Juliano Morimoto (J)

School of Natural and Computing Sciences, Institute of Mathematics, University of Aberdeen, Fraser Noble Building, Aberdeen, AB24 3UE, UK. juliano.morimoto@abdn.ac.uk.
Programa de Pós-graduação em Ecologia e Conservação, Universidade Federal do Paraná, Curitiba, 82590-300, Brazil. juliano.morimoto@abdn.ac.uk.
Wissenschaftskolleg zu Berlin, 10 Wallotstraße, Berlin, Germany. juliano.morimoto@abdn.ac.uk.

Zuzanna Pietras (Z)

Department of Physics, Chemistry and Biology (IFM), Linköping University, Linköping, Sweden.
