BEAN and HABAS: Polyphyletic insertions in the DNA-directed RNA polymerase.


Journal

Protein science : a publication of the Protein Society
ISSN: 1469-896X
Abbreviated title: Protein Sci
Country: United States
NLM ID: 9211750

Publication information

Publication date:
Nov 2024
History:
received: 18 05 2024
revised: 26 09 2024
accepted: 04 10 2024
medline: 28 10 2024
pubmed: 28 10 2024
entrez: 28 10 2024
Status: ppublish

Abstract

The β and β' subunits of the RNA polymerase (RNAP) are large proteins with complex multi-domain architectures that include several insertional domains. Here, we analyze the domain organizations of RNAP-β and RNAP-β' using sequence, experimentally determined structures and AlphaFold structure predictions. We observe that lineage-specific insertional domains in bacterial RNAP-β belong to a group that we call BEAN (broadly embedded annex). We observe that lineage-specific insertional domains in bacterial RNAP-β' belong to a group that we call HABAS (hammerhead/barrel-sandwich hybrid). The BEAN domain has a characteristic three-dimensional structure composed of two square bracket-like elements that are antiparallel relative to each other. The HABAS domain contains a four-stranded open β-sheet with a GD-box-like motif in one of the β-strands and the adjoining loop. The BEAN domain is inserted not only in the bacterial RNAP-β', but also in the archaeal version of universal ribosomal protein L10. The HABAS domain is inserted in several metabolic proteins. The phylogenetic distributions of bacterial lineage-specific insertional domains of β and β' subunits of RNAP follow the Tree of Life. The presence of insertional domains can help establish a relative timeline of events in the evolution of a protein because insertion is inferred to post-date the base domain. We discuss mechanisms that might account for the discovery of homologous insertional domains in non-equivalent locations in bacteria and archaea.

Identifiers

pubmed: 39467185
doi: 10.1002/pro.5194

Chemical substances

DNA-Directed RNA Polymerases EC 2.7.7.6
Bacterial Proteins 0
RNA polymerase beta subunit EC 2.7.7.6

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

e5194

Grants

Agency: Royal Society
Agency: NASA
ID: 80NSSC24K0344
Country: United States

Copyright information

© 2024 The Author(s). Protein Science published by Wiley Periodicals LLC on behalf of The Protein Society.

Authors

Claudia Alvarez-Carreño (C)

Institute of Structural and Molecular Biology, University College London, London, UK.

Angela T Huynh (AT)

School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA.

Anton S Petrov (AS)

School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA.
NASA Center for the Origin of Life, Georgia Institute of Technology, Atlanta, Georgia, USA.

Christine Orengo (C)

Institute of Structural and Molecular Biology, University College London, London, UK.

Loren Dean Williams (LD)

School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia, USA.
NASA Center for the Origin of Life, Georgia Institute of Technology, Atlanta, Georgia, USA.
