High-throughput protein characterization by complementation using DNA barcoded fragment libraries.
DNA Barcoding
Functional Genomics
High-throughput Characterization
Protein Annotation
Journal
Molecular systems biology
ISSN: 1744-4292
Abbreviated title: Mol Syst Biol
Country: Germany
NLM ID: 101235389
Publication information
Publication date:
07 Oct 2024
History:
received: 08 May 2024
accepted: 09 Sep 2024
revised: 05 Sep 2024
medline: 08 Oct 2024
pubmed: 08 Oct 2024
entrez: 07 Oct 2024
Status:
aheadofprint
Abstract
Our ability to predict, control, or design biological function is fundamentally limited by poorly annotated gene function. This can be particularly challenging in non-model systems. Accordingly, there is motivation for new high-throughput methods for accurate functional annotation. Here, we used complementation of auxotrophs and DNA barcode sequencing (Coaux-Seq) to enable high-throughput characterization of protein function. Fragment libraries from eleven genetically diverse bacteria were tested in twenty different auxotrophic strains of Escherichia coli to identify genes that complement missing biochemical activity. We recovered 41% of expected hits, with effectiveness varying by source genome, and observed success even with distant E. coli relatives like Bacillus subtilis and Bacteroides thetaiotaomicron. Coaux-Seq provided the first experimental validation for 53 proteins, of which 11 are less than 40% identical to an experimentally characterized protein. Among the unexpected functions identified were a sulfate uptake transporter, an O-succinylhomoserine sulfhydrylase for methionine synthesis, and an aminotransferase. We also identified instances of cross-feeding wherein protein overexpression and nearby non-auxotrophic strains enabled growth. Altogether, Coaux-Seq's utility is demonstrated, with future applications in ecology, health, and engineering.
Identifiers
pubmed: 39375541
doi: 10.1038/s44320-024-00068-z
pii: 10.1038/s44320-024-00068-z
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Grants
Agency: U.S. Department of Energy (DOE)
ID: DE-AC02-05CH11231
Agency: HHS | National Institutes of Health (NIH)
ID: NIH S10 OD018174
Copyright information
© 2024. The Author(s).