High-throughput protein characterization by complementation using DNA barcoded fragment libraries.

DNA Barcoding; Functional Genomics; High-throughput Characterization; Protein Annotation

Journal

Molecular systems biology
ISSN: 1744-4292
Abbreviated title: Mol Syst Biol
Country: Germany
NLM ID: 101235389

Publication information

Publication date:
07 Oct 2024
History:
received: 08 05 2024
accepted: 09 09 2024
revised: 05 09 2024
medline: 8 10 2024
pubmed: 8 10 2024
entrez: 7 10 2024
Status: aheadofprint

Abstract

Our ability to predict, control, or design biological function is fundamentally limited by poorly annotated gene function. This limitation can be particularly challenging in non-model systems. Accordingly, there is motivation for new high-throughput methods for accurate functional annotation. Here, we used complementation of auxotrophs and DNA barcode sequencing (Coaux-Seq) to enable high-throughput characterization of protein function. Fragment libraries from eleven genetically diverse bacteria were tested in twenty different auxotrophic strains of Escherichia coli to identify genes that complement missing biochemical activity. We recovered 41% of expected hits, with effectiveness varying by source genome, and observed success even with distant E. coli relatives like Bacillus subtilis and Bacteroides thetaiotaomicron. Coaux-Seq provided the first experimental validation for 53 proteins, of which 11 are less than 40% identical to an experimentally characterized protein. Among the unexpected functions identified were a sulfate uptake transporter, an O-succinylhomoserine sulfhydrylase for methionine synthesis, and an aminotransferase. We also identified instances of cross-feeding wherein protein overexpression and nearby non-auxotrophic strains enabled growth. Altogether, these results demonstrate Coaux-Seq's utility, with future applications in ecology, health, and engineering.
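
The abstract does not describe how barcode counts are turned into complementation hits, so the following is only a minimal sketch of one plausible approach, not the authors' pipeline. It assumes that each barcode has been mapped to the genes carried on its fragment and that barcode read counts are available before and after growth of an auxotroph on selective medium. The function names, the pseudocount, the log2 threshold, and the requirement of two independent barcodes are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def barcode_log2_enrichment(before, after, pseudocount=1.0):
    """Log2 change in each barcode's read fraction after selection.

    `before` and `after` map barcode -> read count. A pseudocount
    (an assumed value) avoids division by zero for missing barcodes.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    scores = {}
    for barcode, count in before.items():
        frac_before = (count + pseudocount) / (total_before + pseudocount)
        frac_after = (after.get(barcode, 0) + pseudocount) / (total_after + pseudocount)
        scores[barcode] = math.log2(frac_after / frac_before)
    return scores


def candidate_genes(scores, barcode_to_genes, threshold=2.0, min_barcodes=2):
    """Genes present on at least `min_barcodes` enriched fragments.

    Requiring several independent barcodes is an illustrative guard
    against a single fragment being enriched by chance.
    """
    support = defaultdict(set)
    for barcode, score in scores.items():
        if score >= threshold:
            for gene in barcode_to_genes.get(barcode, ()):
                support[gene].add(barcode)
    return sorted(g for g, bcs in support.items() if len(bcs) >= min_barcodes)


# Toy data (invented for illustration): two fragments carrying geneA
# take over the population after selection.
before = {"BC1": 10, "BC2": 10, "BC3": 490, "BC4": 490}
after = {"BC1": 500, "BC2": 480, "BC3": 10, "BC4": 10}
barcode_to_genes = {
    "BC1": ["geneA", "geneB"],
    "BC2": ["geneA"],
    "BC3": ["geneC"],
    "BC4": ["geneB"],
}

scores = barcode_log2_enrichment(before, after)
hits = candidate_genes(scores, barcode_to_genes)
print(hits)
```

In this toy example only geneA is reported, because geneB is supported by just one enriched fragment. A real analysis would also need replicates and a way to separate true complementation from the cross-feeding effects the abstract mentions.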

Identifiers

pubmed: 39375541
doi: 10.1038/s44320-024-00068-z
pii: 10.1038/s44320-024-00068-z

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Grants

Agency: U.S. Department of Energy (DOE)
ID: DE-AC02-05CH11231
Agency: HHS | National Institutes of Health (NIH)
ID: NIH S10 OD018174

Copyright information

© 2024. The Author(s).

Authors

Bradley W Biggs (BW)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Morgan N Price (MN)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Dexter Lai (D)

Department of Bioengineering, University of California-Berkeley, Berkeley, CA, 94720, USA.

Jasmine Escobedo (J)

Department of Bioengineering, University of California-Berkeley, Berkeley, CA, 94720, USA.

Yuridia Fortanel (Y)

Department of Bioengineering, University of California-Berkeley, Berkeley, CA, 94720, USA.

Yolanda Y Huang (YY)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Kyoungmin Kim (K)

Department of Bioengineering, University of California-Berkeley, Berkeley, CA, 94720, USA.

Valentine V Trotter (VV)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Jennifer V Kuehl (JV)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Lauren M Lui (LM)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Romy Chakraborty (R)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.

Adam M Deutschbauer (AM)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA.
Department of Plant and Microbial Biology, University of California-Berkeley, Berkeley, CA, 94720, USA.

Adam P Arkin (AP)

Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA. aparkin@lbl.gov.
Department of Bioengineering, University of California-Berkeley, Berkeley, CA, 94720, USA. aparkin@lbl.gov.

MeSH classifications