Improved prediction of bacterial CRISPRi guide efficiency from depletion screens through mixed-effect machine learning and data integration.


Journal

Genome biology
ISSN: 1474-760X
Abbreviated title: Genome Biol
Country: England
NLM ID: 100960660

Publication information

Publication date:
11 Jan 2024
History:
received: 19 Jun 2022
accepted: 20 Dec 2023
medline: 11 Jan 2024
pubmed: 11 Jan 2024
entrez: 10 Jan 2024
Status: epublish

Abstract

CRISPR interference (CRISPRi) is the leading technique to silence gene expression in bacteria; however, design rules remain poorly defined. We develop a best-in-class prediction algorithm for guide silencing efficiency by systematically investigating factors influencing guide depletion in genome-wide essentiality screens, with the surprising discovery that gene-specific features substantially impact prediction. We develop a mixed-effect random forest regression model that provides better estimates of guide efficiency. We further apply methods from explainable AI to extract interpretable design rules from the model. This study provides a blueprint for predictive models for CRISPR technologies where only indirect measurements of guide activity are available.

Identifiers

pubmed: 38200565
doi: 10.1186/s13059-023-03153-y
pii: 10.1186/s13059-023-03153-y

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

13

Grants

Agency: Bayerisches Staatsministerium für Bildung und Kultus, Wissenschaft und Kunst
ID: Research network bayresq.net

Copyright information

© 2024. The Author(s).

Authors

Yanying Yu (Y)

Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany.

Sandra Gawlitt (S)

Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany.

Lisa Barros de Andrade e Sousa (LB)

Helmholtz AI, Helmholtz Zentrum München, Neuherberg, 85764, Germany.

Erinc Merdivan (E)

Helmholtz AI, Helmholtz Zentrum München, Neuherberg, 85764, Germany.

Marie Piraud (M)

Helmholtz AI, Helmholtz Zentrum München, Neuherberg, 85764, Germany.

Chase L Beisel (CL)

Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany.
Medical Faculty, University of Würzburg, Würzburg, 97080, Germany.

Lars Barquist (L)

Helmholtz Institute for RNA-Based Infection Research (HIRI), Helmholtz Centre for Infection Research (HZI), Würzburg, 97080, Germany. lars.barquist@helmholtz-hiri.de
Medical Faculty, University of Würzburg, Würzburg, 97080, Germany.

MeSH classifications