Codon usage and expression-based features significantly improve prediction of CRISPR efficiency.


Journal

NPJ systems biology and applications
ISSN: 2056-7189
Abbreviated title: NPJ Syst Biol Appl
Country: England
NLM ID: 101677786

Publication information

Publication date:
03 Sep 2024
History:
received: 23 Feb 2024
accepted: 27 Aug 2024
medline: 04 Sep 2024
pubmed: 04 Sep 2024
entrez: 03 Sep 2024
Status: epublish

Abstract

CRISPR is a precise and effective genome editing technology, but despite several advances over the last decade, our ability to computationally design gRNAs remains limited. Most predictive models have relatively low predictive power and use only the sequence of the target site as input. Here we suggest a new category of features, which incorporate the genomic position of the target site and the presence of genes close to it. We calculate four features based on gene expression and codon usage bias indices. We show, on CRISPR datasets taken from three different cell types, that such features perform comparably with 425 state-of-the-art predictive features, ranking in the top 2-12% of features. We trained new predictive models, showing that adding expression features to them significantly improves their r
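
The abstract does not say which codon usage bias indices are used, so the following is only an illustrative sketch of one widely used index, the Codon Adaptation Index (CAI), computed for a coding sequence near a target site. The function names (`split_codons`, `relative_adaptiveness`, `cai`), the toy reference set, and the 0.5 pseudocount for unobserved codons are assumptions made for this example, not the authors' implementation.

```python
import math
from collections import Counter

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(seq):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight of each codon relative to the most used synonymous codon
    in a reference set (for example, highly expressed genes; assumed here)."""
    counts = Counter(
        c for s in reference_seqs for c in split_codons(s) if c in CODON_TABLE
    )
    weights = {}
    for codon, aa in CODON_TABLE.items():
        if aa in "*MW":  # stop codons and single-codon amino acids are skipped
            continue
        family = [c for c, a in CODON_TABLE.items() if a == aa]
        best = max(counts[c] for c in family)
        if best == 0:  # amino acid never seen in the reference set
            continue
        # Assumed convention: unobserved codons get a pseudocount of 0.5.
        weights[codon] = max(counts[codon], 0.5) / best
    return weights


def cai(seq, weights):
    """Geometric mean of the codon weights over the scored codons of seq."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    return math.exp(sum(logs) / len(logs)) if logs else float("nan")


# Toy reference set: alanine codon GCT is used three times, GCC once.
w = relative_adaptiveness(["GCTGCTGCTGCC"])
print(cai("GCTGCT", w))  # only the preferred codon is used
print(cai("GCTGCC", w))  # one preferred and one less-used codon
```

In a feature pipeline along the lines the abstract describes, a score like this would be computed for the gene overlapping or nearest to each target site and supplied to the model alongside sequence-based features; how genes are assigned to sites is not specified in the abstract and would need to be taken from the full paper.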

Identifiers

pubmed: 39227603
doi: 10.1038/s41540-024-00431-8
pii: 10.1038/s41540-024-00431-8

Chemical substances

RNA, Guide, CRISPR-Cas Systems 0
Codon 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

100

Copyright information

© 2024. The Author(s).

Authors

Shaked Bergman (S)

Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel.

Tamir Tuller (T)

Department of Biomedical Engineering, Tel-Aviv University, Tel Aviv, Israel. tamirtul@tauex.tau.ac.il.
The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv, Israel. tamirtul@tauex.tau.ac.il.
