In silico prediction of chemical aquatic toxicity by multiple machine learning and deep learning approaches.
chemical aquatic toxicity
classification models
deep learning
machine learning
molecular fingerprint
Journal
Journal of applied toxicology : JAT
ISSN: 1099-1263
Abbreviated title: J Appl Toxicol
Country: England
NLM ID: 8109495
Publication information
Publication date:
Nov 2022
History:
revised: 2022-05-16
received: 2022-04-12
accepted: 2022-05-31
pubmed: 2022-06-03
medline: 2022-10-18
entrez: 2022-06-02
Status:
ppublish
Abstract
Fish are among the model animals used to evaluate the adverse effects of chemicals released into the ecosystem. However, the low throughput and relatively high expense of fish testing make it impossible to test every newly manufactured chemical. Hence, in silico models that prioritize compounds for testing have been widely applied in environmental risk assessment and drug discovery. In this study, we constructed local predictive models for four fish species (bluegill sunfish, rainbow trout, fathead minnow, and sheepshead minnow) and global models using the combined data for all four species. A total of 1874 unique compounds with their labels, that is, toxic (LC
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
1766-1776
Grants
Agency: National Natural Science Foundation of China
ID: 81973242
Agency: National Natural Science Foundation of China
ID: 81872800
Copyright information
© 2022 John Wiley & Sons Ltd.