Identifying diseases symptoms and general rules using supervised and unsupervised machine learning.
Apriori algorithm
Association rules
Classification methods
Diseases symptoms
Machine learning algorithms
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
02 Aug 2024
History:
received: 02 Mar 2024
accepted: 30 Jul 2024
medline: 03 Aug 2024
pubmed: 03 Aug 2024
entrez: 02 Aug 2024
Status:
epublish
Abstract
The symptoms of diseases can vary among individuals and may remain undetected in the early stages. Detecting these symptoms at the initial stage is crucial for effectively managing and treating cases of varying severity. Machine learning has made major advances in recent years, proving its effectiveness in various healthcare applications. This study aims to identify patterns of symptoms and general rules regarding symptoms among patients using supervised and unsupervised machine learning. A rule-based machine learning technique is integrated with classification methods to extend a prediction model. The study analyzes patient data available online through the Kaggle repository. After preprocessing the data and exploring descriptive statistics, the Apriori algorithm was applied to identify frequent symptoms and patterns in the discovered rules. Additionally, the study applied several machine learning models for predicting diseases, including stepwise regression, support vector machine, bootstrap forest, boosted trees, and neural-boosted methods. The stepwise fitting method outperformed all competitors in this study, as determined through cross-validation conducted for each model based on established criteria. Moreover, numerous significant decision rules were extracted, which can streamline clinical applications without the need for additional expertise. These rules enable the prediction of relationships between symptoms and diseases, as well as between different diseases. Therefore, the results obtained in this study have the potential to improve the performance of prediction models. Disease symptoms and general rules can thus be discovered from the dataset using supervised and unsupervised machine learning.
Overall, the proposed algorithm can support not only healthcare professionals but also patients who face cost and time constraints in diagnosing and treating these diseases.
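As an illustration of the rule-mining step named in the abstract, the following is a minimal sketch of Apriori frequent-itemset search followed by association-rule extraction with support, confidence, and lift. It is not the authors' implementation: the patient records, the thresholds, and the function names (`support`, `apriori`, `rules`) are hypothetical and chosen only to show the technique on a toy example.

```python
from itertools import combinations

# Hypothetical toy records (invented for illustration only): each set holds
# the symptoms and a diagnosis label observed for one patient.
records = [
    {"fever", "cough", "flu"},
    {"fever", "cough", "fatigue", "flu"},
    {"fever", "headache"},
    {"cough", "fatigue"},
    {"fever", "cough", "flu"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(transactions, min_support):
    """Return {frozenset: support} for all itemsets meeting min_support."""
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that are frequent enough.
        level = {c: support(c, transactions) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence, lift) tuples."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                rhs = itemset - lhs
                # Subsets of a frequent itemset are frequent, so both
                # lookups below are guaranteed to succeed.
                conf = supp / frequent[lhs]
                if conf >= min_confidence:
                    out.append((lhs, rhs, supp, conf, conf / frequent[rhs]))
    return out

# Thresholds are arbitrary assumptions for this toy data set.
frequent = apriori(records, min_support=0.6)
found = rules(frequent, min_confidence=0.9)

for lhs, rhs, supp, conf, lift in found:
    print(sorted(lhs), "->", sorted(rhs),
          f"support={supp:.2f} confidence={conf:.2f} lift={lift:.2f}")
```

A rule such as {fever, cough} -> {flu} is read as "patients with fever and cough also tend to carry the flu label". A lift above 1 suggests the antecedent and consequent co-occur more often than they would if independent. On real data, the support and confidence thresholds would need tuning to the data set at hand.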
Identifiers
pubmed: 39095606
doi: 10.1038/s41598-024-69029-8
pii: 10.1038/s41598-024-69029-8
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
17956
Grants
Agency: University of Torbat Heydarieh
ID: 212
Copyright information
© 2024. The Author(s).