Novel method of building train and test sets for evaluation of machine learning models related to software bugs assignment.
Journal
Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288
Publication information
Publication date:
06 Dec 2023
History:
received: 04 Apr 2023
accepted: 28 Nov 2023
medline: 7 Dec 2023
pubmed: 7 Dec 2023
entrez: 6 Dec 2023
Status:
epublish
Abstract
Many tools are now used to handle bug reports, feature requests, support questions, and similar issues that arise during software development and maintenance. Some of them use machine learning techniques. The introduction reviews the fundamental methods used to evaluate machine learning models. This paper points out the weaknesses of currently used evaluation metrics in the specific context of software development, especially bug reports. The shortcomings of the state of the art stem from disregarding time dependencies, which should be taken into account when creating train and test sets because they may affect the results. An extensive literature review found no article that uses time dependencies to evaluate machine learning models in software development applications, such as machine learning solutions that support bug tracking systems. This paper introduces a novel solution that is free of these drawbacks. Experimental research showed the effectiveness of the introduced method and yielded significantly different results from those of state-of-the-art methods.
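The idea of respecting time dependencies when building train and test sets can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify; it only shows a generic chronological split, in which no training report is newer than any test report. The function name `chronological_split`, the field names `created` and `assignee`, and the sample reports are all hypothetical.

```python
from datetime import date


def chronological_split(reports, test_fraction=0.2):
    """Split reports so that no training report is newer than any test report.

    Hypothetical helper: a generic time-ordered split, not the method
    proposed in the paper.
    """
    # Order by creation date so the newest reports end up in the test set.
    ordered = sorted(reports, key=lambda r: r["created"])
    n_test = max(1, round(len(ordered) * test_fraction))
    cut = len(ordered) - n_test
    return ordered[:cut], ordered[cut:]


# Made-up bug reports, deliberately listed out of chronological order.
reports = [
    {"id": 3, "created": date(2023, 3, 2), "assignee": "carol"},
    {"id": 1, "created": date(2023, 1, 5), "assignee": "alice"},
    {"id": 5, "created": date(2023, 5, 20), "assignee": "alice"},
    {"id": 2, "created": date(2023, 2, 11), "assignee": "bob"},
    {"id": 4, "created": date(2023, 4, 9), "assignee": "bob"},
]

train, test = chronological_split(reports, test_fraction=0.2)
print([r["id"] for r in train], [r["id"] for r in test])
```

A random shuffle before splitting could instead place reports from May in the training set and reports from January in the test set, so the model would be evaluated on data older than what it was trained on.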
Identifiers
pubmed: 38057324
doi: 10.1038/s41598-023-48617-0
pii: 10.1038/s41598-023-48617-0
pmc: PMC10700481
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
21512
Copyright information
© 2023. The Author(s).