Selecting optimal software code descriptors-The case of Java.

Software; Algorithms; Programming Languages

Journal

PloS one

ISSN: 1932-6203

Abbreviated title: PLoS One

Country: United States

NLM ID: 101285081

Publication information

Publication date:
2024

History:

received: 09 05 2024

accepted: 27 08 2024

medline: 2 11 2024

pubmed: 2 11 2024

entrez: 1 11 2024

Status: epublish

Abstract

Over the last 25 years, software metrics have proliferated considerably, and a plethora of tools have emerged to extract them. While this is a welcome change from the earlier scarcity of data, it creates a significant problem from both a theoretical and a practical standpoint. From a theoretical perspective, using many metrics together is likely to result in collinearity, overfitting, etc. From a practical perspective, such a set of metrics is difficult to manage, and companies, especially small ones, may feel overwhelmed and unable to select a viable subset of them. So far, it has not been fully understood what constitutes a viable subset of metrics suitable for properly managing software projects and products. In this paper, we attempt to address this issue. We focus on programs written in Java and consider classes and methods. We use Sammon error as a measure of the similarity of metrics. Using both Particle Swarm Optimization and a Genetic Algorithm, we adapted a method for identifying a viable subset of such metrics that could solve this problem. We ran our approach on 800 projects from GitHub and validated the results on 200 projects. With the proposed method, we obtained optimal subsets of software engineering metrics. These subsets yielded low values of Sammon error at more than 70% at both the class and method levels on a validation dataset.
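As a rough illustration of the quantity named in the abstract, the sketch below computes the standard Sammon stress between pairwise distances in a full metric space and in a candidate subset of its columns. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how the error is applied, so treating each row as one class or method described by its metric values, the use of Euclidean distance, and the names `sammon_error` and `select_columns` are all hypothetical. The search over subsets with Particle Swarm Optimization or a Genetic Algorithm is not shown.

```python
from itertools import combinations
from math import dist


def select_columns(rows, keep):
    """Keep only the metric columns whose indices are listed in `keep`."""
    return [[row[k] for k in keep] for row in rows]


def sammon_error(full, reduced):
    """Standard Sammon stress between two representations of the same items.

    `full` and `reduced` are equal-length lists of numeric vectors.
    Assumption: Euclidean distance; the source does not specify the distance.
    """
    if len(full) != len(reduced):
        raise ValueError("both representations must describe the same items")
    weighted_sq_diff = 0.0
    total_full_dist = 0.0
    for i, j in combinations(range(len(full)), 2):
        d_full = dist(full[i], full[j])
        if d_full == 0.0:
            continue  # skip coincident items to avoid division by zero
        d_reduced = dist(reduced[i], reduced[j])
        weighted_sq_diff += (d_full - d_reduced) ** 2 / d_full
        total_full_dist += d_full
    return weighted_sq_diff / total_full_dist if total_full_dist else 0.0


# Hypothetical toy data: three items, three metrics each.
items = [[0.0, 0.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
subset = select_columns(items, [0, 1])  # drop the third metric
error = sammon_error(items, subset)
```

Under these assumptions, a value near zero means the subset preserves the pairwise distances of the full metric set, and a larger value means the dropped metrics carried information the subset does not reproduce.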

Identifiers

DOI: 10.1371/journal.pone.0310840 PMID: 39485764

pubmed: 39485764

doi: 10.1371/journal.pone.0310840

pii: PONE-D-24-18681

Publication types

Journal Article

Languages

eng

Pagination

e0310840

Copyright information

Copyright: © 2024 Bugayenko et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Conflict of interest statement

One of the authors (Giancarlo Succi) is a Board member of PLOS ONE journal.

Authors

Yegor Bugayenko (Y)

Zamira Kholmatova (Z)

Artem Kruglov (A)

Witold Pedrycz (W)

Giancarlo Succi (G)

MeSH classifications