Software Maintainability and Refactorings Prediction Based on Technical Debt Issues

Authors

  • Liviu-Marian BERCIU Department of Computer Science, Faculty of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania. Email: liviu.berciu@ubbcluj.ro. https://orcid.org/0009-0009-7501-2851
  • Vasilica MOLDOVAN Department of Computer Science, Faculty of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania. Email: vasilica.moldovan@stud.ubbcluj.ro.

DOI:

https://doi.org/10.24193/subbi.2023.2.02

Keywords:

Software Quality, SonarQube, Refactoring, Code Smells

Abstract

Software maintainability is a crucial factor affecting the cost, time, and resources required for software development. Code refactorings can greatly improve code quality, readability, understandability, and extensibility. Accurate prediction methods for both maintainability and refactorings are therefore vital for long-term project sustainability and success, and they offer substantial benefits to the software community as a whole. This article focuses on predicting software maintainability and the number of needed code refactorings from technical debt data. Two approaches were explored. The first compressed the technical debt issues per software component and applied machine learning algorithms such as ExtraTrees, Random Forest, and Decision Trees, all of which achieved high accuracy. The second retained multiple debt issue entries and used a Recurrent Neural Network, which proved less effective. In addition to predicting the required number of code refactorings and the maintainability of individual software components, a comprehensive analysis of technical debt issues was conducted before and after the refactoring process. The outcomes of this study contribute to a dependable prediction system for maintainability and refactorings, which could help the software community manage maintenance resources effectively. Of all the models employed, ExtraTrees yielded the best predictive results. To the best of our knowledge, no other approaches using ML techniques for this problem have been reported in the literature.
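
As a rough illustration of the first approach, the sketch below collapses several technical debt issue rows into one feature row per software component. It is a minimal sketch under stated assumptions, not the authors' pipeline: the record layout, the field names, the sample values, and the helper compress_issues are all hypothetical, and the abstract does not specify which features were actually derived.

```python
from collections import defaultdict

# Hypothetical issue records (assumed layout, not the paper's dataset schema):
# (component, issue_type, severity, remediation_minutes)
issues = [
    ("src/Editor.java", "CODE_SMELL", "MAJOR", 20),
    ("src/Editor.java", "BUG", "CRITICAL", 45),
    ("src/Editor.java", "CODE_SMELL", "MINOR", 5),
    ("src/Buffer.java", "CODE_SMELL", "MAJOR", 10),
]


def compress_issues(records):
    """Collapse many issue rows into a single feature row per component."""
    features = defaultdict(lambda: defaultdict(int))
    for component, issue_type, severity, minutes in records:
        row = features[component]
        row["issue_count"] += 1            # total issues in the component
        row["count_" + issue_type] += 1    # issues per type
        row["count_" + severity] += 1      # issues per severity
        row["debt_minutes"] += minutes     # summed remediation effort
    # Convert nested defaultdicts to plain dicts for downstream use.
    return {component: dict(row) for component, row in features.items()}


compressed = compress_issues(issues)
for component, row in compressed.items():
    print(component, row)
```

Each resulting row could then serve as the input vector for a tree-based model such as ExtraTrees, whereas the second approach would keep the individual issue rows as a sequence for the recurrent network.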

Published

2023-12-22

How to Cite

BERCIU, L.-M., & MOLDOVAN, V. (2023). Software Maintainability and Refactorings Prediction Based on Technical Debt Issues. Studia Universitatis Babeș-Bolyai Informatica, 68(2), 22–40. https://doi.org/10.24193/subbi.2023.2.02

Section

Articles