TY - GEN
T1 - Learning a classifier for prediction of maintainability based on static analysis tools
AU - Schnappinger, Markus
AU - Osman, Mohd Hafeez
AU - Pretschner, Alexander
AU - Fietzke, Arnaud
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - Static Code Analysis Tools are a popular aid to monitor and control the quality of software systems. Still, these tools only provide a large number of measurements that have to be interpreted by the developers in order to obtain insights about the actual quality of the software. In cooperation with professional quality analysts, we manually inspected source code from three different projects and evaluated its maintainability. We then trained machine learning algorithms to predict the human maintainability evaluation of program classes based on code metrics. The code metrics include structural metrics such as nesting depth, cloning information, and abstractions like the number of code smells. We evaluated this approach on a dataset of more than 115,000 lines of code. Our model is able to predict up to 81% of the threefold labels correctly and achieves a precision of 80%. Thus, we believe this is a promising contribution towards automated maintainability prediction. In addition, we analyzed the attributes in our created dataset and identified the features with the highest predictive power, i.e., code clones, method length, and the number of alerts raised by the tool Teamscale. This insight provides valuable help for users needing to prioritize tool measurements.
AB - Static Code Analysis Tools are a popular aid to monitor and control the quality of software systems. Still, these tools only provide a large number of measurements that have to be interpreted by the developers in order to obtain insights about the actual quality of the software. In cooperation with professional quality analysts, we manually inspected source code from three different projects and evaluated its maintainability. We then trained machine learning algorithms to predict the human maintainability evaluation of program classes based on code metrics. The code metrics include structural metrics such as nesting depth, cloning information, and abstractions like the number of code smells. We evaluated this approach on a dataset of more than 115,000 lines of code. Our model is able to predict up to 81% of the threefold labels correctly and achieves a precision of 80%. Thus, we believe this is a promising contribution towards automated maintainability prediction. In addition, we analyzed the attributes in our created dataset and identified the features with the highest predictive power, i.e., code clones, method length, and the number of alerts raised by the tool Teamscale. This insight provides valuable help for users needing to prioritize tool measurements.
KW - Code Comprehension
KW - Maintenance Tools
KW - Software Maintenance
KW - Software Quality
KW - Static Code Analysis
UR - http://www.scopus.com/inward/record.url?scp=85072319546&partnerID=8YFLogxK
U2 - 10.1109/ICPC.2019.00043
DO - 10.1109/ICPC.2019.00043
M3 - Conference contribution
AN - SCOPUS:85072319546
T3 - IEEE International Conference on Program Comprehension
SP - 243
EP - 248
BT - Proceedings - 2019 IEEE/ACM 27th International Conference on Program Comprehension, ICPC 2019
PB - IEEE Computer Society
T2 - 27th IEEE/ACM International Conference on Program Comprehension, ICPC 2019
Y2 - 25 May 2019
ER -