TY - GEN
T1 - Feature sets in just-in-time defect prediction
T2 - 18th ACM International Conference on Predictive Models and Data Analytics in Software Engineering, PROMISE 2022, co-located with the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022
AU - Bludau, Peter
AU - Pretschner, Alexander
N1 - Publisher Copyright: © 2022 ACM.
PY - 2022/11/2
Y1 - 2022/11/2
N2 - Just-in-time defect prediction assigns a defect risk to each new change to a software repository in order to prioritize review and testing efforts. Over the last decades, different approaches were proposed in the literature to craft more accurate prediction models. However, defect prediction is still not widely used in industry, due to predictions with varying performance. In this study, we evaluate existing features on six open-source projects and propose two new feature sets, not yet discussed in the literature. By combining all feature sets, we improve MCC by on average 21%, leading to the best performing models when compared to state-of-the-art approaches. We also evaluate effort-awareness and find that on average 14% more defects can be identified, inspecting 20% of changed lines.
AB - Just-in-time defect prediction assigns a defect risk to each new change to a software repository in order to prioritize review and testing efforts. Over the last decades, different approaches were proposed in the literature to craft more accurate prediction models. However, defect prediction is still not widely used in industry, due to predictions with varying performance. In this study, we evaluate existing features on six open-source projects and propose two new feature sets, not yet discussed in the literature. By combining all feature sets, we improve MCC by on average 21%, leading to the best performing models when compared to state-of-the-art approaches. We also evaluate effort-awareness and find that on average 14% more defects can be identified, inspecting 20% of changed lines.
KW - JIT defect prediction
KW - empirical evaluation
KW - machine learning
UR - http://www.scopus.com/inward/record.url?scp=85143202632&partnerID=8YFLogxK
U2 - 10.1145/3558489.3559068
DO - 10.1145/3558489.3559068
M3 - Conference contribution
AN - SCOPUS:85143202632
T3 - PROMISE 2022 - Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, co-located with ESEC/FSE 2022
SP - 22
EP - 31
BT - PROMISE 2022 - Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, co-located with ESEC/FSE 2022
A2 - McIntosh, Shane
A2 - Shang, Weiyi
A2 - Perez, Gema Rodriguez
PB - Association for Computing Machinery, Inc
Y2 - 17 November 2022
ER -