TY - JOUR
T1 - One Sentence Can Kill the Bug
T2 - Auto-Replay Mobile App Crashes From One-Sentence Overviews
AU - Huang, Yuchao
AU - Wang, Junjie
AU - Liu, Zhe
AU - Li, Mingyang
AU - Wang, Song
AU - Chen, Chunyang
AU - Hu, Yuanzhe
AU - Wang, Qing
N1 - Publisher Copyright: © 1976-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Crash reports play a crucial role in software maintenance as they inform developers about the issues encountered in mobile applications. Developers must reproduce the reported crash before fixing it, which is extremely time-consuming and tedious. Existing studies have focused on automatic crash reproduction with step-by-step instructions. However, a non-negligible portion of crash reports provides only a one-sentence overview, which merely describes the final crash-triggering action. These reports require developers to invest more effort in understanding and fixing the issues, while existing techniques cannot handle them because they lack step-by-step guidance, creating a greater need for automatic support. Leveraging the capability of Large Language Models (LLMs) in combining acting and reasoning, we propose ReActDroid, an automated approach to reproduce mobile application crashes directly from the crash overview. ReActDroid utilizes ReAct prompting to augment the app-specific knowledge and exploration history, enabling the LLM to derive the necessary steps for triggering the crash from a comprehensive and historical perspective. We evaluate ReActDroid on 102 crash reports from 69 popular Android apps and successfully reproduce 57.8% of the crashes, surpassing the performance of state-of-the-art baselines by 69% to 321%. In addition, the average reproduction time is 51.8 seconds, outperforming the baselines by 73% to 949%. We also evaluate the usefulness of ReActDroid with promising results.
AB - Crash reports play a crucial role in software maintenance as they inform developers about the issues encountered in mobile applications. Developers must reproduce the reported crash before fixing it, which is extremely time-consuming and tedious. Existing studies have focused on automatic crash reproduction with step-by-step instructions. However, a non-negligible portion of crash reports provides only a one-sentence overview, which merely describes the final crash-triggering action. These reports require developers to invest more effort in understanding and fixing the issues, while existing techniques cannot handle them because they lack step-by-step guidance, creating a greater need for automatic support. Leveraging the capability of Large Language Models (LLMs) in combining acting and reasoning, we propose ReActDroid, an automated approach to reproduce mobile application crashes directly from the crash overview. ReActDroid utilizes ReAct prompting to augment the app-specific knowledge and exploration history, enabling the LLM to derive the necessary steps for triggering the crash from a comprehensive and historical perspective. We evaluate ReActDroid on 102 crash reports from 69 popular Android apps and successfully reproduce 57.8% of the crashes, surpassing the performance of state-of-the-art baselines by 69% to 321%. In addition, the average reproduction time is 51.8 seconds, outperforming the baselines by 73% to 949%. We also evaluate the usefulness of ReActDroid with promising results.
KW - Mobile application testing
KW - issue report
KW - large language model
UR - http://www.scopus.com/inward/record.url?scp=85217678540&partnerID=8YFLogxK
U2 - 10.1109/TSE.2025.3535938
DO - 10.1109/TSE.2025.3535938
M3 - Article
AN - SCOPUS:85217678540
SN - 0098-5589
VL - 51
SP - 975
EP - 989
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
IS - 4
ER -