TY - GEN
T1 - Fill in the Blank
T2 - 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023
AU - Liu, Zhe
AU - Chen, Chunyang
AU - Wang, Junjie
AU - Che, Xing
AU - Huang, Yuekai
AU - Hu, Jun
AU - Wang, Qing
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Automated GUI testing is widely used to help ensure the quality of mobile apps. However, many GUIs require appropriate text inputs to proceed to the next page, which remains a prominent obstacle to testing coverage. Given the diversity and semantic requirements of valid inputs (e.g., flight departure, movie name), automating text input generation is challenging. Inspired by the outstanding progress of pre-trained Large Language Models (LLMs) in text generation, we propose QTypist, an LLM-based approach that intelligently generates semantic input text according to the GUI context. To boost the performance of the LLM in the mobile testing scenario, we develop a prompt-based data construction and tuning method that automatically extracts the prompts and answers for model tuning. We evaluate QTypist on 106 apps from Google Play, and the results show that the passing rate of QTypist is 87%, which is 93% higher than the best baseline. We also integrate QTypist with automated GUI testing tools, where it covers 42% more app activities and 52% more pages, and subsequently helps reveal 122% more bugs than the raw tools.
KW - Android app
KW - GUI testing
KW - Large language model
KW - Prompt-tuning
KW - Text input generation
UR - http://www.scopus.com/inward/record.url?scp=85168636859&partnerID=8YFLogxK
U2 - 10.1109/ICSE48619.2023.00119
DO - 10.1109/ICSE48619.2023.00119
M3 - Conference contribution
AN - SCOPUS:85168636859
T3 - Proceedings - International Conference on Software Engineering
SP - 1355
EP - 1367
BT - Proceedings - 2023 IEEE/ACM 45th International Conference on Software Engineering, ICSE 2023
PB - IEEE Computer Society
Y2 - 15 May 2023 through 16 May 2023
ER -