Software Testing With Large Language Models: Survey, Landscape, and Vision

Junjie Wang, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, Qing Wang

Research output: Contribution to journal › Article › peer-review

116 Scopus citations

Abstract

Pre-trained large language models (LLMs) have recently emerged as a breakthrough technology in natural language processing and artificial intelligence, with the ability to handle large-scale datasets and exhibit remarkable performance across a wide range of tasks. Meanwhile, software testing is a crucial undertaking that serves as a cornerstone for ensuring the quality and reliability of software products. As the scope and complexity of software systems continue to grow, the need for more effective software testing techniques becomes increasingly urgent, making it an area ripe for innovative approaches such as the use of LLMs. This paper provides a comprehensive review of the utilization of LLMs in software testing. It analyzes 102 relevant studies that have used LLMs for software testing, from the perspectives of both software testing and LLMs. The paper presents a detailed discussion of the software testing tasks for which LLMs are commonly used, among which test case preparation and program repair are the most representative. It also analyzes the commonly used LLMs, the types of prompt engineering employed, and the techniques that accompany these LLMs, and it summarizes the key challenges and potential opportunities in this direction. This work can serve as a roadmap for future research in this area, highlighting potential avenues for exploration and identifying gaps in our current understanding of the use of LLMs in software testing.

Original language: English
Pages (from-to): 911-936
Number of pages: 26
Journal: IEEE Transactions on Software Engineering
Volume: 50
Issue number: 4
State: Published - 1 Apr 2024

Keywords

  • GPT
  • LLM
  • Pre-trained large language model
  • software testing
