Benchmarking Generative AI Models for Deep Learning Test Input Generation

Maryam Maryam, Matteo Biagiola, Andrea Stocco, Vincenzo Riccio

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Scopus citation

Abstract

Test Input Generators (TIGs) are crucial to assess the ability of Deep Learning (DL) image classifiers to provide correct predictions for inputs beyond their training and test sets. Recent advancements in Generative AI (GenAI) models have made them a powerful tool for creating and manipulating synthetic images, although these advancements also imply increased complexity and resource demands for training. In this work, we benchmark and combine different GenAI models with TIGs, assessing their effectiveness, efficiency, and the quality of the generated test images in terms of domain validity and label preservation. We conduct an empirical study involving three different GenAI architectures (VAEs, GANs, Diffusion Models), five classification tasks of increasing complexity, and 364 human evaluations. Our results show that simpler architectures, such as VAEs, are sufficient for less complex datasets like MNIST. However, when dealing with feature-rich datasets, such as ImageNet, more sophisticated architectures like Diffusion Models achieve superior performance by generating a higher number of valid, misclassification-inducing inputs.

Original language: English
Title of host publication: 2025 IEEE Conference on Software Testing, Verification and Validation, ICST 2025
Editors: Anna Rita Fasolino, Sebastiano Panichella, Aldeida Aleti, Ali Mesbah
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 174-185
Number of pages: 12
ISBN (Electronic): 9798331508142
State: Published - 2025
Event: 18th IEEE Conference on Software Testing, Verification and Validation, ICST 2025 - Naples, Italy
Duration: 31 Mar 2025 to 4 Apr 2025

Publication series

Name: 2025 IEEE Conference on Software Testing, Verification and Validation, ICST 2025

Conference

Conference: 18th IEEE Conference on Software Testing, Verification and Validation, ICST 2025
Country/Territory: Italy
City: Naples
Period: 31/03/25 to 4/04/25

Keywords

  • Deep Learning
  • Generative AI
  • Software Testing
