GrammarForge: Learning Program Input Grammars for Fuzz Testing

Hannes Sochor, Flavio Ferrarotti, Robert Wille

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Providing good methods for testing properties of software is critical. Such methods often depend on a formal description of the program input language, in particular when they are based on grammar-based fuzzing. Unfortunately, it cannot be ensured that such a formal description is always available. To tackle this problem, we propose a new method that automates grammar learning for the input language of the software under test, combining classical language membership queries with lightweight source code analysis tools such as control flow graphs and program instrumentation. We present a prototype implementation (GrammarForge) of our method, which works by a process of automated conservative substitution of placeholders by terminals, starting from an initial grammar structure extracted from the source code. It targets arbitrary parsers implemented using recursive descent techniques. We perform extensive experimentation, which shows that GrammarForge outperforms alternative tools with respect to accuracy of the learned grammar. Moreover, and unlike all other state-of-the-art methods, our learning process does not need seed inputs from the target language.

Original language: English
Title of host publication: Software Engineering and Formal Methods - 22nd International Conference, SEFM 2024, Proceedings
Editors: Alexandre Madeira, Alexander Knapp
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 272-289
Number of pages: 18
ISBN (Print): 9783031773815
State: Published - 2025
Externally published: Yes
Event: 22nd International Conference on Software Engineering and Formal Methods, SEFM 2024 - Aveiro, Portugal
Duration: 6 Nov 2024 to 8 Nov 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 15280 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd International Conference on Software Engineering and Formal Methods, SEFM 2024
Country/Territory: Portugal
City: Aveiro
Period: 6/11/24 to 8/11/24
