Execution reconstruction: Harnessing failure reoccurrences for failure reproduction

Gefei Zuo, Jiacheng Ma, Andrew Quinn, Pramod Bhatotia, Pedro Fonseca, Baris Kasikci

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

14 Scopus citations

Abstract

Reproducing production failures is crucial for software reliability. Alas, existing bug reproduction approaches are not suitable for production systems because they are not simultaneously efficient, effective, and accurate. In this work, we survey prior techniques and show that existing approaches over-prioritize a subset of these properties, and sacrifice the remaining ones. As a result, existing tools do not enable the plethora of proposed failure reproduction use-cases (e.g., debugging, security forensics, fuzzing) for production failures. We propose Execution Reconstruction (ER), a technique that strikes a better balance between efficiency, effectiveness and accuracy for reproducing production failures. ER uses hardware-assisted control and data tracing to shepherd symbolic execution and reproduce failures. ER's key novelty lies in identifying data values that are both inexpensive to monitor and useful for eliding the scalability limitations of symbolic execution. ER harnesses failure reoccurrences by iteratively performing tracing and symbolic execution, which reduces runtime overhead. Whereas prior production-grade techniques can only reproduce short executions, ER can reproduce any reoccuring failure. Thus, unlike existing tools, ER reproduces fully replayable executions that can power a variety of debugging and reliabilty use cases. ER incurs on average 0.3% (up to 1.1%) runtime monitoring overhead for a broad range of real-world systems, making it practical for real-world deployment.

Original languageEnglish
Title of host publicationPLDI 2021 - Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
EditorsStephen N. Freund, Eran Yahav
PublisherAssociation for Computing Machinery
Pages1155-1170
Number of pages16
ISBN (Electronic)9781450383912
DOIs
StatePublished - 18 Jun 2021
Event42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2021 - Virtual, Online, Canada
Duration: 20 Jun 202125 Jun 2021

Publication series

NameProceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)

Conference

Conference42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2021
Country/TerritoryCanada
CityVirtual, Online
Period20/06/2125/06/21

Keywords

  • debugging
  • symbolic execution

Fingerprint

Dive into the research topics of 'Execution reconstruction: Harnessing failure reoccurrences for failure reproduction'. Together they form a unique fingerprint.

Cite this