TY - GEN
T1 - Execution reconstruction
T2 - 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, PLDI 2021
AU - Zuo, Gefei
AU - Ma, Jiacheng
AU - Quinn, Andrew
AU - Bhatotia, Pramod
AU - Fonseca, Pedro
AU - Kasikci, Baris
N1 - Publisher Copyright:
© 2021 ACM.
PY - 2021/6/18
Y1 - 2021/6/18
N2 - Reproducing production failures is crucial for software reliability. Alas, existing bug reproduction approaches are not suitable for production systems because they are not simultaneously efficient, effective, and accurate. In this work, we survey prior techniques and show that existing approaches over-prioritize a subset of these properties, and sacrifice the remaining ones. As a result, existing tools do not enable the plethora of proposed failure reproduction use-cases (e.g., debugging, security forensics, fuzzing) for production failures. We propose Execution Reconstruction (ER), a technique that strikes a better balance between efficiency, effectiveness and accuracy for reproducing production failures. ER uses hardware-assisted control and data tracing to shepherd symbolic execution and reproduce failures. ER's key novelty lies in identifying data values that are both inexpensive to monitor and useful for eliding the scalability limitations of symbolic execution. ER harnesses failure reoccurrences by iteratively performing tracing and symbolic execution, which reduces runtime overhead. Whereas prior production-grade techniques can only reproduce short executions, ER can reproduce any reoccuring failure. Thus, unlike existing tools, ER reproduces fully replayable executions that can power a variety of debugging and reliabilty use cases. ER incurs on average 0.3% (up to 1.1%) runtime monitoring overhead for a broad range of real-world systems, making it practical for real-world deployment.
AB - Reproducing production failures is crucial for software reliability. Alas, existing bug reproduction approaches are not suitable for production systems because they are not simultaneously efficient, effective, and accurate. In this work, we survey prior techniques and show that existing approaches over-prioritize a subset of these properties, and sacrifice the remaining ones. As a result, existing tools do not enable the plethora of proposed failure reproduction use-cases (e.g., debugging, security forensics, fuzzing) for production failures. We propose Execution Reconstruction (ER), a technique that strikes a better balance between efficiency, effectiveness and accuracy for reproducing production failures. ER uses hardware-assisted control and data tracing to shepherd symbolic execution and reproduce failures. ER's key novelty lies in identifying data values that are both inexpensive to monitor and useful for eliding the scalability limitations of symbolic execution. ER harnesses failure reoccurrences by iteratively performing tracing and symbolic execution, which reduces runtime overhead. Whereas prior production-grade techniques can only reproduce short executions, ER can reproduce any reoccuring failure. Thus, unlike existing tools, ER reproduces fully replayable executions that can power a variety of debugging and reliabilty use cases. ER incurs on average 0.3% (up to 1.1%) runtime monitoring overhead for a broad range of real-world systems, making it practical for real-world deployment.
KW - debugging
KW - symbolic execution
UR - http://www.scopus.com/inward/record.url?scp=85108909445&partnerID=8YFLogxK
U2 - 10.1145/3453483.3454101
DO - 10.1145/3453483.3454101
M3 - Conference contribution
AN - SCOPUS:85108909445
T3 - Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)
SP - 1155
EP - 1170
BT - PLDI 2021 - Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
A2 - Freund, Stephen N.
A2 - Yahav, Eran
PB - Association for Computing Machinery
Y2 - 20 June 2021 through 25 June 2021
ER -