TY - GEN
T1 - Clock delta compression for scalable order-replay of non-deterministic parallel applications
AU - Sato, Kento
AU - Ahn, Dong H.
AU - Laguna, Ignacio
AU - Lee, Gregory L.
AU - Schulz, Martin
N1 - Publisher Copyright:
© 2015 ACM.
PY - 2015/11/15
Y1 - 2015/11/15
N2 - The ability to record and replay program execution helps significantly in debugging non-deterministic MPI applications by reproducing message-receive orders. However, the large amount of data that traditional record-and-reply techniques record precludes its practical applicability to massively parallel applications. In this paper, we propose a new compression algorithm, Clock Delta Compression (CDC), for scalable record and replay of non-deterministic MPI applications. CDC defines a reference order of message receives based on a totally ordered relation using Lamport clocks, and only records the differences between this reference logical-clock order and an observed order. Our evaluation shows that CDC significantly reduces the record data size. For example, when we apply CDC to Monte Carlo particle transport Benchmark (MCB), which represents common non-deterministic communication patterns, CDC reduces the record size by approximately two orders of magnitude compared to traditional techniques and incurs between 13.1% and 25.5% of runtime overhead.
AB - The ability to record and replay program execution helps significantly in debugging non-deterministic MPI applications by reproducing message-receive orders. However, the large amount of data that traditional record-and-reply techniques record precludes its practical applicability to massively parallel applications. In this paper, we propose a new compression algorithm, Clock Delta Compression (CDC), for scalable record and replay of non-deterministic MPI applications. CDC defines a reference order of message receives based on a totally ordered relation using Lamport clocks, and only records the differences between this reference logical-clock order and an observed order. Our evaluation shows that CDC significantly reduces the record data size. For example, when we apply CDC to Monte Carlo particle transport Benchmark (MCB), which represents common non-deterministic communication patterns, CDC reduces the record size by approximately two orders of magnitude compared to traditional techniques and incurs between 13.1% and 25.5% of runtime overhead.
KW - compression
KW - debugging tools
KW - non-determinism
UR - http://www.scopus.com/inward/record.url?scp=84966661168&partnerID=8YFLogxK
U2 - 10.1145/2807591.2807642
DO - 10.1145/2807591.2807642
M3 - Conference contribution
AN - SCOPUS:84966661168
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2015
PB - IEEE Computer Society
T2 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Y2 - 15 November 2015 through 20 November 2015
ER -