TY - GEN
T1 - A graph based approach for MPI deadlock detection
AU - Hilbrich, Tobias
AU - De Supinski, Bronis R.
AU - Schulz, Martin
AU - Müller, Matthias S.
PY - 2009
Y1 - 2009
N2 - The MPI standard defines several usage patterns that can lead to deadlock, some of which involve collective communications or non-deterministic operations such as wildcard receives. Further, some MPI programming deadlocks only occur for some MPI implementations or certain configurations. Many tools to detect MPI deadlocks exist; however, none precisely handles the increased complexity of deadlock detection created by the richness of the MPI standard, which requires a general deadlock model. We present the first general deadlock model for MPI including a novel necessary and sufficient criterion, the OR-Knot, for deadlock in MPI programs. This model enables visualization of MPI deadlocks and motivates the design of a new deadlock detection mechanism. We compare our implementation of this mechanism to the ad-hoc mechanism previously available in Umpire, which reflected MPI non-determinism and, thus, more completely detected MPI deadlocks than any other existing MPI deadlock detection tool. Overall, our results demonstrate that our mechanism improves performance by as much as two orders of magnitude while providing precise characterization of deadlocks.
AB - The MPI standard defines several usage patterns that can lead to deadlock, some of which involve collective communications or non-deterministic operations such as wildcard receives. Further, some MPI programming deadlocks only occur for some MPI implementations or certain configurations. Many tools to detect MPI deadlocks exist; however, none precisely handles the increased complexity of deadlock detection created by the richness of the MPI standard, which requires a general deadlock model. We present the first general deadlock model for MPI including a novel necessary and sufficient criterion, the OR-Knot, for deadlock in MPI programs. This model enables visualization of MPI deadlocks and motivates the design of a new deadlock detection mechanism. We compare our implementation of this mechanism to the ad-hoc mechanism previously available in Umpire, which reflected MPI non-determinism and, thus, more completely detected MPI deadlocks than any other existing MPI deadlock detection tool. Overall, our results demonstrate that our mechanism improves performance by as much as two orders of magnitude while providing precise characterization of deadlocks.
KW - Deadlock detection
KW - MPI
KW - Parallel programming
KW - Umpire
UR - http://www.scopus.com/inward/record.url?scp=70449713896&partnerID=8YFLogxK
U2 - 10.1145/1542275.1542319
DO - 10.1145/1542275.1542319
M3 - Conference contribution
AN - SCOPUS:70449713896
SN - 9781605584980
T3 - Proceedings of the International Conference on Supercomputing
SP - 296
EP - 305
BT - ICS'09 - Proceedings of the 23rd International Conference on Supercomputing
T2 - 23rd International Conference on Supercomputing, ICS'09
Y2 - 8 June 2009 through 12 June 2009
ER -