TY - JOUR
T1 - Combing the communication hairball
T2 - Visualizing parallel execution traces using logical time
AU - Isaacs, Katherine E.
AU - Bremer, Peer-Timo
AU - Jusufi, Ilir
AU - Gamblin, Todd
AU - Bhatele, Abhinav
AU - Schulz, Martin
AU - Hamann, Bernd
N1 - Publisher Copyright: © 1995-2012 IEEE.
PY - 2014/12/31
Y1 - 2014/12/31
N2 - With the continuous rise in complexity of modern supercomputers, optimizing the performance of large-scale parallel programs is becoming increasingly challenging. Simultaneously, the growth in scale magnifies the impact of even minor inefficiencies: potentially millions of compute hours and megawatts in power consumption can be wasted on avoidable mistakes or sub-optimal algorithms. This makes performance analysis and optimization critical elements in the software development process. One of the most common forms of performance analysis is to study execution traces, which record a history of per-process events and interprocess messages in a parallel application. Trace visualizations allow users to browse this event history and search for insights into the observed performance behavior. However, current visualizations are difficult to understand even for small process counts and do not scale gracefully beyond a few hundred processes. Organizing events in time leads to a virtually unintelligible conglomerate of interleaved events, and moderately high process counts overtax even the largest display. As an alternative, we present a new trace visualization approach based on transforming the event history into logical time inferred directly from happened-before relationships. This emphasizes the code's structural behavior, which is much more familiar to the application developer. The original timing data, or other information, is then encoded through color, leading to a more intuitive visualization. Furthermore, we use the discrete nature of logical timelines to cluster processes according to their local behavior, leading to a scalable visualization of even long traces on large process counts. We demonstrate our system using two case studies on large-scale parallel codes.
AB - With the continuous rise in complexity of modern supercomputers, optimizing the performance of large-scale parallel programs is becoming increasingly challenging. Simultaneously, the growth in scale magnifies the impact of even minor inefficiencies: potentially millions of compute hours and megawatts in power consumption can be wasted on avoidable mistakes or sub-optimal algorithms. This makes performance analysis and optimization critical elements in the software development process. One of the most common forms of performance analysis is to study execution traces, which record a history of per-process events and interprocess messages in a parallel application. Trace visualizations allow users to browse this event history and search for insights into the observed performance behavior. However, current visualizations are difficult to understand even for small process counts and do not scale gracefully beyond a few hundred processes. Organizing events in time leads to a virtually unintelligible conglomerate of interleaved events, and moderately high process counts overtax even the largest display. As an alternative, we present a new trace visualization approach based on transforming the event history into logical time inferred directly from happened-before relationships. This emphasizes the code's structural behavior, which is much more familiar to the application developer. The original timing data, or other information, is then encoded through color, leading to a more intuitive visualization. Furthermore, we use the discrete nature of logical timelines to cluster processes according to their local behavior, leading to a scalable visualization of even long traces on large process counts. We demonstrate our system using two case studies on large-scale parallel codes.
KW - Information visualization
KW - performance analysis
KW - software visualization
KW - timelines
KW - traces
UR - http://www.scopus.com/inward/record.url?scp=84910066352&partnerID=8YFLogxK
U2 - 10.1109/TVCG.2014.2346456
DO - 10.1109/TVCG.2014.2346456
M3 - Article
AN - SCOPUS:84910066352
SN - 1077-2626
VL - 20
SP - 2349
EP - 2358
JO - IEEE Transactions on Visualization and Computer Graphics
JF - IEEE Transactions on Visualization and Computer Graphics
IS - 12
M1 - 6876005
ER -