TY - GEN
T1 - POSTER
T2 - 21st ACM International Conference on Computing Frontiers, CF 2024
AU - Schäffeler, Jakob
AU - Elis, Bengisu
AU - Raoofy, Amir
AU - Weidendorfer, Josef
AU - Schulz, Martin
N1 - Publisher Copyright: © 2024 Owner/Author.
PY - 2024/5/7
Y1 - 2024/5/7
N2 - GPUs increasingly dominate the High-Performance Computing ecosystem, so the ease of programming them is becoming ever more important. Standard, high-level offloading methods such as OpenMP offloading and OpenACC facilitate portable and efficient offloading across different GPU platforms. However, pinpointing and troubleshooting performance variations among different models, implementations, or architectures is challenging because of the varying abstraction levels and the different profilers employed. To tackle this problem and to untangle the performance issues of the various offloading abstractions and models that are intertwined in practice, we introduce a portable tool that enables the comparison of performance profiles acquired from various offloading models and GPU platforms. The tool first processes the profiles collected by different profilers to extract key performance metrics. It then uses plots that depict the metrics of all target variants for relative comparison. We demonstrate the tool's capabilities by discussing specific issues discovered with it when comparing OpenMP offloading and CUDA implementations of BabelStream.
AB - GPUs increasingly dominate the High-Performance Computing ecosystem, so the ease of programming them is becoming ever more important. Standard, high-level offloading methods such as OpenMP offloading and OpenACC facilitate portable and efficient offloading across different GPU platforms. However, pinpointing and troubleshooting performance variations among different models, implementations, or architectures is challenging because of the varying abstraction levels and the different profilers employed. To tackle this problem and to untangle the performance issues of the various offloading abstractions and models that are intertwined in practice, we introduce a portable tool that enables the comparison of performance profiles acquired from various offloading models and GPU platforms. The tool first processes the profiles collected by different profilers to extract key performance metrics. It then uses plots that depict the metrics of all target variants for relative comparison. We demonstrate the tool's capabilities by discussing specific issues discovered with it when comparing OpenMP offloading and CUDA implementations of BabelStream.
KW - GPU
KW - High performance computing
KW - Profiling
UR - http://www.scopus.com/inward/record.url?scp=85198906002&partnerID=8YFLogxK
U2 - 10.1145/3649153.3652997
DO - 10.1145/3649153.3652997
M3 - Conference contribution
AN - SCOPUS:85198906002
T3 - Proceedings of the 21st ACM International Conference on Computing Frontiers, CF 2024
SP - 320
EP - 321
BT - Proceedings of the 21st ACM International Conference on Computing Frontiers, CF 2024
PB - Association for Computing Machinery, Inc
Y2 - 7 May 2024 through 9 May 2024
ER -