TY - GEN
T1 - A Machine Learning Framework for Performance Coverage Analysis of Proxy Applications
AU - Islam, Tanzima Z.
AU - Thiagarajan, Jayaraman J.
AU - Bhatele, Abhinav
AU - Schulz, Martin
AU - Gamblin, Todd
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/7/2
Y1 - 2016/7/2
AB - Proxy applications are written to represent subsets of the performance behaviors of larger, more complex applications that often have distribution restrictions. They enable easy evaluation of these behaviors across systems, e.g., for procurement or co-design purposes. However, the intended correlation between the performance behaviors of proxy applications and their parent codes is often based solely on the developer's intuition. In this paper, we present novel machine learning techniques to methodically quantify the coverage of the performance behaviors of parent codes by their proxy applications. We have developed a framework, VERITAS, to answer two questions in the context of on-node performance: a) which hardware resources are covered by a proxy application and how well, and b) which resources are important but not covered. We present our techniques in the context of two benchmarks, STREAM and DGEMM, and two production applications, OpenMC and CMTnek, and their respective proxy applications.
KW - Machine learning
KW - Performance analysis
KW - Scalability
KW - Unsupervised learning
UR - http://www.scopus.com/inward/record.url?scp=85017250789&partnerID=8YFLogxK
U2 - 10.1109/SC.2016.45
DO - 10.1109/SC.2016.45
M3 - Conference contribution
AN - SCOPUS:85017250789
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
SP - 538
EP - 549
BT - Proceedings of SC 2016
PB - IEEE Computer Society
T2 - 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2016
Y2 - 13 November 2016 through 18 November 2016
ER -