TY - GEN
T1 - ScrubJay
T2 - 2017 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2017
AU - Gimenez, Alfredo
AU - Gamblin, Todd
AU - Bhatele, Abhinav
AU - Wood, Chad
AU - Shoga, Kathleen
AU - Marathe, Aniruddha
AU - Bremer, Peer-Timo
AU - Hamann, Bernd
AU - Schulz, Martin
N1 - Publisher Copyright: © 2017 ACM.
PY - 2017
Y1 - 2017
AB - Modern HPC centers comprise clusters, storage, networks, power and cooling infrastructure, and more. Analyzing the efficiency of these complex facilities is a daunting task. Increasingly, facilities deploy sensors and monitoring tools, but with millions of instrumented components, analyzing collected data manually is intractable. Data from an HPC center comprises different formats, granularities, and semantics, and handwritten scripts no longer suffice to transform the data into a digestible form. We present ScrubJay, an intuitive, scalable framework for automatic analysis of disparate HPC data. ScrubJay decouples the task of specifying data relationships from the task of analyzing data. Domain experts can store reusable transformations that describe relations between domains. ScrubJay also automates performance analysis. Analysts provide a query over logical domains of interest, and ScrubJay automatically derives needed steps to transform raw measurements. ScrubJay makes large-scale analysis tractable, reproducible, and provides insights into HPC facilities.
KW - Facility Monitoring
KW - HPC Performance Analysis
KW - Performance Tools
UR - http://www.scopus.com/inward/record.url?scp=85142259539&partnerID=8YFLogxK
U2 - 10.1145/3126908.3126935
DO - 10.1145/3126908.3126935
M3 - Conference contribution
AN - SCOPUS:85142259539
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - SC 2017 - International Conference for High Performance Computing, Networking, Storage and Analysis
PB - IEEE Computer Society
Y2 - 12 November 2017 through 17 November 2017
ER -