TY - GEN
T1 - Exploiting Reduced Precision for GPU-based Time Series Mining
AU - Ju, Yi
AU - Raoofy, Amir
AU - Yang, Dai
AU - Laure, Erwin
AU - Schulz, Martin
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - The mining of multi-dimensional time series is a crucial step in gaining insights into data obtained from physical systems and from monitoring infrastructures. A widely accepted approach for this challenge is the matrix profile, which, however, is computationally very expensive. It relies on calculating large correlation matrices coupled with sort operations across all dimensions of the data, as well as on performing inclusive scans. All of these steps are inherently data parallel and can, therefore, benefit from execution on GPUs, and even more so from horizontal scaling on multiple GPUs. In addition, the nature of the matrix profile calculation allows the exploitation of reduced precision on GPUs. This offers further improvements to enable the analysis of ever growing data sets in real-world scenarios. Based on these motivations, we introduce the first parallel algorithm for multi-dimensional matrix profile on multiple GPUs exploiting reduced precision modes and provide a highly optimized implementation using novel optimization techniques. On one NVIDIA A100 GPU, our implementation achieves a 54x performance improvement in comparison to an optimized single-node execution on a state-of-the-art CPU-based implementation relying on double-precision computation and an additional factor of 1.4x when switching to reduced precision while maintaining sufficient accuracy. We study the accuracy and performance trade-offs for our proposed algorithm in detail and present synthetic and real-world case studies to demonstrate how the reduced precision improves the performance, while accomplishing sufficiently accurate results.
AB - The mining of multi-dimensional time series is a crucial step in gaining insights into data obtained from physical systems and from monitoring infrastructures. A widely accepted approach for this challenge is the matrix profile, which, however, is computationally very expensive. It relies on calculating large correlation matrices coupled with sort operations across all dimensions of the data, as well as on performing inclusive scans. All of these steps are inherently data parallel and can, therefore, benefit from execution on GPUs, and even more so from horizontal scaling on multiple GPUs. In addition, the nature of the matrix profile calculation allows the exploitation of reduced precision on GPUs. This offers further improvements to enable the analysis of ever growing data sets in real-world scenarios. Based on these motivations, we introduce the first parallel algorithm for multi-dimensional matrix profile on multiple GPUs exploiting reduced precision modes and provide a highly optimized implementation using novel optimization techniques. On one NVIDIA A100 GPU, our implementation achieves a 54x performance improvement in comparison to an optimized single-node execution on a state-of-the-art CPU-based implementation relying on double-precision computation and an additional factor of 1.4x when switching to reduced precision while maintaining sufficient accuracy. We study the accuracy and performance trade-offs for our proposed algorithm in detail and present synthetic and real-world case studies to demonstrate how the reduced precision improves the performance, while accomplishing sufficiently accurate results.
KW - Multi-GPU algorithms
KW - data mining
KW - matrix profile
KW - multi-dimensional time series
KW - reduced precision
UR - http://www.scopus.com/inward/record.url?scp=85136338890&partnerID=8YFLogxK
U2 - 10.1109/IPDPS53621.2022.00021
DO - 10.1109/IPDPS53621.2022.00021
M3 - Conference contribution
AN - SCOPUS:85136338890
T3 - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium, IPDPS 2022
SP - 124
EP - 134
BT - Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium, IPDPS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 36th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2022
Y2 - 30 May 2022 through 3 June 2022
ER -