Time series mining at petascale performance

Amir Raoofy, Roman Karlstetter, Dai Yang, Carsten Trinitis, Martin Schulz

Publication: Chapter in book/report/conference proceedings, conference contribution, peer-reviewed

2 citations (Scopus)

Abstract

The mining of time series data plays an important role in modern information retrieval and analysis systems. In particular, the identification of similarities within and across time series has garnered significant attention and effort over the last few years. For this task, the class of matrix profile algorithms, which create a generic structure that encodes correlations among records and dimensions—the matrix profile—is a promising approach, as it allows simplified post-processing and analysis steps by examining the resulting matrix profile structure. However, it is expensive to create a matrix profile: it requires significant computational power to evaluate the distance among all subsequence pairs in a time series, especially for very long and multi-dimensional time series with a large dimensionality. Existing approaches are limited in their scalability, as they do not target High Performance Computing systems, and—for most realistic problems—are suited only for datasets with a small dimensionality. In this paper, we introduce a novel MPI-based approach for the calculation of a matrix profile for multi-dimensional time series that pushes these limits. We evaluate the efficiency of our approach using an analytical performance model combined with experimental data. Finally, we demonstrate our solution on a 128-dimensional time series dataset of 1 million records, solving 274 trillion sorts at a sustained 1.3 Petaflop/s performance on the SuperMUC-NG system.
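
To make the cost described in the abstract concrete, the following is a minimal brute-force sketch of a matrix profile for a single-dimensional series. It is not the MPI-based multi-dimensional method of the paper. The function names `znorm` and `matrix_profile`, the use of z-normalized Euclidean distance, and the exclusion zone of half a window are assumptions chosen for illustration.

```python
import math

def znorm(window):
    """Z-normalize one subsequence; a constant window maps to all zeros."""
    m = len(window)
    mean = sum(window) / m
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / m)
    if std == 0.0:
        return [0.0] * m
    return [(x - mean) / std for x in window]

def matrix_profile(series, m):
    """Brute-force matrix profile of a 1-D series for window length m.

    Returns (profile, index): profile[i] is the z-normalized Euclidean
    distance from subsequence i to its nearest non-trivial neighbor, and
    index[i] is the start position of that neighbor.
    """
    n_sub = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n_sub)]
    excl = max(1, m // 2)  # assumed exclusion zone: skip overlapping matches
    profile = [math.inf] * n_sub
    index = [-1] * n_sub
    for i in range(n_sub):
        for j in range(n_sub):
            if abs(i - j) < excl:
                continue
            d = math.sqrt(sum((a - b) ** 2
                              for a, b in zip(windows[i], windows[j])))
            if d < profile[i]:
                profile[i], index[i] = d, j
    return profile, index

# A short series whose first and last five values repeat the same pattern.
profile, index = matrix_profile([0, 1, 3, 1, 0, 5, 2, 7, 7, 1, 0, 1, 3, 1, 0], 5)
```

In this sketch every subsequence is compared with every other one, so the work grows quadratically with the series length, which illustrates why long, high-dimensional inputs motivate a distributed approach.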

Original language: English
Title of host publication: High Performance Computing - 35th International Conference, ISC High Performance 2020, Proceedings
Editors: Ponnuswamy Sadayappan, Bradford L. Chamberlain, Guido Juckeland, Hatem Ltaief
Publisher: Springer
Pages: 104-123
Number of pages: 20
ISBN (Print): 9783030507428
DOIs
Publication status: Published - 2020
Event: 35th International Conference on High Performance Computing, ISC High Performance 2020 - Frankfurt, Germany
Duration: 22 June 2020 to 25 June 2020

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12151 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Conference

Conference: 35th International Conference on High Performance Computing, ISC High Performance 2020
Country/Territory: Germany
City: Frankfurt
Period: 22/06/20 to 25/06/20
