SASPAR: Shared Adaptive Stream Partitioning

Jeyhun Karimov, Hans-Arno Jacobsen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Data partitioning induces network transfers and dominates the cost of stream data analytics. Moreover, partitioning streaming data for multiple stream queries in the same cluster can easily saturate the network bandwidth and lead to high end-to-end latencies. The goal of this paper is to share the partition operation in streaming workloads and maximize the sharing opportunities for multiple stream queries. However, there are several challenges, such as minimizing data copy, optimizing the partitioning strategy for multiple queries, and minimizing latency. We propose SASPAR, Shared Adaptive Stream Partitioner, which is able to share data partitioning among multiple stream queries. Our contributions are threefold. First, we propose a new technique to optimize the partitioning strategy for multiple stream queries. Second, we present an adaptive query execution framework that performs optimizations at run-time, without stopping the query execution plan. Third, we utilize meta-heuristics and machine learning when solving the underlying optimization problem takes more time than expected. SASPAR is designed as a versatile layer to sit on top of a stream processing engine (SPE). We operate SASPAR on top of three state-of-the-art SPEs with hundreds of stream queries. Our experimental results show that SASPAR improves the performance (throughput and latency) of all underlying SPEs by up to 3x.
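To make the idea of sharing a partition operation concrete, the following is a minimal illustrative sketch, not the algorithm from the paper. It assumes a toy setting in which each query partitions the stream on a single integer-valued field; the names `partition_of`, `naive_transfers`, `shared_route`, and the sample data are all hypothetical. It compares the number of network transfers when every query partitions the stream independently with the number when each tuple is sent at most once to every destination partition and tagged with the queries that need it there.

```python
from collections import defaultdict

def partition_of(value, num_partitions):
    # Toy deterministic partitioner for integer keys (an assumption of this sketch).
    return value % num_partitions

def naive_transfers(tuples, queries):
    # Unshared case: every query shuffles every tuple on its own.
    return len(tuples) * len(queries)

def shared_route(tuples, queries, num_partitions):
    # Shared case: per tuple, collect the queries that map it to each
    # partition, then send one copy per distinct destination partition.
    transfers = 0
    routed = defaultdict(list)  # partition id -> [(tuple, [query names])]
    for t in tuples:
        dest = defaultdict(list)
        for name, key in queries.items():
            dest[partition_of(t[key], num_partitions)].append(name)
        for p, names in dest.items():
            routed[p].append((t, names))
            transfers += 1
    return transfers, routed

# Hypothetical sample data: q1 and q2 partition on "user", q3 on "item".
tuples = [{"user": 1, "item": 4}, {"user": 2, "item": 3}, {"user": 3, "item": 7}]
queries = {"q1": "user", "q2": "user", "q3": "item"}

shared, routed = shared_route(tuples, queries, num_partitions=4)
print(naive_transfers(tuples, queries), shared)
```

With this sample data the unshared case performs 9 transfers and the shared case 5. The sketch ignores everything that makes the real problem hard (choosing the partitioning strategy across queries, adapting at run-time, and bounding latency), which is what the abstract attributes to SASPAR.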

Original language: English
Title of host publication: Proceedings - 2023 IEEE 39th International Conference on Data Engineering, ICDE 2023
Publisher: IEEE Computer Society
Pages: 922-935
Number of pages: 14
ISBN (Electronic): 9798350322279
DOIs
State: Published - 2023
Externally published: Yes
Event: 39th IEEE International Conference on Data Engineering, ICDE 2023 - Anaheim, United States
Duration: 3 Apr 2023 - 7 Apr 2023

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2023-April
ISSN (Print): 1084-4627

Conference

Conference: 39th IEEE International Conference on Data Engineering, ICDE 2023
Country/Territory: United States
City: Anaheim
Period: 3/04/23 - 7/04/23

Keywords

  • Stream processing
  • shared data partitioning
