Abstract
Data processing systems face the challenge of supporting increasingly diverse workloads efficiently. At the same time, they are already bloated with internal complexity, and it is not clear how new hardware can be supported sustainably. In this paper, we aim to resolve these issues by proposing a unified abstraction layer based on declarative sub-operators in addition to relational operators. Exposing this layer lets users express their non-relational workloads declaratively with sub-operators. Furthermore, the proposed sub-operators decouple the semantic implementation of operators from the efficient imperative implementation, reducing the implementation complexity of relational operators. Finally, through fine-grained automatic optimizations, the declarative sub-operators allow for automatic morsel-driven parallelism. We demonstrate the benefits not only by providing a specific set of sub-operators but also by implementing them in a compiling query engine. With thorough evaluation and analysis, we show that we can support a richer set of workloads while keeping development complexity low and remaining competitive in performance even with specialized systems.
Original language | English |
---|---|
Pages (from-to) | 3461-3474 |
Number of pages | 14 |
Journal | Proceedings of the VLDB Endowment |
Volume | 16 |
Issue number | 11 |
State | Published - 2023 |
Event | 49th International Conference on Very Large Data Bases, VLDB 2023 - Vancouver, Canada; Duration: 28 Aug 2023 → 1 Sep 2023 |