Tidy Tuples and Flying Start: fast compilation and fast execution of relational queries in Umbra

Timo Kersten, Viktor Leis, Thomas Neumann

Research output: Contribution to journal › Article › peer-review

32 Scopus citations

Abstract

Although compiling queries to efficient machine code has become a common approach for query execution, a number of newly created database system projects still refrain from using compilation. It is sometimes claimed that the intricacies of code generation make compilation-based engines too complex. Also, a major barrier for adoption, especially for interactive ad hoc queries, is long compilation time. In this paper, we examine all stages of compiling query execution engines and show how to reduce compilation overhead. We incorporate the lessons learned from a decade of generating code in HyPer into a design that manages complexity and yields high speed. First, we introduce a code generation framework that establishes abstractions to manage complexity, yet generates code in a single fast pass. Second, we present a program representation whose data structures are tuned to support fast code generation and compilation. Third, we introduce a new compiler backend that is optimized for minimal compile time, and simultaneously, yields superior execution performance to competing approaches, e.g., Volcano-style or bytecode interpretation. We implemented these optimizations in our database system Umbra to show that it is possible to unite fast compilation and fast execution. Indeed, Umbra achieves unprecedentedly low query latencies. On small data sets, it is even faster than interpreter engines like DuckDB and PostgreSQL. At the same time, on large data sets, its throughput is on par with the state-of-the-art compiling system HyPer.

Original language: English
Pages (from-to): 883-905
Number of pages: 23
Journal: VLDB Journal
Volume: 30
Issue number: 5
DOIs
State: Published - Sep 2021

Keywords

  • Code generation
  • Low latency
  • Relational query execution
