Scalable garbage collection for in-memory MVCC systems

Jan Böttcher, Viktor Leis, Thomas Neumann, Alfons Kemper

Research output: Contribution to journal › Conference article › peer-review

37 Scopus citations

Abstract

To support Hybrid Transaction and Analytical Processing (HTAP), database systems generally rely on Multi-Version Concurrency Control (MVCC). While MVCC elegantly enables lightweight isolation of readers and writers, it also generates outdated tuple versions, which eventually have to be reclaimed. Surprisingly, we have found that in HTAP workloads, this reclamation of old versions, i.e., garbage collection, often becomes the performance bottleneck. It turns out that in the presence of long-running queries, state-of-the-art garbage collectors are too coarse-grained. As a consequence, the number of versions grows quickly, slowing down the entire system. Moreover, the standard background cleaning approach makes the system vulnerable to sudden spikes in workloads. In this work, we propose a novel garbage collection (GC) approach that prunes obsolete versions eagerly. Its seamless integration into transaction processing keeps the GC overhead minimal and ensures good scalability. We show that our approach handles mixed workloads well and also speeds up pure OLTP workloads like TPC-C compared to existing state-of-the-art approaches.

Original language: English
Pages (from-to): 128-141
Number of pages: 14
Journal: Proceedings of the VLDB Endowment
Volume: 13
Issue number: 2
State: Published - 2020
Event: 46th International Conference on Very Large Data Bases, VLDB 2020 - Virtual, Japan
Duration: 31 Aug 2020 → 4 Sep 2020
