TY - JOUR
T1 - Scalable garbage collection for in-memory MVCC systems
AU - Böttcher, Jan
AU - Leis, Viktor
AU - Neumann, Thomas
AU - Kemper, Alfons
N1 - Publisher Copyright: © VLDB Endowment.
PY - 2020
Y1 - 2020
AB - To support Hybrid Transaction and Analytical Processing (HTAP), database systems generally rely on Multi-Version Concurrency Control (MVCC). While MVCC elegantly enables lightweight isolation of readers and writers, it also generates outdated tuple versions, which eventually have to be reclaimed. Surprisingly, we have found that in HTAP workloads, this reclamation of old versions, i.e., garbage collection, often becomes the performance bottleneck. It turns out that in the presence of long-running queries, state-of-the-art garbage collectors are too coarse-grained. As a consequence, the number of versions grows quickly, slowing down the entire system. Moreover, the standard background cleaning approach makes the system vulnerable to sudden spikes in workloads. In this work, we propose a novel garbage collection (GC) approach that prunes obsolete versions eagerly. Its seamless integration into transaction processing keeps the GC overhead minimal and ensures good scalability. We show that our approach handles mixed workloads well and also speeds up pure OLTP workloads like TPC-C compared to existing state-of-the-art approaches.
UR - http://www.scopus.com/inward/record.url?scp=85086259420&partnerID=8YFLogxK
U2 - 10.14778/3364324.3364328
DO - 10.14778/3364324.3364328
M3 - Conference article
AN - SCOPUS:85086259420
SN - 2150-8097
VL - 13
SP - 128
EP - 141
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 2
T2 - 46th International Conference on Very Large Data Bases, VLDB 2020
Y2 - 31 August 2020 through 4 September 2020
ER -