Rethinking Logging, Checkpoints, and Recovery for High-Performance Storage Engines

Michael Haubenschild, Caetano Sauer, Thomas Neumann, Viktor Leis

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

29 Scopus citations

Abstract

For decades, ARIES has been the standard for logging and recovery in database systems. ARIES offers important features like support for arbitrary workloads, fuzzy checkpoints, and transparent index recovery. Nevertheless, many modern in-memory database systems use more lightweight approaches that have less overhead and better multi-core scalability but only work well for the in-memory setting. Recently, a new class of high-performance storage engines has emerged, which exploit fast SSDs to achieve performance close to pure in-memory systems but also allow out-of-memory workloads. For these systems, ARIES is too slow whereas in-memory logging proposals are not applicable. In this work, we propose a new logging and recovery design that supports incremental and fuzzy checkpointing, index recovery, out-of-memory workloads, and low-latency transaction commits. Our continuous checkpointing algorithm guarantees bounded recovery time. Using per-thread logging with minimal synchronization, our implementation achieves near-linear scalability on multi-core CPUs. We implemented and evaluated these techniques in our LeanStore storage engine. For working sets that fit in main memory, we achieve performance close to that of an in-memory approach, even with logging, checkpointing, and dirty page writing enabled. For the out-of-memory scenario, we outperform a state-of-the-art ARIES implementation by a factor of two.

Original languageEnglish
Title of host publicationSIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages877-892
Number of pages16
ISBN (Electronic)9781450367356
DOIs
StatePublished - 14 Jun 2020
Event2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020 - Portland, United States
Duration: 14 Jun 202019 Jun 2020

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print)0730-8078

Conference

Conference2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020
Country/TerritoryUnited States
CityPortland
Period14/06/2019/06/20

Keywords

  • checkpointing
  • leanstore
  • logging
  • recovery
  • storage engine

Fingerprint

Dive into the research topics of 'Rethinking Logging, Checkpoints, and Recovery for High-Performance Storage Engines'. Together they form a unique fingerprint.

Cite this