A tailored regression for learned indexes: Logarithmic error regression

Martin Eppert, Philipp Fent, Thomas Neumann

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

5 Scopus citations

Abstract

Although linear regressions are essential for learned index structures, most implementations use Simple Linear Regression, which optimizes the squared error. Since learned indexes use exponential search, regressions that optimize the logarithmic error are much better tailored for the use-case. By using this fitting optimization target, we can significantly improve learned index's lookup performance with no architectural changes. While the log-error is harder to optimize, our novel algorithms and optimization heuristics can bring a practical performance improvement of the lookup latency. Even in cases where fast build times are paramount, log-error regressions still provide a robust fallback for degenerated leaf models. The resulting regressions are much better suited for learned indexes, and speed up lookups on data sets with outliers by over a factor of 2.

Original languageEnglish
Title of host publicationProceedings of the 4th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2021
PublisherAssociation for Computing Machinery, Inc
Pages9-15
Number of pages7
ISBN (Electronic)9781450385350
DOIs
StatePublished - 20 Jun 2021
Event4th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2021 - Virtual, Online, China
Duration: 20 Jun 202125 Jun 2021

Publication series

NameProceedings of the 4th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2021

Conference

Conference4th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management, aiDM 2021
Country/TerritoryChina
CityVirtual, Online
Period20/06/2125/06/21

Fingerprint

Dive into the research topics of 'A tailored regression for learned indexes: Logarithmic error regression'. Together they form a unique fingerprint.

Cite this