TY - GEN
T1 - Freedom for the SQL-Lambda
T2 - 32nd International Conference on Scientific and Statistical Database Management, SSDBM 2020
AU - Schüle, Maximilian E.
AU - Huber, Jakob
AU - Kemper, Alfons
AU - Neumann, Thomas
N1 - Publisher Copyright: © 2020 ACM.
PY - 2020/7/7
Y1 - 2020/7/7
N2 - As part of the code-generating database system HyPer, SQL lambda functions allow user-defined metrics to be injected into data mining operators during compile time. Since version 11, PostgreSQL has supported just-in-time compilation with LLVM for expression evaluation. This enables the concept of SQL lambda functions to be transferred to this open-source database system. In this study, we extend PostgreSQL by adding two subquery types for lambda expressions that either pre-materialise the result or return a cursor to request tuples. We demonstrate the usage of these subquery types in conjunction with dedicated table functions for data mining algorithms such as PageRank, k-Means clustering and labelling. Furthermore, we allow four levels of optimisation for query execution, ranging from interpreted function calls to just-in-time-compiled execution. The latter, with some adjustments to PostgreSQL's execution engine, transforms our lambda functions into real user-injected code. In our evaluation with the LDBC social network benchmark for PageRank and the Chicago taxi data set for clustering, optimised lambda functions achieved comparable performance to hard-coded implementations and HyPer's data mining algorithms.
AB - As part of the code-generating database system HyPer, SQL lambda functions allow user-defined metrics to be injected into data mining operators during compile time. Since version 11, PostgreSQL has supported just-in-time compilation with LLVM for expression evaluation. This enables the concept of SQL lambda functions to be transferred to this open-source database system. In this study, we extend PostgreSQL by adding two subquery types for lambda expressions that either pre-materialise the result or return a cursor to request tuples. We demonstrate the usage of these subquery types in conjunction with dedicated table functions for data mining algorithms such as PageRank, k-Means clustering and labelling. Furthermore, we allow four levels of optimisation for query execution, ranging from interpreted function calls to just-in-time-compiled execution. The latter, with some adjustments to PostgreSQL's execution engine, transforms our lambda functions into real user-injected code. In our evaluation with the LDBC social network benchmark for PageRank and the Chicago taxi data set for clustering, optimised lambda functions achieved comparable performance to hard-coded implementations and HyPer's data mining algorithms.
KW - Database Operators
KW - Lambda Functions
KW - SQL
UR - http://www.scopus.com/inward/record.url?scp=85090423994&partnerID=8YFLogxK
U2 - 10.1145/3400903.3400915
DO - 10.1145/3400903.3400915
M3 - Conference contribution
AN - SCOPUS:85090423994
T3 - ACM International Conference Proceeding Series
BT - 32nd International Conference on Scientific and Statistical Database Management, SSDBM 2020, Proceedings
A2 - Pourabbas, Elaheh
A2 - Sacharidis, Dimitris
A2 - Stockinger, Kurt
A2 - Vergoulis, Thanasis
PB - Association for Computing Machinery
Y2 - 7 July 2020 through 9 July 2020
ER -