TY - GEN
T1 - Simple, Efficient, and Robust Hash Tables for Join Processing
AU - Birler, Altan
AU - Schmidt, Tobias
AU - Fent, Philipp
AU - Neumann, Thomas
N1 - Publisher Copyright: © 2024 ACM.
PY - 2024/6/10
Y1 - 2024/6/10
N2 - Hash joins play a critical role in relational data processing and their performance is crucial for the overall performance of a database system. Due to the hard-to-predict nature of intermediate results, an ideal hash join implementation has to be both fast for typical queries and robust against unusual data distributions. In this paper, we present our simple, yet effective unchained in-memory hash table design. Unchained tables combine the techniques of build side partitioning, adjacency array layout, pipelined probes, Bloom filters, and software write-combine buffers to achieve significant improvements in n:m joins with skew, while preserving top-notch performance in 1:n joins. Our hash table outperforms open addressing by 2× on average in relational queries and both chaining and open addressing by up to 20× in graph processing queries.
AB - Hash joins play a critical role in relational data processing and their performance is crucial for the overall performance of a database system. Due to the hard-to-predict nature of intermediate results, an ideal hash join implementation has to be both fast for typical queries and robust against unusual data distributions. In this paper, we present our simple, yet effective unchained in-memory hash table design. Unchained tables combine the techniques of build side partitioning, adjacency array layout, pipelined probes, Bloom filters, and software write-combine buffers to achieve significant improvements in n:m joins with skew, while preserving top-notch performance in 1:n joins. Our hash table outperforms open addressing by 2× on average in relational queries and both chaining and open addressing by up to 20× in graph processing queries.
UR - http://www.scopus.com/inward/record.url?scp=85195816811&partnerID=8YFLogxK
U2 - 10.1145/3662010.3663442
DO - 10.1145/3662010.3663442
M3 - Conference contribution
AN - SCOPUS:85195816811
T3 - 20th International Workshop on Data Management on New Hardware, DaMoN 2024
BT - 20th International Workshop on Data Management on New Hardware, DaMoN 2024
PB - Association for Computing Machinery, Inc
T2 - 20th International Workshop on Data Management on New Hardware, DaMoN 2024
Y2 - 10 June 2024
ER -