TY - GEN
T1 - A scalable circular pipeline design for multi-way stream joins in hardware
AU - Najafi, Mohammadreza
AU - Sadoghi, Mohammad
AU - Jacobsen, Hans Arno
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/10/24
Y1 - 2018/10/24
N2 - Efficient real-Time analytics are an integral part of a growing number of data management applications such as computational targeted advertising, algorithmic trading, and Internet of Things. In this paper, we primarily focus on accelerating stream joins, arguably one of the most commonly used and resource-intensive operators in stream processing. We propose a scalable circular pipeline design (Circular-MJ) in hardware to orchestrate multi-way join while minimizing data flow disruption. In this circular design, each new tuple (given its origin stream) starts its processing from a specific join core and passes through all respective join cores in a pipeline sequence to produce final results. We further present a novel two-stage pipeline stream join (Stashed-MJ) that uses a best-effort buffering technique (stash) to maintain intermediate results. In a case that an overwrite is detected in the stash, our design automatically resorts to recomputing intermediate results. Our experimental results demonstrate a linear throughput scaling with respect to the number of execution units in hardware.
AB - Efficient real-Time analytics are an integral part of a growing number of data management applications such as computational targeted advertising, algorithmic trading, and Internet of Things. In this paper, we primarily focus on accelerating stream joins, arguably one of the most commonly used and resource-intensive operators in stream processing. We propose a scalable circular pipeline design (Circular-MJ) in hardware to orchestrate multi-way join while minimizing data flow disruption. In this circular design, each new tuple (given its origin stream) starts its processing from a specific join core and passes through all respective join cores in a pipeline sequence to produce final results. We further present a novel two-stage pipeline stream join (Stashed-MJ) that uses a best-effort buffering technique (stash) to maintain intermediate results. In a case that an overwrite is detected in the stash, our design automatically resorts to recomputing intermediate results. Our experimental results demonstrate a linear throughput scaling with respect to the number of execution units in hardware.
KW - FPGA
KW - Hardware acceleration
KW - Heterogeneous hardware
KW - Parallel algorithms
KW - Query optimization
KW - Stream Joins
UR - http://www.scopus.com/inward/record.url?scp=85057113026&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2018.00130
DO - 10.1109/ICDE.2018.00130
M3 - Conference contribution
AN - SCOPUS:85057113026
T3 - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
SP - 1284
EP - 1287
BT - Proceedings - IEEE 34th International Conference on Data Engineering, ICDE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE International Conference on Data Engineering, ICDE 2018
Y2 - 16 April 2018 through 19 April 2018
ER -