TY - GEN
T1 - SplitJoin
T2 - 2016 USENIX Annual Technical Conference, USENIX ATC 2016
AU - Najafi, Mohammadreza
AU - Sadoghi, Mohammad
AU - Jacobsen, Hans Arno
N1 - Publisher Copyright: © 2016 by The USENIX Association. All Rights Reserved.
PY - 2016
Y1 - 2016
AB - There is rising interest in accelerating stream processing through modern parallel hardware, yet it remains a challenge to exploit the available resources to achieve higher throughput without sacrificing latency due to the increased length of the processing pipeline and communication path and the need for central coordination. To achieve these objectives, we introduce a novel top-down data flow model for stream join processing (arguably one of the most resource-intensive operators in stream processing), called SplitJoin, that operates by splitting the join operation into independent storing and processing steps that scale gracefully with the number of cores. Furthermore, SplitJoin eliminates the need for global coordination while preserving the order of input streams by rethinking how streams are channeled into distributed join computation cores, and it maintains the order of output streams through a novel distributed punctuation technique. In our experimental analysis, SplitJoin offered up to 60% improvement in throughput while reducing latency by up to 3.3X compared to state-of-the-art solutions.
UR - http://www.scopus.com/inward/record.url?scp=85023188645&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85023188645
T3 - Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016
SP - 493
EP - 505
BT - Proceedings of the 2016 USENIX Annual Technical Conference, USENIX ATC 2016
PB - USENIX Association
Y2 - 22 June 2016 through 24 June 2016
ER -