Distributed Stream KNN Join

Amirhesam Shahvarani, Hans Arno Jacobsen

Research output: Contribution to journalConference articlepeer-review

7 Scopus citations


kNN join over data streams is an important operation for location-aware systems, which correlates events from different sources based on their occurrence locations. Combining the complexity of kNN join and the dynamicity of data streams, kNN join in streaming environments is a computationally intensive operator, and its performance can be greatly improved by utilizing the computational capabilities of modern non-uniform memory access (NUMA) computing platforms. However, the conventional approaches to kNN join for prestored datasets do not work efficiently with the kind of highly dynamic data found in streaming environments. Therefore, in this paper, we introduce an adaptive scalable stream kNN join, named ADS-kNN, to address the challenges of performing the kNN join operation on highly dynamic data. We propose a multistage kNN execution plan that enables high-performance kNN queries in distributed settings by overlapping the computation and communication stages. Moreover, we propose an adaptive data partitioning scheme that dynamically adjusts the load among the operators according to the changes in the input values. Combining these two techniques, ADS-kNN provides a scalable and adaptive kNN join operator for data streams. Our experiments using a 56-core system show that ADS-kNN achieves a maximum throughput that is 21 times higher than that of a single-threaded approach.

Original languageEnglish
Pages (from-to)1597-1609
Number of pages13
JournalProceedings of the ACM SIGMOD International Conference on Management of Data
StatePublished - 2021
Externally publishedYes
Event2021 International Conference on Management of Data, SIGMOD 2021 - Virtual, Online, China
Duration: 20 Jun 202125 Jun 2021


  • data streams
  • distributed computing
  • nearest neighbor join


Dive into the research topics of 'Distributed Stream KNN Join'. Together they form a unique fingerprint.

Cite this