An online incremental clustering framework for real-time stream analytics

Carlos Salort Sanchez, Radu Tudoran, Mohamad Al Hajj Hassan, Stefano Bortoli, Goetz Brasche, Jan Baumbach, Cristian Axenie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

With the evolution of data acquisition methods, our ability to collect real-time data has increased. This requires the development of real-time analytics that use the most recent data to generate valuable insights. One example is customer profiling, where we want to identify groups of similar clients who were active recently and improve the quality of the suggestions. Traditional clustering algorithms perform well on finite datasets, but their execution is often not compatible with real-time requirements, especially for rapidly changing trends. In this context, we propose a novel approach for defining incremental clustering algorithms that work within real-time constraints, in an online fashion, while preserving accuracy. We show the general applicability of the framework by applying this method to three different clustering algorithms. We compare the experimental results of the traditional and online approaches, evaluating accuracy and computational cost. The results show that algorithms executed in our framework are comparable to their offline implementations in terms of accuracy, with a high gain in execution time of up to three orders of magnitude on average.

Original language: English
Title of host publication: Proceedings - 18th IEEE International Conference on Machine Learning and Applications, ICMLA 2019
Editors: M. Arif Wani, Taghi M. Khoshgoftaar, Dingding Wang, Huanjing Wang, Naeem Seliya
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1480-1485
Number of pages: 6
ISBN (Electronic): 9781728145495
State: Published - Dec 2019
Event: 18th IEEE International Conference on Machine Learning and Applications, ICMLA 2019 - Boca Raton, United States
Duration: 16 Dec 2019 to 19 Dec 2019

Publication series

Name: Proceedings - 18th IEEE International Conference on Machine Learning and Applications, ICMLA 2019

Conference

Conference: 18th IEEE International Conference on Machine Learning and Applications, ICMLA 2019
Country/Territory: United States
City: Boca Raton
Period: 16/12/19 to 19/12/19

Keywords

  • Data Stream
  • Data Stream Clustering
  • Online Clustering
  • Online Learning
