Continuous Learning Graphical Knowledge Unit for Cluster Identification in High Density Data Sets

K. K.L.B. Adikaram, Mohamed A. Hussein, Mathias Effenberger, Thomas Becker

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Big data are visually cluttered by overlapping data points. Rather than removing, reducing or reformulating overlap, we propose a simple, effective and powerful technique for density cluster generation and visualization, where point marker (graphical symbol of a data point) overlap is exploited in an additive fashion in order to obtain bitmap data summaries in which clusters can be identified visually, aided by automatically generated contour lines. In the proposed method, the plotting area is a bitmap and the marker is a shape of more than one pixel. As the markers overlap, the red, green and blue (RGB) colour values of pixels in the shared region are added. Thus, a pixel of a 24-bit RGB bitmap can code up to 2^24 (over 16 million) overlaps. A higher number of overlaps at the same location makes the colour of this area identical, which can be identified by the naked eye. A bitmap is a matrix of colour values that can be represented as integers. The proposed method updates this matrix while adding new points. Thus, this matrix can be considered as an up-to-date knowledge unit of processed data. Results show cluster generation, cluster identification, missing and out-of-range data visualization, and outlier detection capability of the newly proposed method.
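
The additive overlap idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes NumPy, a square marker, and an increment of 1 per marker on a pixel stored as one packed 24-bit integer; the function names (add_point, to_rgb), the marker shape and the saturation at 2^24 - 1 are all illustrative assumptions. Contour line generation is not covered.

```python
import numpy as np

MAX_COUNT = 2**24 - 1  # largest value a packed 24-bit RGB pixel can hold


def add_point(canvas, x, y, half=1):
    """Stamp one square marker centred on column x, row y.

    Every pixel covered by the marker gains 1 (assumed increment), so the
    stored integer equals the number of markers overlapping that pixel.
    """
    h, w = canvas.shape
    # Clamp the marker's bounding box to the canvas; off-canvas parts are dropped.
    r0, r1 = np.clip([y - half, y + half + 1], 0, h)
    c0, c1 = np.clip([x - half, x + half + 1], 0, w)
    region = canvas[r0:r1, c0:c1]  # a view, so writes update the canvas
    # Saturate rather than wrap around once the 24 bits are exhausted.
    np.minimum(region + 1, MAX_COUNT, out=region)
    return canvas


def to_rgb(canvas):
    """Unpack each 24-bit integer into (R, G, B) bytes for display."""
    r = (canvas >> 16) & 0xFF
    g = (canvas >> 8) & 0xFF
    b = canvas & 0xFF
    return np.stack([r, g, b], axis=-1).astype(np.uint8)


# The canvas is the continuously updated "knowledge unit": new points are
# added one at a time without storing or revisiting earlier points.
canvas = np.zeros((8, 8), dtype=np.uint32)
for x, y in [(2, 2), (3, 3), (3, 3)]:
    add_point(canvas, x, y)

rgb = to_rgb(canvas)  # shape (8, 8, 3), ready for an image viewer
```

In this sketch the pixel at row 3, column 3 is covered by all three markers and so holds the value 3, while untouched pixels stay at 0. Because the matrix alone carries the density summary, it can be saved and updated later as further points arrive.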

Original language: English
Article number: 152
Journal: Symmetry
Volume: 8
Issue number: 12
State: Published - Dec 2016

Keywords

  • Big data
  • Clustering
  • Contour lines
  • Data and knowledge visualization
  • Knowledge retrieval
  • Mining methods and algorithms
  • Missing data
  • Real-time systems
