BACR: Set similarities with lower bounds and application to spatial trajectories

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

8 Scopus citations

Abstract

This paper proposes a length-independent feature representation of sets of strings based on Bloom filters called BACR for similarity search in databases. Further, we show how a Z-curve-based discretization of geospatial trajectories can be used in order to search for similar trajectories in large databases. Additionally to the already-known estimation of the size of the union and the intersection of sets from Bloom filters, we propose a way to calculate an upper bound for the intersection and a lower bound for the union of sets. Consequently, we show that the Jaccard distance and many other similarity measures allow for a lower bound. This makes exact similarity search on large databases of this type feasible. Finally, we show that the Jaccard distance is incompatible with the union of sets and replace the Jaccard distance appropriately in a way such that even collections of sets of strings can be represented with a single BACR feature vector at least for similarity search applications. The algorithms are thoroughly evaluated and motivated by real-world examples.

Original languageEnglish
Title of host publication23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
EditorsYan Huang, Mohamed Ali, Jagan Sankaranarayanan, Matthias Renz, Michael Gertz
PublisherAssociation for Computing Machinery
ISBN (Electronic)9781450339674
DOIs
StatePublished - 3 Nov 2015
Externally publishedYes
Event23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015 - Seattle, United States
Duration: 3 Nov 20156 Nov 2015

Publication series

NameGIS: Proceedings of the ACM International Symposium on Advances in Geographic Information Systems
Volume03-06-November-2015

Conference

Conference23rd ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, ACM SIGSPATIAL GIS 2015
Country/TerritoryUnited States
CitySeattle
Period3/11/156/11/15

Keywords

  • Big data
  • Moving objects
  • Multi-modal trajectory
  • Trajectory

Fingerprint

Dive into the research topics of 'BACR: Set similarities with lower bounds and application to spatial trajectories'. Together they form a unique fingerprint.

Cite this