Spatial data locality in scalable and fault-tolerant distributed spatial computing systems

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

In the last decade, spatial datasets have grown from small collections of high-quality geospatial information into huge collections of data covering the whole planet in varying formats and of varying quality. Large-scale spatial datasets are about to create significant value in a variety of application fields, including navigation, autonomous driving, urban geography, agriculture, and climate research. Therefore, large datasets are actively acquired. In addition, social networks such as Facebook, Twitter, and Flickr provide crowd-sourced text, video, and images with associated geospatial information. These sources are highly interesting because they provide near-real-time insights into aspects of human behavior and dynamics. Finally, global and long-running satellite missions such as Landsat, Sentinel, WorldView, or TerraSAR add large amounts of geospatial information. These data collections pose challenges for the computational infrastructure used for spatial computing. Not only do we need a lot of computation, we also need to think about how to organize and design distributed systems that can help tackle the volume, velocity, and variety of current and future geospatial datasets. Modern big data systems employ data replication for two main reasons: first, for increased fault tolerance, and second, for higher flexibility in scheduling tasks across a large cluster of machines. This paper proposes and compares novel data replication schemata for scalable spatial computing and analyzes their impact on the communication complexity of global spatial joins between a large collection of tweets collected from the Twitter API and building polygons extracted from OpenStreetMap.

Original language: English
Title of host publication: Proceedings of the 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018
Publisher: Association for Computing Machinery, Inc
Pages: 47-56
Number of pages: 10
ISBN (Electronic): 9781450360418
State: Published - 6 Nov 2018
Externally published: Yes
Event: 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018 - Seattle, United States
Duration: 6 Nov 2018 → 6 Nov 2018

Publication series

Name: Proceedings of the 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018

Conference

Conference: 7th ACM SIGSPATIAL International Workshop on Analytics for Big Geospatial Data, BigSpatial 2018
Country/Territory: United States
City: Seattle
Period: 6/11/18 → 6/11/18

Keywords

  • Data Replication and Distribution; Spatial Join
  • Spatial Big Data
