Parallel Processing Strategies for Big Geospatial Data

Research output: Contribution to journal › Review article › peer-review

9 Scopus citations

Abstract

This paper provides an abstract analysis of parallel processing strategies for spatial and spatio-temporal data. It isolates aspects such as data locality and computational locality, as well as redundancy and locally sequential access, as central elements of parallel algorithm design for spatial data. Furthermore, the paper gives some examples from simple and advanced GIS and spatial data analysis, highlighting both that big data systems existed long before the current big data hype and that they follow some design principles which are inevitable for spatial data, including distributed data structures and messaging, which are, however, incompatible with the popular MapReduce paradigm. Throughout this discussion, the need for a replacement or extension of the MapReduce paradigm for spatial data is derived. This paradigm should be able to deal with the imperfect data locality inherent to spatial data, which hinders full independence of non-trivial computational tasks. We conclude that more research is needed and that spatial big data systems should pick up more concepts like graphs, shortest paths, raster data, events, and streams at the same time, instead of solving exactly the set of spatially separable problems such as line simplification or range queries in many different ways.
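The interplay of imperfect data locality and redundancy that the abstract highlights can be illustrated with a common workaround: replicating points near tile borders (a "halo" or ghost zone) so that each partition can answer fixed-radius neighborhood queries without messaging other partitions. The following is a minimal sketch, not code from the paper; the function name, grid layout, and parameters are illustrative assumptions.

```python
# Sketch (illustrative, not from the paper): partition 2-D points into
# grid tiles with a halo of redundant copies, so each tile can serve
# fixed-radius neighbor queries locally, trading storage for independence.
from collections import defaultdict

def partition_with_halo(points, tile_size, halo):
    """Assign each point to its home tile plus every tile whose
    halo-expanded extent also contains it (redundant replication)."""
    tiles = defaultdict(list)
    for x, y in points:
        # Index range of tiles that must hold a copy of (x, y).
        ix_min = int((x - halo) // tile_size)
        ix_max = int((x + halo) // tile_size)
        iy_min = int((y - halo) // tile_size)
        iy_max = int((y + halo) // tile_size)
        for ix in range(ix_min, ix_max + 1):
            for iy in range(iy_min, iy_max + 1):
                tiles[(ix, iy)].append((x, y))
    return dict(tiles)

tiles = partition_with_halo(
    [(0.5, 0.5), (0.95, 0.5), (1.1, 0.5)], tile_size=1.0, halo=0.2)
# (0.95, 0.5) sits within 0.2 of the tile border at x = 1.0, so it is
# replicated into both tile (0, 0) and tile (1, 0).
```

Such replication is exactly the kind of deliberate redundancy that a pure MapReduce decomposition does not express naturally: the tiles are no longer disjoint, and the halo width couples the partitioning to the query radius.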

Original language: English
Article number: 44
Journal: Frontiers in Big Data
Volume: 2
DOIs
State: Published - 3 Dec 2019
Externally published: Yes

Keywords

  • MapReduce
  • big data
  • cloud computing
  • spatial HPC
  • spatial computing models
