Scalable analytics on fast data

Andreas Kipf, Varun Pandey, Jan Böttcher, Lucas Braun, Thomas Neumann, Alfons Kemper

Research output: Contribution to journal › Article › peer-review

7 Scopus citations

Abstract

Today's streaming applications demand increasingly high event throughput rates and are often subject to strict latency constraints. To allow for more complex workloads, such as window-based aggregations, streaming systems need to support stateful event processing. This introduces new challenges for streaming engines as the state needs to be maintained in a consistent and durable manner and simultaneously accessed by complex queries for real-time analytics. Modern streaming systems, such as Apache Flink, do not allow for efficiently exposing the state to analytical queries. Thus, data engineers are forced to keep the state in external data stores, which significantly increases the latencies until events become visible to analytical queries. Proprietary solutions have been created to meet data freshness constraints. These solutions are expensive, error-prone, and difficult to maintain. Main-memory database systems, such as HyPer, achieve extremely low query response times while maintaining high update rates, which makes them well-suited for analytical streaming workloads. In this article, we explore extensions to database systems to match the performance and usability of streaming systems.

Original language: English
Article number: 1
Journal: ACM Transactions on Database Systems
Volume: 44
Issue number: 1
State: Published - Jan 2019

Keywords

  • Event processing
  • Multi-version concurrency control
  • Real-time analytics
  • User-space networking
