TY - JOUR
T1 - Scalable analytics on fast data
AU - Kipf, Andreas
AU - Pandey, Varun
AU - Böttcher, Jan
AU - Braun, Lucas
AU - Neumann, Thomas
AU - Kemper, Alfons
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s).
PY - 2019/1
Y1 - 2019/1
N2 - Today's streaming applications demand increasingly high event throughput rates and are often subject to strict latency constraints. To allow for more complex workloads, such as window-based aggregations, streaming systems need to support stateful event processing. This introduces new challenges for streaming engines as the state needs to be maintained in a consistent and durable manner and simultaneously accessed by complex queries for real-time analytics. Modern streaming systems, such as Apache Flink, do not allow for efficiently exposing the state to analytical queries. Thus, data engineers are forced to keep the state in external data stores, which significantly increases the latencies until events become visible to analytical queries. Proprietary solutions have been created to meet data freshness constraints. These solutions are expensive, error-prone, and difficult to maintain. Main-memory database systems, such as HyPer, achieve extremely low query response times while maintaining high update rates, which makes them well-suited for analytical streaming workloads. In this article, we explore extensions to database systems to match the performance and usability of streaming systems.
KW - Event processing
KW - Multi-version concurrency control
KW - Real-time analytics
KW - User-space networking
UR - http://www.scopus.com/inward/record.url?scp=85061189821&partnerID=8YFLogxK
U2 - 10.1145/3283811
DO - 10.1145/3283811
M3 - Article
AN - SCOPUS:85061189821
SN - 0362-5915
VL - 44
JO - ACM Transactions on Database Systems
JF - ACM Transactions on Database Systems
IS - 1
M1 - 1
ER -