Living on the edge: Efficient handling of large scale sensor data

Roman Karlstetter, Amir Raoofy, Martin Radev, Carsten Trinitis, Jakob Hermann, Martin Schulz

Publication: Contribution to book/report/conference proceedings › Conference contribution › Peer-reviewed

4 citations (Scopus)

Abstract

Real-time sensor monitoring is critical in many industrial applications and is, e.g., used to model and predict operating conditions to optimize operations as well as to prevent damage in machinery and systems. In many cases, this data is generated by a myriad of sensors and stored or transmitted for post-processing by data analysts. Handling this data near its origin - on the edge - imposes significant challenges for storage and compression: it is necessary to store it in a format that is suitable for large data analytics algorithms, which in most cases means columnar storage. Furthermore, to provide efficient storage and transmission of such sensor data, it must be compressed efficiently. However, existing solutions do not address these challenges sufficiently. In this work, we present a holistic approach for fast streaming of large scale sensor data directly into columnar storage and integrate it with a proven compression scheme. Our approach uses a pipelined scheme for streaming and transposing the data layout, combined with a byte-level transformation of data representation and compression, which we evaluate in comprehensive experiments. As a result, our approach enables transformation of large scale sensor data streams into an efficient, analytics-friendly format already at the sensor site, i.e., on the edge, at data ingestion time. By implementing our optimized approach in the open and widely used columnar storage format Apache Parquet, which we already partly upstreamed, we ensure its accessibility to the community.
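The abstract names two steps: transposing streamed records into a columnar layout, and applying a byte-level transformation before compression. The following is a minimal sketch of that idea and not the authors' implementation. It assumes float32 readings, assumes the byte-level transformation groups the k-th byte of every value into its own stream, and uses zlib as a stand-in compressor. All function names are hypothetical.

```python
import struct
import zlib


def rows_to_columns(rows):
    """Transpose row-oriented sensor records into one list per sensor (column)."""
    return [list(column) for column in zip(*rows)]


def split_bytes(values):
    """Pack values as little-endian float32 and group the k-th byte of every
    value into its own contiguous stream (assumed byte-level transformation)."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return b"".join(raw[k::4] for k in range(4))


def merge_bytes(data):
    """Invert split_bytes: re-interleave the four streams and unpack float32."""
    n = len(data) // 4
    raw = bytes(data[k * n + i] for i in range(n) for k in range(4))
    return list(struct.unpack(f"<{n}f", raw))


def compress_column(values):
    """Transform one column at byte level, then compress (zlib as a stand-in)."""
    return zlib.compress(split_bytes(values))


def decompress_column(blob):
    """Decompress one column and undo the byte-level transformation."""
    return merge_bytes(zlib.decompress(blob))


# Hypothetical readings: each row holds one sample from two sensors.
rows = [(20.0, 1.5), (20.25, 1.5), (20.5, 1.75)]
columns = rows_to_columns(rows)
blobs = [compress_column(column) for column in columns]
restored = [decompress_column(blob) for blob in blobs]
```

The rationale for grouping bytes this way is that the high-order bytes of neighboring readings tend to be similar, so a general-purpose compressor may find longer repeated runs. Whether that pays off depends on the data and the compressor.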

Original language: English
Title: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Editors: Laurent Lefevre, Stacy Patterson, Young Choon Lee, Haiying Shen, Shashikant Ilager, Mohammad Goudarzi, Adel N. Toosi, Rajkumar Buyya
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-10
Number of pages: 10
ISBN (electronic): 9781728195865
Publication status: Published - May 2021
Event: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021 - Virtual, Melbourne, Australia
Duration: 10 May 2021 to 13 May 2021

Publication series

Name: Proceedings - 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021

Conference

Conference: 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGrid 2021
Country/Territory: Australia
City: Virtual, Melbourne
Period: 10/05/21 to 13/05/21
