Bridging the Gap between Data Lakes and RDBMSs: Efficient Query Processing with Parquet

Alice Rey, Thomas Neumann

Publication: Contribution to journal › Conference article › Peer-reviewed

Abstract

In the age of massive data, databases are becoming less convenient for data exploration tasks because of their costly loading phase. Still, the highly optimized query engines of database systems are greatly beneficial for the performance of data analysis tasks. With our research, we want to bridge this gap and provide paramount analytical performance without the need for static data loading. Our approach enables the convenient integration of Parquet files, one of the most widely used columnar file formats in the data lake context, into the data processing pipeline of a database system. We allow end users to benefit from database system performance without a costly and time-consuming loading phase.

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3651
Publication status: Published - 2024
Event: Workshops of the EDBT/ICDT 2024 Joint Conference, EDBT/ICDT-WS 2024 - Paestum, Italy
Duration: 25 March 2024 → …
