A BigBench implementation in the Hadoop ecosystem

Badrul Chowdhury, Tilmann Rabl, Pooya Saadatpanah, Jiang Du, Hans-Arno Jacobsen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

BigBench is the first proposal for an end-to-end big data analytics benchmark. It features a rich query set with complex, realistic queries. BigBench was developed based on the decision support benchmark TPC-DS. The first proof-of-concept implementation was built for the Teradata Aster parallel database system, and the queries were formulated in the proprietary SQL-MR query language. To test other systems, the queries have to be translated.

In this paper, an alternative implementation of BigBench for the Hadoop ecosystem is presented. All 30 queries of BigBench were realized using Apache Hive, Apache Hadoop, Apache Mahout, and NLTK. We present the design choices we made and show a proof-of-concept evaluation.

Original language: English
Title of host publication: Advancing Big Data Benchmarks - Proceedings of the 2013 Workshop Series on Big Data Benchmarking WBDB.cn and WBDB.us, Revised Selected Papers
Editors: Meikel Poess, Tilmann Rabl, Hans-Arno Jacobsen, Chaitanya Baru, Nambiar Raghunath, Milind Bhandarkar
Publisher: Springer Verlag
Pages: 3-18
Number of pages: 16
ISBN (Electronic): 9783319105956
DOIs
State: Published - 2014
Externally published: Yes
Event: 2013 Workshop Series on Big Data Benchmarking, WBDB.cn and WBDB.us. - Boston, United States
Duration: 14 Sep 2014 to 18 Sep 2014

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 8585
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2013 Workshop Series on Big Data Benchmarking, WBDB.cn and WBDB.us.
Country/Territory: United States
City: Boston
Period: 14/09/14 to 18/09/14
