Scalable Infrastructure for Workload Characterization of Cluster Traces

Thomas van Loo, Anshul Jindal, Shajulin Benedict, Mohak Chadha, Michael Gerndt

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Workload characterization has recently been used to gain a foothold in the emerging serverless cloud market, especially for the large production cloud clusters of providers such as Google and AWS. Although analyzing and characterizing real workloads from a large production cloud cluster benefits cloud providers, researchers, and everyday users, analyzing the workload traces of these clusters remains an arduous task because of the heterogeneous nature of the data. This paper proposes a scalable infrastructure based on Google’s Dataproc for analyzing the workload traces of cloud environments. We evaluate the proposed infrastructure using the Google cluster-usage-traces-v3 workload traces. We characterize the workload in this dataset, focusing on the heterogeneity of the workload, the variations in job durations, aspects of resource consumption, and the overall availability of resources provided by the cluster. The findings reported in the paper will benefit cloud infrastructure providers and users in managing cloud computing resources, especially on serverless platforms.

Original language: English
Title of host publication: Proceedings of the 12th International Conference on Cloud Computing and Services Science, CLOSER 2022
Editors: Maarten van Steen, Donald Ferguson, Claus Pahl
Publisher: Science and Technology Publications, Lda
Pages: 254-263
Number of pages: 10
ISBN (Electronic): 9789897585708
State: Published - 2022
Event: 12th International Conference on Cloud Computing and Services Science, CLOSER 2022 - Virtual, Online
Duration: 27 Apr 2022 to 29 Apr 2022

Publication series

Name: International Conference on Cloud Computing and Services Science, CLOSER - Proceedings
ISSN (Electronic): 2184-5042

Conference

Conference: 12th International Conference on Cloud Computing and Services Science, CLOSER 2022
City: Virtual, Online
Period: 27/04/22 to 29/04/22

Keywords

  • Cloud Computing
  • Dataproc
  • Google Cloud
  • Google Cluster Traces
  • Scalable
  • Workload Characterization
