Power efficient job scheduling by predicting the impact of processor manufacturing variability

Dimitrios Chasapis, Miquel Moretó, Martin Schulz, Barry Rountree, Mateo Valero, Marc Casas

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

12 Scopus citations

Abstract

Modern CPUs suffer from performance and power consumption variability due to the manufacturing process. As a result, systems that do not consider such variability caused by manufacturing issues lead to performance degradations and wasted power. In order to avoid such negative impact, users and system administrators must actively counteract any manufacturing variability. In this work we show that parallel systems benefit from taking into account the consequences of manufacturing variability when making scheduling decisions at the job scheduler level. We also show that it is possible to predict the impact of this variability on specific applications by using variability-aware power prediction models. Based on these power models, we propose two job scheduling policies that consider the effects of manufacturing variability for each application and that ensure that power consumption stays under a system-wide power budget. We evaluate our policies under different power budgets and traffic scenarios, consisting of both single- and multi-node parallel applications, utilizing up to 4096 cores in total. We demonstrate that they decrease job turnaround time, compared to contemporary scheduling policies used on production clusters, up to 31% while saving up to 5.5% energy.

Original languageEnglish
Title of host publicationICS 2019 - International Conference on Supercomputing
PublisherAssociation for Computing Machinery
Pages296-307
Number of pages12
ISBN (Electronic)9781450360791
DOIs
StatePublished - 26 Jun 2019
Event33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019 - Phoenix, United States
Duration: 26 Jun 2019 → …

Publication series

NameProceedings of the International Conference on Supercomputing

Conference

Conference33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019
Country/TerritoryUnited States
CityPhoenix
Period26/06/19 → …

Keywords

  • Energy efficient
  • HPC
  • Job scheduling
  • Manufacturing variability
  • Power prediction

Fingerprint

Dive into the research topics of 'Power efficient job scheduling by predicting the impact of processor manufacturing variability'. Together they form a unique fingerprint.

Cite this