TY - GEN
T1 - Power efficient job scheduling by predicting the impact of processor manufacturing variability
AU - Chasapis, Dimitrios
AU - Moretó, Miquel
AU - Schulz, Martin
AU - Rountree, Barry
AU - Valero, Mateo
AU - Casas, Marc
N1 - Publisher Copyright: © 2019 ACM.
PY - 2019/6/26
Y1 - 2019/6/26
AB - Modern CPUs suffer from performance and power consumption variability due to the manufacturing process. As a result, systems that do not consider such variability caused by manufacturing issues lead to performance degradations and wasted power. In order to avoid such negative impact, users and system administrators must actively counteract any manufacturing variability. In this work we show that parallel systems benefit from taking into account the consequences of manufacturing variability when making scheduling decisions at the job scheduler level. We also show that it is possible to predict the impact of this variability on specific applications by using variability-aware power prediction models. Based on these power models, we propose two job scheduling policies that consider the effects of manufacturing variability for each application and that ensure that power consumption stays under a system-wide power budget. We evaluate our policies under different power budgets and traffic scenarios, consisting of both single- and multi-node parallel applications, utilizing up to 4096 cores in total. We demonstrate that they decrease job turnaround time, compared to contemporary scheduling policies used on production clusters, up to 31% while saving up to 5.5% energy.
KW - Energy efficient
KW - HPC
KW - Job scheduling
KW - Manufacturing variability
KW - Power prediction
UR - http://www.scopus.com/inward/record.url?scp=85074475762&partnerID=8YFLogxK
U2 - 10.1145/3330345.3330372
DO - 10.1145/3330345.3330372
M3 - Conference contribution
AN - SCOPUS:85074475762
T3 - Proceedings of the International Conference on Supercomputing
SP - 296
EP - 307
BT - ICS 2019 - International Conference on Supercomputing
PB - Association for Computing Machinery
T2 - 33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019
Y2 - 26 June 2019
ER -