Dynamic power sharing for higher job throughput

Daniel A. Ellsworth, Allen D. Malony, Barry Rountree, Martin Schulz

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

47 Scopus citations

Abstract

Current trends for high-performance systems are leading towards hardware overprovisioning, where it is no longer possible to run all components at peak power without exceeding a system- or facility-wide power bound. The standard practice of static power scheduling is likely to lead to inefficiencies, with over- and under-provisioning of power to components at runtime. In this paper we investigate the performance and scalability of an application-agnostic runtime power scheduler (POWsched) that is capable of enforcing a system-wide power limit. Our experimental results show POWsched is robust, has negligible overhead, and can take advantage of opportunities to shift wasted power to more power-intensive applications, improving overall workload runtime by as much as 14% without job scheduler integration or application-specific profiling. In addition, we conduct scalability studies to determine POWsched's overhead for large node counts. Lastly, we contribute a model and simulator (POWsim) for investigating dynamic power scheduling behavior and enforcement at scale.
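As a rough illustration of what shifting unused power under a fixed system-wide bound can look like, the Python sketch below reclaims allocation from nodes that draw well below their cap and hands it to nodes running at their cap. It is not POWsched's actual algorithm, which the abstract does not describe. The `Node` type, the `rebalance` function, the `headroom_w` margin, and the 50 W and 115 W per-node limits are all hypothetical, and the sketch assumes the incoming caps already sum to no more than the bound.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cap_w: float   # current power cap in watts (hypothetical field)
    used_w: float  # measured draw over the last interval in watts


def rebalance(nodes, system_bound_w, min_cap_w=50.0, max_cap_w=115.0,
              headroom_w=5.0):
    """Return new per-node caps whose sum stays within system_bound_w.

    Illustrative only: assumes the current caps already respect the bound
    and that every current cap is at least min_cap_w.
    """
    caps = {}
    hungry = []  # nodes running at (or near) their current cap
    for n in nodes:
        if n.used_w + headroom_w < n.cap_w:
            # Reclaim: lower the cap to recent usage plus a small margin.
            caps[n.name] = min(n.cap_w, max(min_cap_w, n.used_w + headroom_w))
        else:
            caps[n.name] = n.cap_w
            hungry.append(n.name)

    # Redistribute the reclaimed power evenly among the hungry nodes,
    # never raising any cap above max_cap_w.
    spare = system_bound_w - sum(caps.values())
    while spare > 1e-9 and hungry:
        share = spare / len(hungry)
        still_hungry = []
        for name in hungry:
            grant = min(share, max_cap_w - caps[name])
            caps[name] += grant
            spare -= grant
            if caps[name] < max_cap_w - 1e-9:
                still_hungry.append(name)
        if len(still_hungry) == len(hungry):
            break  # every node took its full share, nothing left to give
        hungry = still_hungry
    return caps


if __name__ == "__main__":
    nodes = [Node("a", cap_w=80.0, used_w=40.0),
             Node("b", cap_w=80.0, used_w=79.0)]
    print(rebalance(nodes, system_bound_w=160.0))
```

In this sketch the total of the returned caps never exceeds the bound, so the limit is enforced by construction. A real scheduler would also need to measure power and apply caps on each node (for example through RAPL, listed in the keywords below), which is outside what this example covers.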

Original language: English
Title of host publication: Proceedings of SC 2015
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Electronic): 9781450337236
State: Published - 15 Nov 2015
Externally published: Yes
Event: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015 - Austin, United States
Duration: 15 Nov 2015 to 20 Nov 2015

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 15-20-November-2015
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Country/Territory: United States
City: Austin
Period: 15/11/15 to 20/11/15

Keywords

  • HPC
  • RAPL
  • hardware over-provisioning
  • power bound
