Runtime-guided mitigation of manufacturing variability in power-constrained multi-socket NUMA nodes

Dimitrios Chasapis, Marc Casas, Miquel Moretó, Martin Schulz, Eduard Ayguadé, Jesus Labarta, Mateo Valero

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

Current large-scale systems show increasing power demands, to the point that power has become a major strain on facilities and budgets. Researchers in academia, labs and industry are focusing on dealing with this "power wall", striving to find a balance between performance and power consumption. Some commodity processors enable power capping, which opens up new opportunities for applications to directly manage their power behavior at user level. However, while power capping ensures that a system will never exceed a given power limit, it also leads to a new form of heterogeneity: natural manufacturing variability, which was previously hidden by varying power to achieve homogeneous performance, now results in heterogeneous performance, because enforcing the power limit forces different CPU frequencies, potentially for each core. In this work we show how a parallel runtime system can be used to effectively deal with this new kind of performance heterogeneity by compensating for the uneven effects of power capping. In the context of a NUMA node composed of several multi-core sockets, our system is able to optimize the energy and concurrency levels assigned to each socket to maximize performance. Because it is applied transparently within the parallel runtime system, it does not require any programmer interaction such as changing the application source code or manually reconfiguring the parallel system. We compare our novel runtime analysis with an offline approach and demonstrate that it can achieve equal performance at a fraction of the cost.

Original language: English
Title of host publication: Proceedings of the 2016 International Conference on Supercomputing, ICS 2016
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450343619
DOIs
State: Published - 1 Jun 2016
Externally published: Yes
Event: 30th International Conference on Supercomputing, ICS 2016 - Istanbul, Turkey
Duration: 1 Jun 2016 to 3 Jun 2016

Publication series

Name: Proceedings of the International Conference on Supercomputing
Volume: 01-03-June-2016

Conference

Conference: 30th International Conference on Supercomputing, ICS 2016
Country/Territory: Turkey
City: Istanbul
Period: 1/06/16 to 3/06/16

Keywords

  • High performance computing
  • Manufacturing variability
  • Parallel architectures
  • Parallel programming
  • Parallel runtimes
  • Power and energy
