Towards Dynamic Resource Management with MPI Sessions and PMIx

Dominik Huber, Maximilian Streubel, Isaías Comprés, Martin Schulz, Martin Schreiber, Howard Pritchard

Publication: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

14 citations (Scopus)

Abstract

Job management software on peta- and exascale supercomputers continues to provide static resource allocations, from a program's start until its end. Dynamic resource allocation and management is a research direction that has the potential to improve the efficiency of HPC systems and applications by dynamically adapting the resources of an application during its runtime. Resources can be adapted based on past, current or even future system conditions and matching optimization targets. However, the implementation of dynamic resource management is challenging as it requires support across many layers of the software stack, including the programming model. In this paper, we focus on the latter and present our approach to extend MPI Sessions to support dynamic resource allocations within MPI applications. While some forms of dynamicity already exist in MPI, it is currently limited by requiring global synchronization, being application or application-domain specific, or by suffering from limited support in current HPC system software stacks. We overcome these limitations with a simple, yet powerful abstraction: resources as process sets, and changes of resources as set operations leading to a graph-based perspective on resource changes. As the main contribution of this work, we provide an implementation of this approach based on MPI Sessions and PMIx. In addition, an illustration of its usage is provided, as well as a discussion about the required extensions of the PMIx standard. We report results based on a prototype implementation with Open MPI using a synthetic application, as well as a PDE solver benchmark on up to four nodes and a total of 112 cores. Overall, our results show the feasibility of our approach, which has only very moderate overheads. We see this first proof-of-concept as an important step towards resource adaptivity based on MPI Sessions.
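The abstraction named in the abstract (resources as process sets, resource changes as set operations, and the resulting graph of changes) can be illustrated with a small sketch. This is not the paper's implementation and uses no MPI Sessions or PMIx calls. The class `PsetGraph`, its methods, and the set names such as `app://initial` are hypothetical and chosen only for illustration.

```python
# Hypothetical illustration only: none of these names come from MPI, PMIx,
# or the paper. Process sets are modelled as frozensets of process ids.
from dataclasses import dataclass, field


@dataclass
class PsetGraph:
    psets: dict = field(default_factory=dict)  # set name -> frozenset of process ids
    edges: list = field(default_factory=list)  # (operation, input names, output name)

    def define(self, name, procs):
        """Register a named process set."""
        self.psets[name] = frozenset(procs)

    def apply(self, op, inputs, output):
        """Derive a new named set from existing ones and record the graph edge."""
        sets = [self.psets[name] for name in inputs]
        if op == "union":
            result = frozenset().union(*sets)
        elif op == "difference":
            result = sets[0].difference(*sets[1:])
        else:
            raise ValueError(f"unsupported operation: {op}")
        self.psets[output] = result
        self.edges.append((op, tuple(inputs), output))
        return result


graph = PsetGraph()
graph.define("app://initial", range(4))   # processes 0..3 at start
graph.define("app://added", range(4, 8))  # processes 4..7 granted later

# Growing the application is a union of the two sets.
grown = graph.apply("union", ["app://initial", "app://added"], "app://grown")
# Shrinking is a difference: release the initial processes again.
shrunk = graph.apply("difference", ["app://grown", "app://initial"], "app://shrunk")
```

Each call to `apply` adds one edge from the input sets to the output set, so the sequence of resource changes can be read back from `graph.edges` as a graph.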

Original language: English
Title: Proceedings of 2022 29th European MPI Users' Group Meeting, EuroMPI/USA 2022
Publisher: Association for Computing Machinery
Pages: 57-67
Number of pages: 11
ISBN (electronic): 9781450397995
Publication status: Published - 14 Sept. 2022
Event: 29th European MPI Users' Group Meeting, EuroMPI/USA 2022 - Chattanooga, United States
Duration: 26 Sept. 2022 to 28 Sept. 2022

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 29th European MPI Users' Group Meeting, EuroMPI/USA 2022
Country/Territory: United States
City: Chattanooga
Period: 26/09/22 to 28/09/22
