Infrastructure and API extensions for elastic execution of MPI applications

Isaías Comprés, Ao Mo-Hellenbrand, Michael Gerndt, Hans Joachim Bungartz

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

33 Scopus citations

Abstract

Dynamic Processes support was added to MPI in version 2.0 of the standard. This feature of MPI has not been widely used by application developers, in part due to the performance cost and limitations of the spawn operation. In this paper, we propose an extension to MPI that consists of four new operations. These operations allow an application to be initialized in an elastic mode of execution and to enter an adaptation window when necessary, in which resources are incorporated into or released from the application's world communicator. A prototype solution based on the MPICH library and the SLURM resource manager is presented and evaluated alongside an elastic scientific application that makes use of the new MPI extensions. The cost of these new operations is shown to be negligible, mainly because of the latency-hiding design, leaving the application's time for data redistribution as the only significant performance cost.
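The abstract names data redistribution as the remaining cost after a resize, but it does not say how the data is laid out. As an illustration only, the sketch below assumes a 1-D block distribution of `n` elements and computes how many elements an old rank would hand to a new rank when the world communicator changes size. The names `block_range`, `block_of`, and `overlap_count` are hypothetical and are not part of the proposed MPI extension, MPICH, or SLURM.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open range [begin, end) of element indices owned by one rank.
   Hypothetical helper type, not taken from the paper. */
typedef struct {
    size_t begin;
    size_t end;
} block_range;

/* Block owned by `rank` when `n` elements are split over `p` ranks.
   The first n % p ranks each receive one extra element. */
static block_range block_of(size_t n, int p, int rank)
{
    assert(p > 0 && rank >= 0 && rank < p);
    size_t base = n / (size_t)p;
    size_t extra = n % (size_t)p;
    size_t r = (size_t)rank;
    block_range b;
    b.begin = r * base + (r < extra ? r : extra);
    b.end = b.begin + base + (r < extra ? 1 : 0);
    return b;
}

/* Number of elements held by rank `src` in the old layout (old_p ranks)
   that belong to rank `dst` in the new layout (new_p ranks).
   When src == dst these elements stay in place; otherwise they
   would have to be transferred during the adaptation. */
static size_t overlap_count(size_t n, int old_p, int new_p, int src, int dst)
{
    block_range a = block_of(n, old_p, src);
    block_range b = block_of(n, new_p, dst);
    size_t lo = a.begin > b.begin ? a.begin : b.begin;
    size_t hi = a.end < b.end ? a.end : b.end;
    return hi > lo ? hi - lo : 0;
}
```

Under this assumed layout, growing from 2 to 4 ranks with 10 elements means old rank 0 keeps 3 elements and passes 2 to new rank 1. A real elastic application would use counts like these to size its send and receive buffers; the actual transfer mechanism is outside the scope of this sketch.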

Original language: English
Title of host publication: Proceedings of the 23rd European MPI Users' Group Meeting, EuroMPI 2016
Publisher: Association for Computing Machinery
Pages: 82-97
Number of pages: 16
ISBN (Electronic): 9781450342346
State: Published - 25 Sep 2016
Event: 23rd European MPI Users' Group Meeting, EuroMPI 2016 - Edinburgh, United Kingdom
Duration: 25 Sep 2016 to 28 Sep 2016

Publication series

Name: ACM International Conference Proceeding Series
Volume: 25-28-September-2016

Conference

Conference: 23rd European MPI Users' Group Meeting, EuroMPI 2016
Country/Territory: United Kingdom
City: Edinburgh
Period: 25/09/16 to 28/09/16

Keywords

  • Elastic computing
  • MPI
  • MPICH
  • Malleable applications
  • Message passing
  • Resource aware computing
  • SLURM
