Non-Blocking GPU-CPU Notifications to Enable More GPU-CPU Parallelism

Bengisu Elis, Olga Pearce, David Boehme, Jason Burmark, Martin Schulz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

GPUs are increasingly common in HPC systems, and a growing number of applications are adopting them. However, control synchronization between GPUs and CPUs is suboptimal: it is only possible at GPU kernel termination points, which serializes host and device tasks. In this paper, we propose a novel CPU-GPU notification method that, in combination with persistent GPU kernels, enables non-blocking in-kernel control synchronization of device and host tasks. Using this notification method, we increase the overlap of CPU and GPU execution and thereby the available parallelism. We present the concept and structure of the proposed notification mechanism together with in-kernel GPU-CPU control synchronization, using halo exchange as an example. We analyze the performance of the halo-exchange pattern under our new notification method, as well as the interference between CPU and GPU operations caused by the execution overlap. Finally, we verify our results using a performance model that covers the halo-exchange pattern with the new notification method.
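
The control flow described in the abstract can be illustrated with a small host-only analogy. The sketch below is not the authors' implementation and uses no GPU API: a `std::thread` stands in for a persistent kernel, and two `std::atomic<int>` flags stand in for the notification channel. All names (`run_overlap`, `halo_ready`, `halo_done`) are hypothetical, and the MPI halo exchange is reduced to a counter increment. It only shows the idea that the worker signals "halo packed" without terminating and continues interior work while the host handles the exchange.

```cpp
#include <atomic>
#include <thread>

struct Result {
    int exchanges;       // halo exchanges completed by the host
    long interior_work;  // interior updates completed by the worker
};

// Hypothetical illustration: a thread plays the role of a persistent kernel.
Result run_overlap(int steps) {
    std::atomic<int> halo_ready{0};  // worker -> host: last step whose halo is packed
    std::atomic<int> halo_done{0};   // host -> worker: last step whose exchange finished
    long interior_work = 0;          // written by the worker, read after join()

    // Stand-in for a persistent kernel: launched once, runs all time steps.
    std::thread worker([&] {
        for (int step = 1; step <= steps; ++step) {
            // 1. Boundary cells would be updated and packed here (omitted).
            halo_ready.store(step, std::memory_order_release);  // notify without exiting
            // 2. Interior update overlaps with the host-side exchange.
            for (int i = 0; i < 1000; ++i) interior_work += 1;
            // 3. Wait for the received halo before starting the next step.
            while (halo_done.load(std::memory_order_acquire) < step)
                std::this_thread::yield();
        }
    });

    int exchanges = 0;
    for (int step = 1; step <= steps; ++step) {
        // Polling a flag instead of waiting for the worker to terminate;
        // the host could do other useful work between polls.
        while (halo_ready.load(std::memory_order_acquire) < step)
            std::this_thread::yield();
        ++exchanges;  // placeholder for the MPI halo exchange
        halo_done.store(step, std::memory_order_release);
    }
    worker.join();
    return {exchanges, interior_work};
}
```

On a real system the flags would have to live in memory visible to both host and device, and the memory-ordering guarantees would differ; the sketch makes no claim about how the paper realizes either.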

Original language: English
Title of host publication: BDSIC2023 - 2023 5th International Conference on Big-data Service and Intelligent Computation
Publisher: Association for Computing Machinery
Pages: 1-11
Number of pages: 11
ISBN (Electronic): 9798400708923
DOIs
State: Published - 20 Oct 2023
Event: 5th International Conference on Big-data Service and Intelligent Computation - Singapore, Singapore
Duration: 20 Oct 2023 to 22 Oct 2023

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 5th International Conference on Big-data Service and Intelligent Computation
Country/Territory: Singapore
City: Singapore
Period: 20/10/23 to 22/10/23

Keywords

  • GPU
  • Halo-exchange
  • MPI
  • Synchronization
