A memory bandwidth-efficient hybrid radix Sort on GPUs

Elias Stehle, Hans Arno Jacobsen

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

53 Scopus citations

Abstract

Sorting is at the core of many database operations, such as index creation, sort-merge joins, and user-requested output sorting. As GPUs are emerging as a promising platform to accelerate various operations, sorting on GPUs becomes a viable endeavour. Over the past few years, several improvements have been proposed for sorting on GPUs, leading to the first radix sort implementations that achieve a sorting rate of over one billion 32-bit keys per second. Yet, state-of-the-art approaches are heavily memory bandwidthbound, as they require substantially more memory transfers than their CPU-based counterparts. Our work proposes a novel approach that almost halves the amount of memory transfers and, therefore, considerably lifts the memory bandwidth limitation. Being able to sort two gigabytes of eightbyte records in as little as 50 milliseconds, our approach achieves a 2:32-fold improvement over the state-of-the-art GPU-based radix sort for uniform distributions, sustaining a minimum speed-up of no less than a factor of 1:66 for skewed distributions. To address inputs that either do not reside on the GPU or exceed the available device memory, we build on our efficient GPU sorting approach with a pipelined heterogeneous sorting algorithm that mitigates the overhead associated with PCIe data transfers. Comparing the end-toend sorting performance to the state-of-the-art CPU-based radix sort running 16 threads, our heterogeneous approach achieves a 2:06-fold and a 1:53-fold improvement for sorting 64 GB key-value pairs with a skewed and a uniform distribution, respectively.

Original languageEnglish
Title of host publicationSIGMOD 2017 - Proceedings of the 2017 ACM International Conference on Management of Data
PublisherAssociation for Computing Machinery
Pages417-432
Number of pages16
ISBN (Electronic)9781450341974
DOIs
StatePublished - 9 May 2017
Event2017 ACM SIGMOD International Conference on Management of Data, SIGMOD 2017 - Chicago, United States
Duration: 14 May 201719 May 2017

Publication series

NameProceedings of the ACM SIGMOD International Conference on Management of Data
VolumePart F127746
ISSN (Print)0730-8078

Conference

Conference2017 ACM SIGMOD International Conference on Management of Data, SIGMOD 2017
Country/TerritoryUnited States
CityChicago
Period14/05/1719/05/17

Fingerprint

Dive into the research topics of 'A memory bandwidth-efficient hybrid radix Sort on GPUs'. Together they form a unique fingerprint.

Cite this