Batched generation of incomplete sparse approximate inverses on GPUs

Hartwig Anzt, Edmond Chow, Thomas Huckle, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

15 Scopus citations

Abstract

Incomplete Sparse Approximate Inverses (ISAI) have recently been shown to be an attractive alternative to exact sparse triangular solves in the context of incomplete factorization preconditioning. In this paper we propose a batched GPU kernel for the efficient generation of ISAI matrices. Using only thread-local memory allows the ISAI matrix to be computed with a very small memory footprint. We demonstrate that this strategy is faster than the existing strategy for generating ISAI matrices, and we use a large number of test matrices to assess the algorithm's efficiency in an iterative solver setting.
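
The abstract does not spell out how an ISAI matrix is built, so the following is only a minimal sketch under stated assumptions. It assumes the usual construction for a lower triangular factor L: the approximate inverse M reuses the sparsity pattern of L, and each row of M is obtained from a small triangular system so that M L matches the identity on that pattern. The function name `isai_lower` and the dense list-of-lists storage are hypothetical choices made for readability; this is a sequential CPU illustration, not the batched GPU kernel proposed in the paper.

```python
def isai_lower(L):
    """Sketch of an incomplete sparse approximate inverse of a lower
    triangular matrix L (dense list of lists, nonzero diagonal).

    Assumption: M takes the sparsity pattern of L, and row i of M is
    chosen so that (M L)[i][j] equals the identity for every j in the
    pattern of row i.
    """
    n = len(L)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Column indices of the nonzeros in row i; the last one is i itself.
        S = [j for j in range(i + 1) if L[i][j] != 0.0]
        k = len(S)
        # Small system: sum_p m[p] * L[S[p]][S[q]] = delta(S[q], i).
        # L restricted to S is lower triangular, so the unknowns can be
        # found by back substitution, starting from the last index.
        m = [0.0] * k
        for q in range(k - 1, -1, -1):
            rhs = 1.0 if S[q] == i else 0.0
            acc = sum(m[p] * L[S[p]][S[q]] for p in range(q + 1, k))
            m[q] = (rhs - acc) / L[S[q]][S[q]]
        for p in range(k):
            M[i][S[p]] = m[p]
    return M


def pattern_residual(M, L):
    """Largest deviation of M L from the identity on the pattern of L."""
    n = len(L)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            if L[i][j] != 0.0:
                entry = sum(M[i][t] * L[t][j] for t in range(n))
                target = 1.0 if i == j else 0.0
                worst = max(worst, abs(entry - target))
    return worst


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 4.0, 0.0],
         [0.0, 3.0, 5.0]]
    M = isai_lower(L)
    print(pattern_residual(M, L))
```

In this sketch each row leads to its own small triangular system that does not depend on any other row, which is the property that makes a batched formulation natural. Entries of M L outside the pattern of L are generally nonzero, so M is only an approximate inverse.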

Original language: English
Title of host publication: Proceedings of ScalA 2016
Subtitle of host publication: 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 49-56
Number of pages: 8
ISBN (Electronic): 9781509052226
State: Published - 30 Jan 2017
Event: 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2016 - Salt Lake City, United States
Duration: 13 Nov 2016 - 18 Nov 2016

Publication series

Name: Proceedings of ScalA 2016: 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - Held in conjunction with SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2016
Country/Territory: United States
City: Salt Lake City
Period: 13/11/16 - 18/11/16

Keywords

  • Batched routines
  • GPU
  • Incomplete Sparse Approximate Inverses
  • Preconditioning
