Scalable algorithmic detection of silent data corruption for high-dimensional PDEs

Alfredo Parra Hinojosa, Hans Joachim Bungartz, Dirk Pflüger

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

Abstract

In this paper we show how to exploit the numerical properties of a well-established extrapolation method, the combination technique, to make it tolerant to silent data corruption (SDC). The term SDC refers to errors in data that the system does not detect. We use the hierarchical structure of the combination technique to detect whether parts of the floating-point data are corrupted. The method we present is based on robust regression and other well-known outlier detection techniques. It is a lossy approach: we sacrifice some accuracy in exchange for a small computational overhead. We test our algorithms on a d-dimensional advection-diffusion equation and inject SDC of different orders of magnitude. We show that our method has a very good detection rate: large errors are always detected, and the small errors that go undetected do not noticeably damage the solution. We also carry out scalability tests for a 5D scenario. Finally, we discuss how to deal with false positives and how to extend these ideas to more general quantities of interest.
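
As a rough illustration of the kind of outlier detection the abstract refers to, the following Python sketch flags values that deviate strongly from the median of a group, using the median absolute deviation (MAD) as a robust scale. This is a generic textbook technique and not the authors' algorithm; the function name `flag_outliers`, the threshold of 3.5, and the sample values are all assumptions chosen for illustration.

```python
import statistics


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    med = statistics.median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = statistics.median(abs_dev)
    if mad == 0.0:
        # Most values agree exactly, so any value that differs is suspicious.
        return [i for i, d in enumerate(abs_dev) if d > 0.0]
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    # for normally distributed data.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up samples: five values that agree closely and one corrupted value.
samples = [1.002, 0.998, 1.001, 0.999, 1.000, 7.3]
suspects = flag_outliers(samples)
```

With these made-up samples, only the last entry is flagged. The median and MAD are used instead of the mean and standard deviation because a single large corrupted value would otherwise inflate the scale estimate and hide itself.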

Original language: English
Title of host publication: Lecture Notes in Computational Science and Engineering
Publisher: Springer Verlag
Pages: 93-115
Number of pages: 23
State: Published - 2018

Publication series

Name: Lecture Notes in Computational Science and Engineering
Volume: 123
ISSN (Print): 1439-7358
