Scalable algorithmic detection of silent data corruption for high-dimensional PDEs

Alfredo Parra Hinojosa, Hans Joachim Bungartz, Dirk Pflüger

Publication: Contribution to book/report/conference proceedings, chapter, peer-reviewed

Abstract

In this paper we show how to benefit from the numerical properties of a well-established extrapolation method—the combination technique—to make it tolerant to silent data corruption (SDC). The term SDC refers to errors in data not detected by the system. We use the hierarchical structure of the combination technique to detect if parts of the floating point data are corrupted. The method we present is based on robust regression and other well-known outlier detection techniques. It is a lossy approach, meaning we sacrifice some accuracy but we benefit from the small computational overhead. We test our algorithms on a d-dimensional advection-diffusion equation and inject SDC of different orders of magnitude. We show that our method has a very good detection rate: large errors are always detected, and the small errors that go undetected do not noticeably damage the solution. We also carry out scalability tests for a 5D scenario. We finally discuss how to deal with false positives and how to extend these ideas to more general quantities of interest.
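The idea of flagging corrupted data with robust statistics can be illustrated with a minimal sketch. This is not the authors' algorithm, which relies on robust regression over the hierarchical structure of the combination technique. It only shows one well-known outlier test, the modified z-score based on the median and the median absolute deviation (MAD). The function name `flag_outliers`, the threshold of 3.5, and the sample values are assumptions chosen for illustration; the values stand in for the same quantity evaluated on several component grids, one of which has been corrupted.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; not taken from the paper.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)
    if mad == 0.0:
        # At least half of the values coincide with the median exactly,
        # so flag every value that deviates from it at all.
        return [i for i, d in enumerate(abs_dev) if d > 0.0]
    # The factor 0.6745 makes the MAD comparable to a standard deviation
    # for normally distributed data.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up values of one quantity on six component grids; index 3 mimics
# a large silent error such as a flipped exponent bit.
component_values = [0.5012, 0.4987, 0.5003, 512.4, 0.4995, 0.5008]
suspect = flag_outliers(component_values)
print(suspect)
```

Because the median and the MAD are barely affected by a single extreme value, the corrupted entry stands out clearly, while the ordinary discretization differences between grids stay below the threshold. A mean and standard deviation would instead be distorted by the very outlier they are meant to detect.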

Original language: English
Title: Lecture Notes in Computational Science and Engineering
Publisher: Springer Verlag
Pages: 93-115
Number of pages: 23
Publication status: Published - 2018

Publication series

Name: Lecture Notes in Computational Science and Engineering
Volume: 123
ISSN (Print): 1439-7358
