ELZAR: Triple modular redundancy using Intel AVX (practical experience report)

Dmitrii Kuvaiskii, Oleksii Oleksenko, Pramod Bhatotia, Pascal Felber, Christof Fetzer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

11 Scopus citations

Abstract

Instruction-Level Redundancy (ILR) is a well-known approach for tolerating transient CPU faults. It replicates instructions in a program and inserts periodic checks to detect and correct CPU faults using majority voting, which essentially requires three copies of each instruction and leads to high performance overheads. As SIMD technology can operate simultaneously on several copies of the data, it appears to be a good candidate for decreasing these overheads. To verify this hypothesis, we propose ELZAR, a compiler framework that transforms unmodified multithreaded applications to support triple modular redundancy using Intel AVX extensions for vectorization. Our experience with several benchmark suites and real-world case studies yields mixed results: while SIMD may be beneficial for some workloads, e.g., CPU-intensive ones with many floating-point operations, it incurs higher overhead than ILR in many applications we tested.
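
The majority-voting idea behind triple modular redundancy can be illustrated with a minimal scalar sketch. This is not ELZAR's implementation, which operates on AVX registers inside a compiler pass; the function names `vote` and `tmr_add` and the `fault_mask` parameter are hypothetical, introduced here only to simulate a transient bit flip in one replica.

```python
def vote(a, b, c):
    """Bitwise majority vote over three replicas of an integer value.

    Each output bit takes the value held by at least two of the inputs.
    """
    return (a & b) | (a & c) | (b & c)


def tmr_add(x, y, fault_mask=0):
    """Compute x + y three times and vote on the result.

    fault_mask is a hypothetical knob: it flips bits in the second
    replica to simulate a transient CPU fault.
    """
    r0 = x + y
    r1 = (x + y) ^ fault_mask  # simulated faulty replica
    r2 = x + y
    fault_detected = not (r0 == r1 == r2)
    return vote(r0, r1, r2), fault_detected


if __name__ == "__main__":
    print(tmr_add(40, 2))                     # fault-free run
    print(tmr_add(40, 2, fault_mask=0b1000))  # one replica corrupted
```

With a single corrupted replica, the two agreeing copies outvote it, so the fault is both detected and masked. The SIMD hypothesis in the abstract amounts to holding the replicas in separate lanes of one vector register rather than in three scalar variables.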

Original language: English
Title of host publication: Proceedings - 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 646-653
Number of pages: 8
ISBN (Electronic): 9781467388917
DOIs
State: Published - 29 Sep 2016
Externally published: Yes
Event: 46th IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2016 - Toulouse, France
Duration: 28 Jun 2016 – 1 Jul 2016

Publication series

Name: Proceedings - 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2016

Conference

Conference: 46th IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2016
Country/Territory: France
City: Toulouse
Period: 28/06/16 – 1/07/16

Keywords

  • Fault Tolerance
  • Hardware faults
  • SIMD
