HAFT: Hardware-assisted fault tolerance

Dmitrii Kuvaiskii, Rasha Faqeh, Pramod Bhatotia, Pascal Felber, Christof Fetzer

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

48 Scopus citations

Abstract

Transient hardware faults during the execution of a program can cause data corruptions. We present HAFT, a fault tolerance technique using hardware extensions of commodity CPUs to protect unmodified multithreaded applications against such corruptions. HAFT utilizes instruction-level redundancy for fault detection and hardware transactional memory for fault recovery. We evaluated HAFT with Phoenix and PARSEC benchmarks. The observed normalized runtime is 2x, with 98.9% of the injected data corruptions being detected and 91.2% being corrected. To demonstrate the effectiveness of HAFT, we applied it to real-world case studies including Memcached, Apache, and SQLite.
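The detect-then-recover idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not HAFT's implementation: the duplicated addition stands in for instruction-level redundancy, a deep-copied snapshot stands in for a hardware transaction, and the names `checked_add`, `run_transaction`, `deposit`, and the `fault` bit mask are all hypothetical.

```python
import copy


class FaultDetected(Exception):
    """Raised when the two redundant copies of a computation disagree."""


def checked_add(a, a_shadow, b, b_shadow, fault=0):
    # Assumption: `fault` is a bit mask modelling a transient bit flip
    # that hits only the master copy of the result.
    master = (a + b) ^ fault
    shadow = a_shadow + b_shadow
    if master != shadow:
        # Detection: the redundant results diverged.
        raise FaultDetected()
    return master


def run_transaction(state, body, max_retries=3):
    """Run `body` on a private copy of `state`; commit only if no fault is detected.

    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        # The snapshot stands in for writes buffered by a hardware transaction.
        snapshot = copy.deepcopy(state)
        try:
            body(snapshot, attempt)
        except FaultDetected:
            # Recovery: drop the snapshot (rollback) and re-execute.
            continue
        # Commit: make the buffered writes visible.
        state.clear()
        state.update(snapshot)
        return attempt
    raise RuntimeError("fault persisted across all retries")


def deposit(state, attempt):
    # Inject a single bit flip on the first attempt only.
    fault = 0b100 if attempt == 0 else 0
    balance = state["balance"]
    state["balance"] = checked_add(balance, balance, 50, 50, fault)


account = {"balance": 100}
retries = run_transaction(account, deposit)
print(retries, account["balance"])
```

In this model the first attempt is discarded because the corrupted master result disagrees with its shadow, and the second attempt commits. A fault that recurs on every attempt exhausts the retry budget and leaves the original state untouched.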

Original language: English
Title of host publication: Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9781450342407
State: Published - 18 Apr 2016
Externally published: Yes
Event: 11th European Conference on Computer Systems, EuroSys 2016 - London, United Kingdom
Duration: 18 Apr 2016 to 21 Apr 2016

Publication series

Name: Proceedings of the 11th European Conference on Computer Systems, EuroSys 2016

Conference

Conference: 11th European Conference on Computer Systems, EuroSys 2016
Country/Territory: United Kingdom
City: London
Period: 18/04/16 to 21/04/16
