Abstract
Many threats that can undermine the reliability of a system can be realized at design time, while others can be realized only during its online operation. As the availability of system monitoring sensors and run-time software increases in heterogeneous platforms, there is a demand for a novel platform-independent framework that can capture and deliver, in a holistic way, system-level self-assessment and adaptation capabilities at run-time. In this paper, two groups from academia and one from industry present the following three contributions. First, system reliability is considered from the perspective of novel timing guardband designs for aging mitigation. Effective timing guardband models are presented from the physical to the system level, targeting multiple wear-out mechanisms. Second, a technique for correlating complex software and micro-architectural events with power integrity loss is presented. The technique uses an embedded voltage noise sensor, a power-network model, and a genetic algorithm to identify workloads that trigger power-network resonances, which can ultimately lead to system failures. Third, the 'PRiME' cross-layer programming framework is presented, which unites available sensors and dynamic voltage and frequency scaling actuators with learning-based run-time process mapping and scheduling algorithms. Scenarios exploring the energy efficiency and reliability of heterogeneous platforms using run-time software derived from the developed framework are also reviewed.
| Original language | English |
|---|---|
| Title of host publication | 2017 IEEE Int. Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2017 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1-5 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781538603628 |
| State | Published - 28 Jun 2017 |
| Externally published | Yes |
| Event | 13th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2017 - Cambridge, United Kingdom; Duration: 23 Oct 2017 → 25 Oct 2017 |
Publication series
| Name | 2017 IEEE Int. Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2017 |
|---|---|
| Volume | 2018-January |
Conference
| Conference | 13th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFT 2017 |
|---|---|
| Country/Territory | United Kingdom |
| City | Cambridge |
| Period | 23/10/17 → 25/10/17 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs):
- SDG 7: Affordable and Clean Energy
Title: Hardware and software innovations in energy-efficient system-reliability monitoring