Temperature-Aware Memory Mapping and Active Cooling of Neural Processing Units

Vahidreza Moghaddas, Hammam Kattan, Tim Bucher, Mikail Yayla, Jian Jia Chen, Hussam Amrouch

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Neural processing units (NPUs) have become indispensable for meeting the high computational demands of deep neural networks (DNNs). They provide a very efficient solution, thanks to having a huge MAC array that enables massive parallelism. Nevertheless, such an architecture exhibits excessive on-chip power densities leading to a localized hot-spot that seriously heats its surroundings. This work demonstrates how the on-chip temperatures induced by the MAC array create a spatial thermal gradient through the on-chip SRAM memory. This makes the memory regions sensitive to different error probabilities (Perror), leading to significant accuracy drops when DNNs are being executed. To surmount this challenge, we employ on-chip superlattice thermoelectric (TEC) cooling devices that effectively reduce the memory temperature. Although scaling the memory voltage makes SRAM cells more sensitive to errors, it significantly decreases the leakage power, which compensates for the power consumed by the incorporated TEC devices. Furthermore, operating the SRAM at a lower voltage and temperature substantially increases its lifetime because voltage and temperature are key stimuli of transistor aging. By running multi-physics simulations using commercial finite-element tools and SPICE simulations for the 14nm FinFET technology, we accurately derive the relation between the Perror in different memory regions and the corresponding cooling cost. We then propose a three-stage temperature-aware layer-wise memory mapping that exploits different degrees of the sensitivity of NN layers to errors towards maximizing the DNN accuracy while minimizing the cooling cost. Experimental results reveal that our method notably improves the DNN accuracy compared to existing temperature-oblivious memory mapping.
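
The idea behind the layer-wise mapping can be illustrated with a toy example. The sketch below is not the three-stage method from the paper, whose details the abstract does not give. It is a simple greedy stand-in that places the most error-sensitive layers in the memory regions with the lowest Perror, subject to capacity. The names `Region`, `Layer`, and `map_layers`, as well as every number used, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    p_error: float  # hypothetical bit-error probability of this SRAM region
    capacity: int   # hypothetical capacity in KiB


@dataclass
class Layer:
    name: str
    sensitivity: float  # hypothetical accuracy sensitivity to memory errors
    size: int           # hypothetical weight footprint in KiB


def map_layers(layers, regions):
    """Greedy placement: most sensitive layer first, most reliable region first."""
    free = {r.name: r.capacity for r in regions}
    by_reliability = sorted(regions, key=lambda r: r.p_error)
    mapping = {}
    for layer in sorted(layers, key=lambda l: l.sensitivity, reverse=True):
        for region in by_reliability:
            if free[region.name] >= layer.size:
                free[region.name] -= layer.size
                mapping[layer.name] = region.name
                break
        else:
            # No region has enough free space left for this layer.
            raise ValueError(f"no region can hold layer {layer.name}")
    return mapping


# Illustrative values only: regions closer to the MAC array are assumed
# hotter and therefore assigned a higher error probability.
regions = [
    Region("near_mac", p_error=1e-3, capacity=64),
    Region("mid", p_error=1e-4, capacity=64),
    Region("far", p_error=1e-5, capacity=64),
]
layers = [
    Layer("conv1", sensitivity=0.9, size=32),
    Layer("conv2", sensitivity=0.5, size=48),
    Layer("fc", sensitivity=0.1, size=60),
]
mapping = map_layers(layers, regions)
print(mapping)
```

With these made-up inputs, the most sensitive layer lands in the coolest region and the least sensitive layer in the region nearest the MAC array. A fuller treatment would also weigh the cooling cost of lowering Perror in each region, which this sketch leaves out.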

Original language: English
Title of host publication: 2023 IEEE/ACM International Symposium on Low Power Electronics and Design, ISLPED 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350311754
State: Published - 2023
Event: 2023 IEEE/ACM International Symposium on Low Power Electronics and Design, ISLPED 2023 - Vienna, Austria
Duration: 7 Aug 2023 → 8 Aug 2023

Publication series

Name: Proceedings of the International Symposium on Low Power Electronics and Design
Volume: 2023-August
ISSN (Print): 1533-4678

Conference

Conference: 2023 IEEE/ACM International Symposium on Low Power Electronics and Design, ISLPED 2023
Country/Territory: Austria
City: Vienna
Period: 7/08/23 → 8/08/23

Keywords

  • Neural processing unit (NPU)
  • On-chip memory
  • Thermal management
  • Thermoelectric cooling (TEC)
