TY - GEN
T1 - Fast cache simulation for host-compiled simulation of embedded software
AU - Lu, Kun
AU - Müller-Gritschneder, Daniel
AU - Schlichtmann, Ulf
PY - 2013
Y1 - 2013
N2 - Host-compiled simulation has been proposed for software performance estimation, because of its high simulation speed. However, the simulation speed may be significantly lowered due to the cache simulation overhead. In this paper, we propose an approach that can reduce much of the cache simulation overhead, while still calculating cache misses precisely. For instruction cache, we statically analyze possible cache conflicts and perform cache conflicts aware annotation for host-compiled simulation. Within loops, the conflicts are dynamically captured by tagging the basic blocks instead of performing the expensive cache simulation. In this way, a vast majority of the cache accesses can be saved from simulation. For data cache, aggregated cache simulation is used for a large data block. Further, the data locality can be bound by considering the data allocation principle of a program. Experiments show that our approach improves the speed of host-compiled simulation by one order of magnitude, while providing the cache miss numbers with high accuracy.
AB - Host-compiled simulation has been proposed for software performance estimation, because of its high simulation speed. However, the simulation speed may be significantly lowered due to the cache simulation overhead. In this paper, we propose an approach that can reduce much of the cache simulation overhead, while still calculating cache misses precisely. For instruction cache, we statically analyze possible cache conflicts and perform cache conflicts aware annotation for host-compiled simulation. Within loops, the conflicts are dynamically captured by tagging the basic blocks instead of performing the expensive cache simulation. In this way, a vast majority of the cache accesses can be saved from simulation. For data cache, aggregated cache simulation is used for a large data block. Further, the data locality can be bound by considering the data allocation principle of a program. Experiments show that our approach improves the speed of host-compiled simulation by one order of magnitude, while providing the cache miss numbers with high accuracy.
UR - https://www.scopus.com/pages/publications/84885600400
U2 - 10.7873/date.2013.139
DO - 10.7873/date.2013.139
M3 - Conference contribution
AN - SCOPUS:84885600400
SN - 9783981537000
T3 - Proceedings -Design, Automation and Test in Europe, DATE
SP - 637
EP - 642
BT - Proceedings - Design, Automation and Test in Europe, DATE 2013
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 16th Design, Automation and Test in Europe Conference and Exhibition, DATE 2013
Y2 - 18 March 2013 through 22 March 2013
ER -