TY - GEN
T1 - GPU Partitioning & Neural Architecture Sizing for Safety-Driven Sensing in Autonomous Systems
AU - Xu, Shengjie
AU - Hobbs, Clara
AU - Song, Yukai
AU - Ghosh, Bineet
AU - Zhu, Tingan
AU - Aktar, Sharmin
AU - Yang, Lei
AU - Sheng, Yi
AU - Jiang, Weiwen
AU - Hu, Jingtong
AU - Duggirala, Parasara Sridhar
AU - Chakraborty, Samarjit
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Neural networks are now routinely used for perception processing in autonomous systems. Often, these neural networks estimate the state of the system, such as the distance and velocity of the car in front, which is then used in downstream control tasks. While significant advances in neural architecture search and sizing have been made towards improving inference quality, they do not take into account the effect of these improvements on the performance of the overall system. In this paper, we examine a setup where multiple neural networks for estimating various state components of the same system share the same graphics processing unit (GPU) - a limited computational resource. We address the problem of optimal resource allocation for each neural network, e.g., how to suitably size these networks, while improving the overall performance - specifically, safety - of the system. In particular, we distinguish between optimizing the performance of individual neural networks and optimizing the system-level performance or safety. Our main technical contribution is a set of techniques for neural architecture sizing with the goal of optimizing overall system safety for a given GPU capacity. Our evaluation on two different benchmarks shows that we can explore the architecture space with 10x to 100x improvements in running time.
KW - GPU partitioning
KW - autonomous system
KW - optimal neural architecture sizing
KW - reachability
KW - sensitivity
KW - uncertainty
UR - https://www.scopus.com/pages/publications/85214015946
U2 - 10.1109/ICAA64256.2024.00018
DO - 10.1109/ICAA64256.2024.00018
M3 - Conference contribution
AN - SCOPUS:85214015946
T3 - Proceedings - 2024 International Conference on Assured Autonomy, ICAA 2024
SP - 67
EP - 76
BT - Proceedings - 2024 International Conference on Assured Autonomy, ICAA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd International Conference on Assured Autonomy, ICAA 2024
Y2 - 10 October 2024 through 11 October 2024
ER -