TY - JOUR
T1 - Inter-Cell Network Slicing With Transfer Learning Empowered Multi-Agent Deep Reinforcement Learning
AU - Hu, Tianlun
AU - Liao, Qi
AU - Liu, Qiang
AU - Carle, Georg
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Network slicing enables operators to cost-efficiently support diverse applications on a common physical infrastructure. The ever-increasing densification of network deployment leads to complex and non-trivial inter-cell interference, which inaccurate analytic models cannot capture well enough to dynamically optimize resource management for network slices. In this paper, we develop a DIRP algorithm with multiple deep reinforcement learning (DRL) agents that cooperatively optimize the resource partitioning in individual cells to fulfill the requirements of each slice, based on two alternative reward functions with max-min fairness and logarithmic utility. Nevertheless, existing DRL approaches usually tie the pretrained model parameters to specific network environments with poor transferability, which raises practical deployment concerns in large-scale mobile networks. Hence, we design a novel transfer learning-aided DIRP (TL-DIRP) algorithm to ease the transfer of DIRP agents across different network environments in terms of sample efficiency, model reproducibility, and algorithm scalability. TL-DIRP consists of two steps: 1) centralized training of a generalized model (the 'generalist'), and 2) transferring the 'generalist' to each local agent (the 'specialist') with distributed fine-tuning and execution. We comprehensively investigate different types of transferable knowledge: model transfer, instance transfer, and combined model and instance transfer. We evaluate the proposed algorithms in a system-level network simulator with 12 cells. The numerical results show that not only does DIRP outperform existing baseline approaches in terms of faster convergence and higher reward, but, more importantly, TL-DIRP significantly improves the service performance, with reduced exploration cost, accelerated convergence, and enhanced model reproducibility. Compared to a traffic-aware baseline, TL-DIRP achieves about a 15% lower quality-of-service (QoS) violation ratio for the worst slice and an 8.8% lower violation ratio for the average service QoS.
AB - Network slicing enables operators to cost-efficiently support diverse applications on a common physical infrastructure. The ever-increasing densification of network deployment leads to complex and non-trivial inter-cell interference, which inaccurate analytic models cannot capture well enough to dynamically optimize resource management for network slices. In this paper, we develop a DIRP algorithm with multiple deep reinforcement learning (DRL) agents that cooperatively optimize the resource partitioning in individual cells to fulfill the requirements of each slice, based on two alternative reward functions with max-min fairness and logarithmic utility. Nevertheless, existing DRL approaches usually tie the pretrained model parameters to specific network environments with poor transferability, which raises practical deployment concerns in large-scale mobile networks. Hence, we design a novel transfer learning-aided DIRP (TL-DIRP) algorithm to ease the transfer of DIRP agents across different network environments in terms of sample efficiency, model reproducibility, and algorithm scalability. TL-DIRP consists of two steps: 1) centralized training of a generalized model (the 'generalist'), and 2) transferring the 'generalist' to each local agent (the 'specialist') with distributed fine-tuning and execution. We comprehensively investigate different types of transferable knowledge: model transfer, instance transfer, and combined model and instance transfer. We evaluate the proposed algorithms in a system-level network simulator with 12 cells. The numerical results show that not only does DIRP outperform existing baseline approaches in terms of faster convergence and higher reward, but, more importantly, TL-DIRP significantly improves the service performance, with reduced exploration cost, accelerated convergence, and enhanced model reproducibility. Compared to a traffic-aware baseline, TL-DIRP achieves about a 15% lower quality-of-service (QoS) violation ratio for the worst slice and an 8.8% lower violation ratio for the average service QoS.
KW - Transfer learning
KW - deep reinforcement learning
KW - multi-agent coordination
KW - network slicing
KW - resource allocation
UR - http://www.scopus.com/inward/record.url?scp=85159791222&partnerID=8YFLogxK
U2 - 10.1109/OJCOMS.2023.3273310
DO - 10.1109/OJCOMS.2023.3273310
M3 - Article
AN - SCOPUS:85159791222
SN - 2644-125X
VL - 4
SP - 1141
EP - 1155
JO - IEEE Open Journal of the Communications Society
JF - IEEE Open Journal of the Communications Society
ER -