TY - GEN
T1 - PIPO
T2 - 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, MFI 2022
AU - Zhang, Ruiqi
AU - Chen, Guang
AU - Hou, Jing
AU - Li, Zhijun
AU - Knoll, Alois
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Ensuring safe and effective navigation for large-scale multi-agent systems (MAS) in complicated scenarios is challenging. As the number of agents grows, most existing centralized methods become impractical because they lack scalability, while popular decentralized approaches are hampered by high latency and computing requirements. We propose PIPO, a novel policy optimization algorithm for decentralized MAS navigation with permutation-invariant constraints. To guide navigation and avoid unnecessary exploration in the early episodes, we first define a guide policy. We then introduce the permutation-invariant property in decentralized multi-agent systems and use a graph convolutional network to produce the same output under shuffled observations. Because training and execution are both decentralized, the approach scales to an arbitrary number of agents and can be used in large-scale systems. Extensive experiments show that PIPO significantly outperforms multi-agent reinforcement learning baselines and other leading methods in a variety of scenarios.
UR - http://www.scopus.com/inward/record.url?scp=85140995299&partnerID=8YFLogxK
U2 - 10.1109/MFI55806.2022.9913862
DO - 10.1109/MFI55806.2022.9913862
M3 - Conference contribution
AN - SCOPUS:85140995299
T3 - IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems
BT - 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, MFI 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 September 2022 through 22 September 2022
ER -