PIPO: Policy Optimization with Permutation-Invariant Constraint for Distributed Multi-Robot Navigation

Ruiqi Zhang, Guang Chen, Jing Hou, Zhijun Li, Alois Knoll

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

For large-scale multi-agent systems (MAS), ensuring safe and effective navigation in complicated scenarios is a challenging task. As the number of agents increases, most existing centralized methods become impractical because they do not scale, while popular decentralized approaches are hampered by high latency and computing requirements. In this research, we propose PIPO, a novel policy optimization algorithm for decentralized MAS navigation with permutation-invariant constraints. To guide navigation and avoid unnecessary exploration in the early episodes, we first define a guide policy. Then, we introduce the permutation-invariant property in decentralized multi-agent systems and leverage a graph convolution network to produce the same output under shuffled observations. Because training and execution are both decentralized, our approach scales easily to an arbitrary number of agents and can be used in large-scale systems. We also provide extensive experiments demonstrating that PIPO significantly outperforms multi-agent reinforcement learning baselines and other leading methods in various scenarios.
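The permutation-invariant property mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' network: it assumes a single shared linear map with ReLU applied to every neighbor observation, followed by mean pooling, which is one simple way to obtain an output that does not depend on neighbor ordering. The function names `embed` and `aggregate`, the observation layout (dx, dy, speed), and the random weights are all hypothetical.

```python
import math
import random

def embed(obs, weights):
    """Apply the same (shared) linear map plus ReLU to one neighbor observation."""
    return [max(0.0, math.fsum(w * x for w, x in zip(row, obs))) for row in weights]

def aggregate(neighbor_obs, weights):
    """Mean-pool the shared embeddings; the result ignores neighbor ordering."""
    embedded = [embed(obs, weights) for obs in neighbor_obs]
    n = len(embedded)
    return [math.fsum(col) / n for col in zip(*embedded)]

# Hypothetical data: 4 neighbors, each observed as (dx, dy, speed).
rng = random.Random(0)
neighbors = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
# Hypothetical shared weights mapping a 3-d observation to a 2-d feature.
weights = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

# Shuffle the neighbor order and aggregate again.
shuffled = neighbors[:]
rng.shuffle(shuffled)

out_a = aggregate(neighbors, weights)
out_b = aggregate(shuffled, weights)

# The pooled feature is the same for both orderings.
assert all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(out_a, out_b))
```

Because every neighbor passes through the same weights and the pooling step is symmetric, the sketch also accepts any number of neighbors without changing the output size, which is the kind of property that lets a decentralized policy scale with the agent count.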

Original language: English
Title of host publication: 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, MFI 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665460262
State: Published - 2022
Event: 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, MFI 2022 - Bedford, United Kingdom
Duration: 20 Sep 2022 to 22 Sep 2022

Publication series

Name: IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems
Volume: 2022-September

Conference

Conference: 2022 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems, MFI 2022
Country/Territory: United Kingdom
City: Bedford
Period: 20/09/22 to 22/09/22
