TY - GEN
T1 - Model-Aided Federated Reinforcement Learning for Multi-UAV Trajectory Planning in IoT Networks
AU - Chen, Jichao
AU - Esrafilian, Omid
AU - Bayerlein, Harald
AU - Gesbert, David
AU - Caccamo, Marco
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deploying teams of unmanned aerial vehicles (UAVs) to harvest data from distributed Internet of Things (IoT) devices requires efficient trajectory planning and coordination algorithms. Multi-agent reinforcement learning (MARL) has emerged as a solution, but requires extensive and costly real-world training data. To tackle this challenge, we propose a novel model-aided federated MARL algorithm to coordinate multiple UAVs on a data harvesting mission with only limited knowledge about the environment. The proposed algorithm alternates between building an environment simulation model from real-world measurements, specifically learning the radio channel characteristics and estimating unknown IoT device positions, and federated QMIX training in the simulated environment. Each UAV agent trains a local QMIX model in its simulated environment and continuously consolidates it through federated learning with other agents, accelerating the learning process. A performance comparison with standard MARL algorithms demonstrates that our proposed model-aided FedQMIX algorithm reduces the need for real-world training experiences by around three orders of magnitude while attaining similar data collection performance.
AB - Deploying teams of unmanned aerial vehicles (UAVs) to harvest data from distributed Internet of Things (IoT) devices requires efficient trajectory planning and coordination algorithms. Multi-agent reinforcement learning (MARL) has emerged as a solution, but requires extensive and costly real-world training data. To tackle this challenge, we propose a novel model-aided federated MARL algorithm to coordinate multiple UAVs on a data harvesting mission with only limited knowledge about the environment. The proposed algorithm alternates between building an environment simulation model from real-world measurements, specifically learning the radio channel characteristics and estimating unknown IoT device positions, and federated QMIX training in the simulated environment. Each UAV agent trains a local QMIX model in its simulated environment and continuously consolidates it through federated learning with other agents, accelerating the learning process. A performance comparison with standard MARL algorithms demonstrates that our proposed model-aided FedQMIX algorithm reduces the need for real-world training experiences by around three orders of magnitude while attaining similar data collection performance.
UR - http://www.scopus.com/inward/record.url?scp=85190276583&partnerID=8YFLogxK
U2 - 10.1109/GCWkshps58843.2023.10465088
DO - 10.1109/GCWkshps58843.2023.10465088
M3 - Conference contribution
AN - SCOPUS:85190276583
T3 - 2023 IEEE Globecom Workshops, GC Wkshps 2023
SP - 818
EP - 823
BT - 2023 IEEE Globecom Workshops, GC Wkshps 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Globecom Workshops, GC Wkshps 2023
Y2 - 4 December 2023 through 8 December 2023
ER -