Abstract
We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility-on-demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. This factorizes the operator's otherwise intractable action space while still yielding a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
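The coordination step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `best_matching`, the score matrix, and the convention that a non-positive total means rejection are all assumptions made for illustration, and the matching is solved by brute-force enumeration, which is only practical for tiny instances. The scores stand in for whatever per-agent values a trained multi-agent critic would produce.

```python
from itertools import permutations


def best_matching(scores):
    """Pick a maximum-weight vehicle-request matching with optional rejection.

    scores[v][r] is a hypothetical per-agent score for vehicle v serving
    request r (an assumed stand-in for learned per-agent values).
    Returns (assignment, total), where assignment maps request -> vehicle;
    requests missing from the mapping are rejected.
    """
    n_vehicles = len(scores)
    n_requests = len(scores[0]) if scores else 0
    # Each request takes either a distinct vehicle or a "reject" slot (None).
    slots = list(range(n_vehicles)) + [None] * n_requests
    best_total, best_assignment = 0.0, {}
    for choice in permutations(slots, n_requests):
        total = sum(scores[v][r] for r, v in enumerate(choice) if v is not None)
        if total > best_total:
            best_total = total
            best_assignment = {r: v for r, v in enumerate(choice) if v is not None}
    return best_assignment, best_total


# Two vehicles, three requests (illustrative numbers only).
scores = [
    [5.0, 1.0, -2.0],  # vehicle 0
    [4.0, 3.0, -1.0],  # vehicle 1
]
assignment, total = best_matching(scores)
print(assignment, total)
```

In this toy instance both vehicles individually prefer request 0, so independent per-agent choices would conflict; the matching step resolves the conflict globally and rejects the request that no vehicle can serve at a positive score.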
| Original language | English |
|---|---|
| Pages (from - to) | 1284-1296 |
| Number of pages | 13 |
| Journal | Proceedings of Machine Learning Research |
| Volume | 211 |
| Publication status | Published - 2023 |
| Event | 5th Annual Conference on Learning for Dynamics and Control, L4DC 2023 - Philadelphia, United States. Duration: 15 June 2023 → 16 June 2023 |
| Title | Hybrid Multi-agent Deep Reinforcement Learning for Autonomous Mobility on Demand Systems |