TY - GEN
T1 - Reinforcement Learning for Optimizing Routing in the Production Supply of Matrix Production Systems
AU - Ried, Florian
AU - Niederdränk, Simon
AU - Fottner, Johannes
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
PY - 2025
Y1 - 2025
N2 - Matrix production systems offer the flexibility to meet an increasingly individualized and volatile customer demand. However, production supply processes within these systems have rarely been investigated in detail despite playing an integral role in their performance. To help close this research gap, this work utilizes reinforcement learning for routing in the production supply of matrix production systems. In particular, it focuses on dispatching orders to the vehicles and scheduling the orders within a route. Various constraints are considered to simulate a realistic setting, including order time windows, vehicle battery limitations, and a vehicle capacity that allows multiple items to be transported at once. A reinforcement learning framework is conceptualized and implemented, assigning orders to vehicles based on various route construction heuristics. Its observation space contains abstract information about current orders of the matrix production supply environment and specific data on the vehicles for the reinforcement learning agent to select both a vehicle and a heuristic. The action and observation spaces are complemented by a multi-criteria reward function, prompting the agent to learn not to violate any constraints of the environment while simultaneously choosing actions that lead to the most cost-effective routes after route optimization. The reinforcement learning route constructor approach is trained and deployed on a discrete-event simulation of a matrix production system, which is connected to the reinforcement learning framework via a socket interface. The approach proves successful, outperforming two non-reinforcement-learning heuristics for route construction.
KW - Dynamic pickup-and-delivery problem
KW - Matrix production systems
KW - Reinforcement learning
KW - Routing
UR - http://www.scopus.com/inward/record.url?scp=85219168204&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-80760-2_17
DO - 10.1007/978-3-031-80760-2_17
M3 - Conference contribution
AN - SCOPUS:85219168204
SN - 9783031807596
T3 - Communications in Computer and Information Science
SP - 270
EP - 281
BT - Innovative Intelligent Industrial Production and Logistics - 5th International Conference, IN4PL 2024, Proceedings
A2 - Dassisti, Michele
A2 - Madani, Kurosh
A2 - Panetto, Hervé
PB - Springer Science and Business Media Deutschland GmbH
T2 - 5th International Conference on Innovative Intelligent Industrial Production and Logistics, IN4PL 2024
Y2 - 21 November 2024 through 22 November 2024
ER -