Deep Reinforcement Learning-based Multi-Objective Scheduling for Distributed Heterogeneous Hybrid Flow Shops with Blocking Constraints

Xueyan Sun, Weiming Shen, Jiaxin Fan, Birgit Vogel-Heuser, Fandi Bi, Chunjiang Zhang

Research output: Contribution to journalArticlepeer-review

Abstract

This paper investigates a distributed heterogeneous hybrid blocking flow-shop scheduling problem (DHHBFSP) designed to minimize the total tardiness and total energy consumption simultaneously, and proposes an improved proximal policy optimization (IPPO) method to make real-time decisions for the DHHBFSP. A multi-objective Markov decision process is modeled for the DHHBFSP, where the reward function is represented by a vector with dynamic weights instead of the common objective-related scalar value. A factory agent (FA) is formulated for each factory to select unscheduled jobs and is trained by the proposed IPPO to improve the decision quality. Multiple FAs work asynchronously to allocate jobs that arrive randomly at the shop. A two-stage training strategy is introduced in the IPPO, which learns from both single- and dual-policy data for better data utilization. The proposed IPPO is tested on randomly generated instances and compared with variants of the basic proximal policy optimization (PPO), dispatch rules, multi-objective metaheuristics, and multi-agent reinforcement learning methods. Extensive experimental results suggest that the proposed strategies offer significant improvements to the basic PPO, and the proposed IPPO outperforms the state-of-the-art scheduling methods in both convergence and solution quality.

Original languageEnglish
Pages (from-to)278-291
Number of pages14
JournalEngineering
Volume46
DOIs
StatePublished - Mar 2025

Keywords

  • Blocking constraints
  • Distributed hybrid flow-shop scheduling
  • Multi-agent deep reinforcement learning
  • Multi-objective Markov decision process
  • Proximal policy optimization

Fingerprint

Dive into the research topics of 'Deep Reinforcement Learning-based Multi-Objective Scheduling for Distributed Heterogeneous Hybrid Flow Shops with Blocking Constraints'. Together they form a unique fingerprint.

Cite this