Efficiently learning a distributed control policy in cyber-physical production systems via simulation optimization

Minjie Zou, Edward Huang, Birgit Vogel-Heuser, Chun Hung Chen

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citation

Abstract

The manufacturing industry is becoming more dynamic than ever. The limitations imposed by non-deterministic network delays and real-time requirements call for decentralized control. For such dynamic and complex systems, learning methods stand out as a transformational technology for achieving a more flexible control solution. Using simulation for learning enables the description of highly dynamic systems and provides samples without occupying a real facility. However, simulation-based learning can require prohibitively expensive computation. In this paper, we argue that simulation optimization is a powerful tool that can be applied to various simulation-based learning processes to great effect. We propose an efficient policy learning framework, ROSA (Reinforcement-learning enhanced by Optimal Simulation Allocation), with an unprecedented integration of learning, control, and simulation optimization techniques, which can drastically improve the efficiency of policy learning in a cyber-physical system. A proof of concept is implemented on a conveyor-switch network, demonstrating how ROSA can be applied for efficient policy learning, with an emphasis on the industrial distributed control system.

Original language: English
Article number: 9249228
Pages (from-to): 645-651
Number of pages: 7
Journal: IEEE International Conference on Automation Science and Engineering
Volume: 2020-January
State: Published - 2020
Event: 16th IEEE International Conference on Automation Science and Engineering, CASE 2020 - Hong Kong, Hong Kong
Duration: 20 Aug 2020 → 21 Aug 2020

Keywords

  • Cyber-physical system
  • Distributed control
  • Multi-agent
  • Reinforcement learning
  • Simulation optimization
