Learning throttle valve control using policy search

Bastian Bischoff, Duy Nguyen-Tuong, Torsten Koller, Heiner Markert, Alois Knoll

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

18 Scopus citations

Abstract

The throttle valve is a technical device used for regulating a fluid or a gas flow. Throttle valve control is a challenging task because of the valve's complex dynamics and the demanding constraints placed on the controller. State-of-the-art throttle valve control, such as model-free PID control, requires time-consuming manual tuning of the controller. In this paper, we investigate how reinforcement learning (RL) can help to alleviate the effort of manual controller design by automatically learning a control policy from experience. To obtain a valid control policy for the throttle valve, several constraints need to be addressed, such as avoiding overshoot. Furthermore, the learned controller must be able to follow given desired trajectories while moving the valve from any start position to any goal position; thus, multi-target policy learning needs to be considered for RL. In this study, we employ a policy search RL approach, Pilco [2], to learn a throttle valve control policy. We adapt the Pilco algorithm to take into account the practical requirements and constraints for the controller. For evaluation, we employ the resulting algorithm to solve several control tasks in simulation, as well as on a physical throttle valve system. The results show that policy search RL is able to learn a consistent control policy for complex, real-world systems.
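To make the no-overshoot and multi-target requirements concrete, the following is a minimal sketch of how such constraints could be expressed in a cost function. It is not the method from the paper. It assumes a saturating cost of the kind commonly used with Pilco, and the asymmetric penalty, the function names, and the parameters `width` and `overshoot_weight` are hypothetical choices made for illustration only.

```python
import math


def saturating_cost(position, target, width=0.1):
    """Saturating cost: 0 at the target, approaching 1 far from it."""
    d = position - target
    return 1.0 - math.exp(-0.5 * (d / width) ** 2)


def no_overshoot_cost(position, start, target, width=0.1, overshoot_weight=5.0):
    """Hypothetical asymmetric cost that penalizes overshoot more strongly.

    The target is an argument rather than a constant, so the same cost
    can be evaluated for any start/goal pair (multi-target setting).
    """
    # Direction of travel: +1 when opening towards the target, -1 when closing.
    direction = 1.0 if target >= start else -1.0
    # Positive signed error means the valve has moved past the target.
    signed_error = direction * (position - target)
    if signed_error > 0:
        # A narrower width makes the cost rise faster beyond the target.
        return saturating_cost(position, target, width / overshoot_weight)
    return saturating_cost(position, target, width)
```

With these assumed defaults, a position slightly beyond the target receives a higher cost than a position the same distance short of it, which is one simple way to discourage overshoot during policy optimization.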

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2013, Proceedings
Pages: 49-64
Number of pages: 16
Edition: PART 1
State: Published - 2013
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2013 - Prague, Czech Republic
Duration: 23 Sep 2013 to 27 Sep 2013

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 8188 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2013
Country/Territory: Czech Republic
City: Prague
Period: 23/09/13 to 27/09/13
