TY - GEN
T1 - Recurrent Soft Actor Critic Reinforcement Learning for Demand Response Problems
AU - Ludolfinger, Ulrich
AU - Zinsmeister, Daniel
AU - Perić, Vedran S.
AU - Hamacher, Thomas
AU - Hauke, Sascha
AU - Martens, Maren
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Demand response problems are typically solved with rule-based or model predictive control solutions. While rule-based controls do not take the future development of uncertain variables into account, model predictive control solutions often require very accurate predictions. Deep reinforcement learning methods, which incorporate prediction inaccuracies into their decision making, have become popular for solving demand response problems. For their implementation, the current literature defines the demand response problem as a fully observable Markov decision process. However, the assumption of full observability is usually not satisfied in reality. An alternative idea is to describe the problem as partially observable and to use recurrence in the policy function. In this paper, we adapt this idea and propose a novel deep reinforcement learning control solution for demand response problems, based on the soft actor-critic framework. Controlling a heat pump with a mixture of discrete and continuous action capabilities, we show that a significant performance improvement can be achieved by using recurrence in the policy compared to a non-recurrent policy function.
AB - Demand response problems are typically solved with rule-based or model predictive control solutions. While rule-based controls do not take the future development of uncertain variables into account, model predictive control solutions often require very accurate predictions. Deep reinforcement learning methods, which incorporate prediction inaccuracies into their decision making, have become popular for solving demand response problems. For their implementation, the current literature defines the demand response problem as a fully observable Markov decision process. However, the assumption of full observability is usually not satisfied in reality. An alternative idea is to describe the problem as partially observable and to use recurrence in the policy function. In this paper, we adapt this idea and propose a novel deep reinforcement learning control solution for demand response problems, based on the soft actor-critic framework. Controlling a heat pump with a mixture of discrete and continuous action capabilities, we show that a significant performance improvement can be achieved by using recurrence in the policy compared to a non-recurrent policy function.
KW - Demand Response
KW - Home Energy Management
KW - Machine Learning
KW - Partially Observable Markov Decision Process
KW - Recurrent Soft Actor Critic
KW - Reinforcement Learning
UR - http://www.scopus.com/inward/record.url?scp=85169434383&partnerID=8YFLogxK
U2 - 10.1109/PowerTech55446.2023.10202844
DO - 10.1109/PowerTech55446.2023.10202844
M3 - Conference contribution
AN - SCOPUS:85169434383
T3 - 2023 IEEE Belgrade PowerTech, PowerTech 2023
BT - 2023 IEEE Belgrade PowerTech, PowerTech 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2023 IEEE Belgrade PowerTech, PowerTech 2023
Y2 - 25 June 2023 through 29 June 2023
ER -