TY - GEN
T1 - Toward Understanding State Representation Learning in MuZero
T2 - 62nd IEEE Conference on Decision and Control, CDC 2023
AU - Tian, Yi
AU - Zhang, Kaiqing
AU - Tedrake, Russ
AU - Sra, Suvrit
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - We study the problem of representation learning for control from partial and potentially high-dimensional observations. We approach this problem via direct latent model learning, where one directly learns a dynamical model in some latent state space by predicting costs. In particular, we establish finite-sample guarantees of finding a near-optimal representation function and a near-optimal controller using the directly learned latent model for infinite-horizon time-invariant Linear Quadratic Gaussian (LQG) control. A part of our approach to latent model learning closely resembles MuZero, a recent breakthrough in empirical reinforcement learning, in that it learns latent dynamics implicitly by predicting cumulative costs. A key technical contribution of this work is to prove persistency of excitation for a new stochastic process that arises from the analysis of quadratic regression in our approach.
UR - http://www.scopus.com/inward/record.url?scp=85184809803&partnerID=8YFLogxK
U2 - 10.1109/CDC49753.2023.10383754
DO - 10.1109/CDC49753.2023.10383754
M3 - Conference contribution
AN - SCOPUS:85184809803
T3 - Proceedings of the IEEE Conference on Decision and Control
SP - 6166
EP - 6171
BT - 2023 62nd IEEE Conference on Decision and Control, CDC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 13 December 2023 through 15 December 2023
ER -