TY - GEN
T1 - Neural network position and orientation control of an inverted pendulum on wheels
AU - Dengler, Christian
AU - Lohmann, Boris
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/12
Y1 - 2019/12
N2 - In this contribution, we develop a feedback controller for a wheeled inverted pendulum in the form of a neural network that not only stabilizes the unstable system, but also allows the wheeled robot to drive to arbitrary positions within a certain radius and assume a desired orientation, without the need to compute a feasible trajectory to the desired position online. While techniques from the reinforcement learning community, e.g. policy gradient methods, can be used to optimize the parameters of a general feedback controller, the method used in this work is an approach related to imitation learning, or learning from demonstration. The demonstration data, however, does not come from a human demonstrator, but is a set of precomputed optimal trajectories. The neural network is trained to imitate the behavior of these optimal trajectories. We show that a good choice of initial states and a large number of training targets can be used to alleviate a known problem of imitation learning, namely deviation from the training trajectories, and we demonstrate results in simulation as well as on the physical system.
AB - In this contribution, we develop a feedback controller for a wheeled inverted pendulum in the form of a neural network that not only stabilizes the unstable system, but also allows the wheeled robot to drive to arbitrary positions within a certain radius and assume a desired orientation, without the need to compute a feasible trajectory to the desired position online. While techniques from the reinforcement learning community, e.g. policy gradient methods, can be used to optimize the parameters of a general feedback controller, the method used in this work is an approach related to imitation learning, or learning from demonstration. The demonstration data, however, does not come from a human demonstrator, but is a set of precomputed optimal trajectories. The neural network is trained to imitate the behavior of these optimal trajectories. We show that a good choice of initial states and a large number of training targets can be used to alleviate a known problem of imitation learning, namely deviation from the training trajectories, and we demonstrate results in simulation as well as on the physical system.
UR - http://www.scopus.com/inward/record.url?scp=85084284651&partnerID=8YFLogxK
U2 - 10.1109/ICAR46387.2019.8981659
DO - 10.1109/ICAR46387.2019.8981659
M3 - Conference contribution
AN - SCOPUS:85084284651
T3 - 2019 19th International Conference on Advanced Robotics, ICAR 2019
SP - 350
EP - 355
BT - 2019 19th International Conference on Advanced Robotics, ICAR 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th International Conference on Advanced Robotics, ICAR 2019
Y2 - 2 December 2019 through 6 December 2019
ER -