Abstract: In this paper, an off-policy reinforcement learning algorithm is designed to solve the continuous-time linear quadratic regulator (LQR) problem using only input-state data measured from the ...
From the blog of William Goloboy at The Times of Israel ...