A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
Authors: Richard S. Sutton, Hamid R. Maei and Csaba Szepesvári
Conference: Advances in Neural Information Processing Systems 21