A Convergent $O(n)$ Temporal-difference Algorithm for Off-policy Learning with Linear Function Approximation
Authors: Richard S. Sutton, Hamid R. Maei and Csaba Szepesvári
Conference: Advances in Neural Information Processing Systems 21
Year: 2008
Pages: 1609--1616
