Search Machine Learning Repository: Scaling Up Approximate Value Iteration with Options: Better Policies with Fewer Iterations
Authors: Timothy Mann and Shie Mannor
Conference: Proceedings of the 31st International Conference on Machine Learning (ICML-14)
Year: 2014
Pages: 127-135
Abstract: We show how options, a class of control structures encompassing primitive and temporally extended actions, can play a valuable role in planning in MDPs with continuous state-spaces. Analyzing the convergence rate of Approximate Value Iteration with options reveals that for pessimistic initial value function estimates, options can speed up convergence compared to planning with only primitive actions even when the temporally extended actions are suboptimal and sparsely scattered throughout the state-space. Our experimental results in an optimal replacement task and a complex inventory management task demonstrate the potential for options to speed up convergence in practice. We show that options induce faster convergence to the optimal value function, which implies deriving better policies with fewer iterations.
[pdf] [BibTeX]

authors venues years
Suggest Changes to this paper.
Brought to you by the WUSTL Machine Learning Group. We have open faculty positions (tenured and tenure-track).