Steffen Grünewälder, Klaus Obermayer, Sepp Hochreiter,
"Optimality of LSTD and its Relation to MC"
In: Proceedings of the International Joint Conference on Neural Networks, IJCNN 2007, Orlando, Florida, August 2007, pp. 338-343
Original title:
Optimality of LSTD and its Relation to MC
Language of title:
English
Original book title:
Proceedings of the International Joint Conference on Neural Networks, IJCNN
Original abstract:
In this analytical study we compare the risk of
the Monte Carlo (MC) and the least-squares TD (LSTD)
estimator. We prove that for the case of acyclic Markov Reward
Processes (MRPs) LSTD has minimal risk for any convex loss
function in the class of unbiased estimators. When comparing
the Monte Carlo estimator, which does not assume a Markov
structure, and LSTD, we find that the Monte Carlo estimator is
equivalent to LSTD if both estimators have the same amount of
information. Theoretical results are supported by an empirical
evaluation of the estimators.
Language of abstract:
English
Year of publication:
2007
Citation note:
Proceedings of the International Joint Conference on Neural Networks, IJCNN 2007, Orlando, Florida, August 2007, pp. 338-343