dc.contributor.author: Patrascu, Relu-Eugen [en]
dc.date.accessioned: 2006-08-22 14:21:09 (GMT)
dc.date.available: 2006-08-22 14:21:09 (GMT)
dc.date.issued: 2004 [en]
dc.date.submitted: 2004 [en]
dc.identifier.uri: http://hdl.handle.net/10012/1171
dc.description.abstract: A Markov Decision Process (MDP) is a model for problems in which a decision must be made at each of several stages while receiving feedback from the environment. This type of model has been studied extensively in the operations research community, and fundamental algorithms exist for solving the associated problems. However, these algorithms are quite inefficient for very large problems, creating a need for alternatives; since MDP problems are provably hard on compressed representations, one settles for algorithms that perform well at least on specific classes of problems. The class of problems treated in this thesis admits succinct representations: the MDP as a dynamic Bayes network, and its solution as a weighted combination of basis functions. We develop novel algorithms for producing, improving, and calculating the error of approximate solutions for MDPs using a compressed representation. Specifically, we develop an efficient branch-and-bound algorithm for computing the Bellman error of a compact approximate solution, regardless of its provenance. We introduce an efficient direct linear programming algorithm which, using incremental constraint generation, achieves run times significantly smaller than those of existing approximate algorithms without much loss of accuracy. We also present a novel direct linear programming algorithm which, instead of employing constraint generation, transforms the exponentially many constraints into a compact form more amenable to tractable solution. In spite of its perceived importance, the efficient optimization of the Bellman error towards an approximate MDP solution has eluded current algorithms; to this end we propose a novel branch-and-bound approximate policy iteration algorithm which makes direct use of our branch-and-bound method for computing the Bellman error. We further investigate a procedure for obtaining an approximate solution based on the dual of the direct approximate linear programming formulation for solving MDPs. Finally, to address both the loss of accuracy incurred by the direct approximate linear program solution and the question of where basis functions come from, we develop a principled system that not only produces the initial set of basis functions but also augments it with automatically generated basis functions, so that the approximation error decreases according to the user's requirements and time limitations. [en]
dc.format: application/pdf [en]
dc.format.extent: 1048298 bytes
dc.format.mimetype: application/pdf
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.rights: Copyright: 2004, Patrascu, Relu-Eugen. All rights reserved. [en]
dc.subject: Computer Science [en]
dc.subject: mdp [en]
dc.subject: linear approximation [en]
dc.subject: basis functions [en]
dc.title: Linear Approximations For Factored Markov Decision Processes [en]
dc.type: Doctoral Thesis [en]
dc.pending: false [en]
uws-etd.degree.department: School of Computer Science [en]
uws-etd.degree: Doctor of Philosophy [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
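
For readers of the record, the LaTeX sketch below spells out the standard setup the abstract refers to: the value function as a weighted combination of basis functions, the Bellman error, and the direct (approximate) linear program whose exponentially many constraints motivate the thesis's constraint generation and compact reformulation. This is a minimal sketch of the usual formulation from the approximate linear programming literature; the notation (h_i, w_i, alpha, gamma, P, R) is assumed here, not quoted from the thesis itself.

\documentclass{article}
\usepackage{amsmath}
\begin{document}

% A weighted combination of basis functions h_1, ..., h_k approximates
% the value function (the compact "solution" the abstract refers to):
\[
  V_{\mathbf{w}}(s) = \sum_{i=1}^{k} w_i \, h_i(s).
\]

% The Bellman error measures how far V_w is from satisfying Bellman's
% optimality equation; the thesis computes it by branch and bound:
\[
  \mathrm{BE}(V_{\mathbf{w}}) = \max_{s} \Bigl| V_{\mathbf{w}}(s)
    - \max_{a} \Bigl[ R(s,a)
    + \gamma \sum_{s'} P(s' \mid s, a) \, V_{\mathbf{w}}(s') \Bigr] \Bigr|.
\]

% The direct (approximate) linear program optimizes the weights w under
% one constraint per state-action pair; in a factored MDP there are
% exponentially many such pairs, hence incremental constraint generation
% or a compact transformation of the constraint set:
\begin{align*}
  \min_{\mathbf{w}} \;& \sum_{s} \alpha(s) \, V_{\mathbf{w}}(s) \\
  \text{s.t.} \;& V_{\mathbf{w}}(s) \ge R(s,a)
    + \gamma \sum_{s'} P(s' \mid s, a) \, V_{\mathbf{w}}(s')
    \quad \forall (s,a).
\end{align*}

\end{document}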

