Prediction and Planning in Dynamical Systems with Underlying Markov Decision Processes

Banijamali, Seyedershad

dc.contributor.author	Banijamali, Seyedershad
dc.date.accessioned	2021-08-24 16:11:57 (GMT)
dc.date.available	2021-08-24 16:11:57 (GMT)
dc.date.issued	2021-08-24
dc.date.submitted	2021-08-13
dc.identifier.uri	http://hdl.handle.net/10012/17233
dc.description.abstract	Predicting the future state of a scene with moving objects is a task that humans handle with ease, thanks to our understanding of the dynamics of the objects in the scene and the way they interact. However, teaching machines such understanding has always been a challenging task in machine learning. In recent years, with the abundance of data and enormous growth in computational power, there has been outstanding progress in closing the gap between human and machine perception and prediction. Deep learning, specifically, has been the main framework for addressing this problem. Prediction models are not only important in their own right; many downstream tasks in machine learning and robotics also rely on the quality of their output. Model-based control and planning require accurate modelling of the underlying dynamics of the system. A common assumption about the underlying dynamics, which is also the main theme of this thesis, is that they can be expressed using Markov Decision Processes (MDPs). However, the major portion of the thesis is dedicated to problems in which we do not have access to the actual underlying MDP and only observe high-dimensional observations from the dynamical system. The objective is then to model the underlying dynamics from the data and build a model that can potentially be used for planning and control. We consider both single-agent and multi-agent systems and employ deep generative models for modelling the dynamics. For the single-agent problem we propose a model that maps the high-dimensional observations to a low-dimensional space in which the dynamics of the system are modelled by a locally-linear function. We find this mapping by a proper modelling of the variables using graphical models and show that the mapping is robust against dynamics noise and suitable for control.
For the multi-agent problem we provide a formulation that describes the prediction problem in terms of the reaction of the environment to the action of one agent (the ego-agent) and show that such a formulation can improve the prediction accuracy as well as broaden the range of environment conditions. From a different perspective, we also consider the problem in which we have access to the MDP and would like to obtain the optimal policy. More specifically, given a set of base policies on the MDP, we want to find the best policy in their convex hull. We show that this problem is NP-hard in general and provide an approximation algorithm with linear complexity, which outputs a policy that performs close to the optimal policy. This policy can be found under the condition that the base policies overlap in the occupancy measure space.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.subject	prediction	en
dc.subject	planning	en
dc.subject	deep learning	en
dc.subject	Markov decision process	en
dc.subject	reinforcement learning	en
dc.title	Prediction and Planning in Dynamical Systems with Underlying Markov Decision Processes	en
dc.type	Doctoral Thesis	en
dc.pending	false
uws-etd.degree.department	David R. Cheriton School of Computer Science	en
uws-etd.degree.discipline	Computer Science	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Doctor of Philosophy	en
uws-etd.embargo.terms	0	en
uws.contributor.advisor	Ghodsi, Ali
uws.contributor.affiliation1	Faculty of Mathematics	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en

Files in this item

Name: Banijamali_Seyedershad.pdf
Size: 7.368Mb
Format: PDF
