Techniques to learn constraints from demonstrations
Date
2025-05-27
Authors
Advisor
Poupart, Pascal
Publisher
University of Waterloo
Abstract
Given demonstrations from an optimal expert, inverse reinforcement learning aims to learn an underlying reward function. However, assuming that the reward function fully explains the expert's behaviour is limiting, since in many real-world settings the expert may also be acting to satisfy additional behavioural constraints. Recovering these additional constraints falls within the paradigm of constraint learning from demonstrations. Specifically, in this work, we focus on the setting of inverse constraint learning (ICL), where we wish to learn a single but arbitrarily complex constraint from demonstrations, assuming the reward is known in advance.
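As a rough sketch of the setting, assuming a standard constrained Markov decision process formulation (the notation below is illustrative, not necessarily the thesis'), the expert is modelled as optimising the known reward $r$ subject to a constraint defined by an unknown cost $c$ and budget $\beta$; ICL then seeks a cost function under which the constrained-optimal behaviour matches the demonstrations:

```latex
% Illustrative notation (assumed): \pi^{E} is the expert policy, r the known
% reward, c the unknown cost, and \beta a cost budget.
\pi^{E} \in \arg\max_{\pi}\;
  \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t} r(s_t, a_t)\right]
\quad \text{s.t.} \quad
  \mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t} c(s_t, a_t)\right] \le \beta .
```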
For this setting, we first provide a framework to learn an expected constraint from constrained expert demonstrations. We then show how to translate an expected constraint into a probabilistic constraint and extend the proposed framework to learn a probabilistic constraint from constrained expert demonstrations. Here, an expected constraint refers to a constraint that bounds the cumulative cost, averaged over a batch of trajectories, to lie within a budget; a probabilistic constraint upper bounds the probability that the cumulative cost exceeds a given threshold. Finally, we provide convergence guarantees for the proposed frameworks.
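To make the two constraint types concrete, the following is a minimal sketch using assumed symbols (a budget $\beta$, a cost threshold $\alpha$, and a probability bound $\delta$, none of which are taken from the thesis): an expected constraint bounds the mean cumulative cost, while a probabilistic constraint bounds the tail probability of the cumulative cost.

```latex
% Expected constraint: the average cumulative cost is within the budget \beta,
\mathbb{E}_{\tau \sim \pi}\!\left[\sum_{t} c(s_t, a_t)\right] \le \beta .
% Probabilistic constraint: the probability that the cumulative cost exceeds
% the threshold \alpha is at most \delta,
\Pr_{\tau \sim \pi}\!\left(\sum_{t} c(s_t, a_t) \ge \alpha\right) \le \delta .
```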
Following these approaches, we consider the complementary challenge of learning a constraint in a high-dimensional state-action space. In such a setting, the constraint function may in fact depend on only a subset of the input features. We propose using a simple test from the hypothesis testing literature to select this subset of features and thereby construct a reduced input space for the constraint function. We also discuss the implications of using this approach in conjunction with an ICL algorithm.
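The thesis does not name the specific test in the abstract, so the sketch below is only illustrative: it uses a two-sample Kolmogorov-Smirnov test (one possible "simple test from the hypothesis testing literature") to keep the state-action features whose marginal distributions differ between expert trajectories and trajectories from a nominal, reward-only policy. The function name, the comparison data, and the significance threshold are all assumptions for the example.

```python
# Minimal sketch of hypothesis-test-based feature selection for constraint learning.
# The KS test, the expert-vs-nominal comparison, and alpha=0.05 are illustrative
# assumptions, not the method described in the thesis.
import numpy as np
from scipy.stats import ks_2samp


def select_constraint_features(expert_data: np.ndarray,
                               nominal_data: np.ndarray,
                               alpha: float = 0.05) -> list[int]:
    """Return indices of features whose marginal distributions differ
    significantly between expert and nominal (reward-only) trajectories.

    Both inputs have shape (num_samples, num_features), each row a flattened
    state-action pair visited by the respective policy.
    """
    selected = []
    for j in range(expert_data.shape[1]):
        # Two-sample KS test on the j-th feature's marginal distribution.
        _, p_value = ks_2samp(expert_data[:, j], nominal_data[:, j])
        if p_value < alpha:
            # A significant difference suggests the constraint shapes this
            # feature, so keep it in the reduced input space.
            selected.append(j)
    return selected


# Example usage with synthetic data: only feature 0 differs between policies.
rng = np.random.default_rng(0)
expert = np.column_stack([rng.normal(2.0, 1.0, 500), rng.normal(0.0, 1.0, 500)])
nominal = np.column_stack([rng.normal(0.0, 1.0, 500), rng.normal(0.0, 1.0, 500)])
print(select_constraint_features(expert, nominal))  # expected output: [0]
```

The reduced input space returned by such a test would then be fed to the downstream ICL algorithm in place of the full state-action vector.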
To validate the proposed approaches, we conduct experiments with synthetic environments, robotics environments, and environments based on real-world driving datasets. For feature selection, we test our approach on environments with varying state-action space sizes.
Keywords
reinforcement learning, inverse reinforcement learning, machine learning, constraint learning