Techniques to learn constraints from demonstrations

Date

2025-05-27

Advisor

Poupart, Pascal

Publisher

University of Waterloo

Abstract

Given demonstrations from an optimal expert, inverse reinforcement learning aims to learn an underlying reward function. However, it is limiting to assume that the reward function fully explains the expert behaviour, since in many real-world settings the expert may be acting to satisfy additional behavioural constraints. Recovering these additional constraints falls within the paradigm of constraint learning from demonstrations. Specifically, in this work, we focus on the setting of inverse constraint learning (ICL), where we wish to learn a single but arbitrarily complex constraint from demonstrations, assuming the reward is known in advance. For this setting, we first provide a framework to learn an expected constraint from constrained expert demonstrations. We then show how to translate an expected constraint into a probabilistic constraint, and additionally extend the proposed framework to learn a probabilistic constraint from constrained expert demonstrations. Here, an expected constraint requires the cumulative cost, averaged over a batch of trajectories, to stay within a budget, while a probabilistic constraint upper-bounds the probability that the cumulative cost exceeds a certain threshold. Finally, we provide convergence guarantees for the proposed frameworks. Beyond these approaches, we consider the complementary challenge of learning a constraint in a high-dimensional state-action space. In such a setting, the constraint function may truly depend only on a subset of the input features. We propose using a simple test from the hypothesis testing literature to select this subset of features in order to construct a reduced input space for the constraint function, and we discuss the implications of using this approach in conjunction with an ICL algorithm. To validate our proposed approaches, we conduct experiments on synthetic environments, robotics environments, and environments based on real-world driving datasets. For feature selection, we evaluate our approach on environments with varying state-action space sizes.
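
For concreteness, the two constraint types described in the abstract can be written in one common notation (a minimal sketch; the cost function c, budget beta, and risk level delta are illustrative symbols rather than the thesis's exact formulation):

\[
\mathbb{E}_{\tau \sim \pi}\Big[\sum_{t} c(s_t, a_t)\Big] \le \beta \qquad \text{(expected constraint)}
\]
\[
\mathbb{P}_{\tau \sim \pi}\Big(\sum_{t} c(s_t, a_t) > \beta\Big) \le \delta \qquad \text{(probabilistic constraint)}
\]

The feature-selection step can likewise be illustrated with a small sketch. The abstract only states that "a simple test from the hypothesis testing literature" is used, so the particular test below (a per-feature two-sample Kolmogorov-Smirnov test comparing expert data against rollouts of a nominal, unconstrained policy) and the significance threshold are assumptions made for illustration, not the thesis's actual procedure:

```python
# Hypothetical sketch: select state-action features the constraint may depend on
# by running a per-feature two-sample hypothesis test (KS test chosen only for
# illustration) between expert data and data from an unconstrained nominal policy.
import numpy as np
from scipy.stats import ks_2samp

def select_constraint_features(expert_features, nominal_features, alpha=0.05):
    """expert_features, nominal_features: arrays of shape (num_samples, num_features).
    Returns indices of features whose marginal distributions differ significantly,
    as candidates for the reduced input space of the constraint function."""
    selected = []
    for j in range(expert_features.shape[1]):
        result = ks_2samp(expert_features[:, j], nominal_features[:, j])
        if result.pvalue < alpha:  # distributions differ: feature plausibly constrained
            selected.append(j)
    return selected

# Purely illustrative synthetic usage: feature 2 is clipped for the "expert" only,
# so the test typically flags index 2.
rng = np.random.default_rng(0)
expert = rng.normal(0.0, 1.0, size=(500, 4))
expert[:, 2] = np.clip(expert[:, 2], -0.5, 0.5)
nominal = rng.normal(0.0, 1.0, size=(500, 4))
print(select_constraint_features(expert, nominal))
```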

Keywords

reinforcement learning, inverse reinforcement learning, machine learning, constraint learning
