Self-supervised Video Representation Learning by Exploiting Video Speed Changes
Date
2022-04-29
Authors
Chen, Lizhe
Publisher
University of Waterloo
Abstract
Recent self-supervised video representation learning methods have improved by exploiting
the temporal properties of video, such as playback speed and temporal order. These works
inspire us to exploit a new artificial supervision signal for self-supervised representation
learning: changes in video playback speed. Specifically, we formulate two novel
speediness-related pretext tasks, speediness change classification and speediness change
localization, that jointly supervise a shared backbone for video representation learning.
Solving both tasks together encourages the backbone network to learn local and long-range
motion and context representations. The approach outperforms prior work on multiple
downstream tasks, including action recognition, video retrieval, and action localization.
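As a rough illustration of the supervision signal described in the abstract, the sketch below builds frame indices for a clip whose sampling stride switches partway through, and derives one target for each pretext task. It is a minimal sketch under stated assumptions, not the thesis implementation: the function name `sample_speed_change_clip`, the three-way label scheme (no change, speed-up, slow-down), and the use of -1 as the localization target for unchanged clips are all hypothetical choices made for this example.

```python
import numpy as np


def sample_speed_change_clip(clip_len, stride_before, stride_after, change_pos, start=0):
    """Return (frame_indices, change_label, change_target) for one training clip.

    Hypothetical helper: the label scheme and target encoding are assumptions
    made for illustration, not taken from the thesis.
    """
    if not 0 < change_pos < clip_len:
        raise ValueError("change_pos must fall strictly inside the clip")

    # Gap between consecutive sampled frames: stride_before up to the change
    # position, stride_after from the change position onward.
    step_ids = np.arange(1, clip_len)
    steps = np.where(step_ids < change_pos, stride_before, stride_after)

    # Accumulate the gaps to get source-frame indices, starting at `start`.
    frame_indices = start + np.concatenate(([0], np.cumsum(steps)))

    # Classification target (assumed scheme): 0 = no change, 1 = speed-up, 2 = slow-down.
    if stride_after == stride_before:
        change_label = 0
    elif stride_after > stride_before:
        change_label = 1
    else:
        change_label = 2

    # Localization target (assumed encoding): clip position where the stride
    # switches, or -1 when the clip has no speed change.
    change_target = change_pos if change_label != 0 else -1
    return frame_indices, change_label, change_target


indices, label, target = sample_speed_change_clip(
    clip_len=8, stride_before=1, stride_after=2, change_pos=4
)
print(indices.tolist(), label, target)
```

In a setup like this, the indices would select frames from a decoded video, and a shared backbone would feed two heads, one trained on `change_label` and one on `change_target`.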
Keywords
video representation learning, self-supervised learning, contrastive learning