Weakly-supervised Semantic Segmentation with Regularized Loss Hyperparameter Search

Ji, Zongliang

Files

Ji_Zongliang.pdf (21.94 MB)

Date

2021-09-20

Authors

Ji, Zongliang

Advisor

Veksler, Olga

Publisher

University of Waterloo

Abstract

Weakly supervised segmentation significantly reduces user annotation effort. Recently, regularized loss was proposed for single object class segmentation under image-level weak supervision. Regularized loss consists of several components. Each component, if used in isolation, would lead to some trivial solution. However, a weighted combination of the loss components introduces a balance between the individual biases. The weight of each component in regularized loss is controlled by a hyperparameter. We propose an approach that searches for regularized loss hyperparameters. The main idea is to set the most important regularized loss component to a high weight while ensuring the other loss components are set to weights just sufficiently high to prevent the trivial solution favoured by the most important component. Our approach results in significantly improved performance over prior work with fixed hyperparameters and improves the state of the art in salient and semantic image-level supervised segmentation. In addition to image-level weak supervision, we propose a new approach for semantic segmentation with weak supervision using bounding box annotations. Our new approach to weak supervision from bounding boxes also makes use of regularized loss with hyperparameter search. Previous work on weak supervision from bounding boxes constructs pseudo-ground truth by segmenting each box into object and background independently of all the other boxes in the dataset. We argue that the collection of boxes for the same class naturally provides a dataset from which we can learn the appearance of that object class. Learning a good appearance model, in turn, leads to a better segmentation of each individual box. Thus, for each class, we propose to train a segmentation CNN from the dataset consisting of the bounding boxes for that class using our proposed single object approach.
After we train these single-class CNNs, we apply them back to the training bounding boxes to obtain object/background segmentations and merge them to construct pseudo-ground truth. The obtained pseudo-ground truth is used for training a standard segmentation CNN. We improve the state of the art on the Pascal VOC 2012 benchmark in the bounding box weak supervision setting.
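The hyperparameter search idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the candidate grid, the collapse test based on foreground fraction, and the `toy_training` stand-in are all assumptions made for illustration. The sketch fixes a high weight for the most important component and picks the smallest weight for a second component that still avoids a trivial solution.

```python
from typing import Callable, Sequence


def regularized_loss(components: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of loss components, one weight (hyperparameter) per component."""
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per loss component")
    return sum(w * c for w, c in zip(weights, components))


def is_trivial(foreground_fraction: float, tol: float = 0.02) -> bool:
    """Assumed collapse test: nearly all pixels end up background or foreground."""
    return foreground_fraction < tol or foreground_fraction > 1.0 - tol


def smallest_safe_weight(
    train_foreground_fraction: Callable[[float, float], float],
    primary_weight: float,
    candidates: Sequence[float],
) -> float:
    """Return the smallest candidate secondary weight that avoids a trivial result.

    train_foreground_fraction(primary_weight, secondary_weight) is a hypothetical
    callback that trains a model with those weights and reports the mean fraction
    of pixels it labels as foreground.
    """
    for w in sorted(candidates):
        if not is_trivial(train_foreground_fraction(primary_weight, w)):
            return w
    raise ValueError("every candidate weight led to a trivial solution")


def toy_training(primary: float, secondary: float) -> float:
    """Hypothetical stand-in for a training run, used only to exercise the search."""
    # Pretend the model collapses to all background when the secondary
    # weight is less than a tenth of the primary weight.
    return 0.0 if secondary * 10 < primary else 0.3


chosen = smallest_safe_weight(toy_training, primary_weight=10.0,
                              candidates=[0.5, 1.0, 2.0, 4.0])
```

In practice each call to the training callback would be a full (or shortened) training run, so the candidate grid would likely be kept small; how the thesis organizes its search is not specified in the abstract.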