Accelerating the Training of Convolutional Neural Networks for Image Segmentation with Deep Active Learning

Chen, Wei Tao

Files

Chen_Weitao.pdf (7.96 MB)

Date

2020-01-23

Authors

Chen, Wei Tao

Advisor

Czarnecki, Krzysztof

Publisher

University of Waterloo

Abstract

Image semantic segmentation is an important problem in computer vision. However, training a deep neural network for semantic segmentation with supervised learning requires expensive manual labeling. Active learning (AL) addresses this problem by automatically selecting a subset of the dataset to label and iteratively improving the model. This minimizes labeling costs while maximizing performance. Yet deep active learning for image segmentation has not been systematically studied in the literature. This thesis offers three contributions. First, we compare six state-of-the-art querying methods, including uncertainty, Bayesian, and out-of-distribution methods, in the context of active learning for image segmentation. The comparison uses the standard dataset Cityscapes, as well as randomly generated data, and the state-of-the-art image segmentation architecture DeepLab. Our results demonstrate subtle but robust differences between the querying methods, which we analyze and explain. Second, we propose a novel way to query images by counting the number of pixels with acquisition values above a certain threshold. Our counting method outperforms the standard averaging method. Lastly, we demonstrate that the previous two findings remain consistent for both whole images and image crops.

Furthermore, we provide an in-depth discussion of deep active learning and results from supplementary experiments. First, we study active learning in the context of image classification with the MNIST dataset. We observe an interesting phenomenon in which active learning querying methods perform worse than random sampling in the early cycles but overtake it at a break-even point. This break-even point can be controlled by varying model capacity, sample diversity, and temperature scaling. The differences in performance among the six querying methods are larger than in the case of image segmentation. Second, we attempt to explore the theoretically optimal query by querying samples with the lowest accuracy and by querying with a trained expert model. Although these approaches turned out to be suboptimal, we hope their results shed light on the subject. Lastly, we present the experimental results from using SegNet and FCN. With these architectures, our querying methods did not perform any better than random sampling. Nevertheless, those negative results demonstrate some of the difficulties of active learning for image segmentation.
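
The contrast between the counting query and the standard averaging query mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes each unlabeled image already has a per-pixel acquisition map (for example, an uncertainty score per pixel), and the function names, the threshold value, and the toy maps are all hypothetical.

```python
def mean_score(acq_map):
    """Averaging aggregation: mean acquisition value over all pixels."""
    values = [v for row in acq_map for v in row]
    return sum(values) / len(values)


def count_score(acq_map, threshold):
    """Counting aggregation: number of pixels whose value exceeds the threshold."""
    return sum(1 for row in acq_map for v in row if v > threshold)


def select_queries(acq_maps, budget, score_fn):
    """Return the indices of the `budget` highest-scoring images."""
    ranked = sorted(
        range(len(acq_maps)),
        key=lambda i: score_fn(acq_maps[i]),
        reverse=True,
    )
    return ranked[:budget]


# Toy 2x2 acquisition maps (hypothetical values, not thesis data).
# Image 0 is moderately uncertain everywhere; image 1 has two highly
# uncertain pixels and two confident ones.
maps = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[0.9, 0.9], [0.0, 0.0]],
]

by_mean = select_queries(maps, budget=1, score_fn=mean_score)
by_count = select_queries(
    maps, budget=1, score_fn=lambda m: count_score(m, threshold=0.6)
)
print(by_mean, by_count)
```

In this toy case the two aggregations rank the images differently: averaging favors the uniformly uncertain image, while counting favors the image with more pixels above the threshold. How the threshold is chosen is not stated in the abstract, so the value 0.6 here is purely illustrative.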