dc.description.abstract | Unsupervised losses are common for tasks with limited human annotations. In clustering, they
are used to group data without any labels. In semi-supervised or weakly-supervised learning, they
are applied to the unannotated part of the training data. In self-supervised settings, they are used
for representation learning. They appear in diverse forms enforcing different prior knowledge.
However, formulating and optimizing such losses poses challenges. Firstly, translating prior
knowledge into mathematical formulations can be non-trivial. Secondly, the properties of standard
losses may not be obvious across different tasks. Thirdly, standard optimization algorithms may
not work effectively or efficiently, thus requiring the development of customized algorithms.
This thesis addresses several related classification and segmentation problems in computer
vision, using unsupervised image- or pixel-level losses under a shortage of labels. First, we
focus on entropy-based decisiveness, a standard unsupervised loss for softmax models.
While discussing it in the context of clustering, we prove that it leads to margin maximization,
typically associated with supervised learning. In the context of weakly-supervised semantic
segmentation, we combine decisiveness with the standard pairwise regularizer, the Potts model.
We study the conceptual and empirical properties of different relaxations of the latter. For both
clustering and segmentation problems, we provide new self-labeling optimization algorithms
for the corresponding unsupervised losses. Unlike related prior work, we use soft hidden labels
that can represent the estimated class uncertainty. Training network models with such soft
pseudo-labels motivates a new form of cross-entropy that maximizes the probability of “collision”
between the predicted and estimated classes. The proposed losses and algorithms achieve
state-of-the-art results on standard benchmarks. The thesis also introduces new geometrically motivated
unsupervised losses for estimating thin structures, e.g., complex vasculature trees at near-capillary
resolution in 3D medical data. | en |