Regularizing Deep Models for Visual Recognition

Barshan Tashnizi, Elnaz

dc.contributor.author	Barshan Tashnizi, Elnaz
dc.date.accessioned	2016-10-26 17:02:09 (GMT)
dc.date.available	2016-10-26 17:02:09 (GMT)
dc.date.issued	2016-10-26
dc.date.submitted	2016-10-25
dc.identifier.uri	http://hdl.handle.net/10012/11031
dc.description.abstract	Image understanding is a shared goal in all computer vision problems. This objective includes decomposing the image into a set of primitive components through which one can perform region segmentation, region labeling, object recognition and finally modeling the interactions between recognized objects. However, due to the large intra-class variations in appearance, shape and structure, extracting image primitives is highly challenging. While images come in the form of intensity matrices, a high-level abstraction of images is required to cope with these large variations. Therefore, the main challenge is to bridge the gap between the low-level pixel representation and the high-level abstract image descriptors. In recent years, we have witnessed the striking popularity of image descriptors learned by deep networks for visual recognition. The multi-layer architecture of these networks is particularly useful in capturing the hierarchical structure of the image data: simple features are detected at lower layers and fed into higher layers for extracting more complex and abstract representations. Despite the remarkable representational power of deep networks, training these models is computationally expensive. In addition, considering the lack of enough labeled training data in many applications, over-fitting is a serious threat for deep models with a large number of free parameters. Also, there are innate issues with the gradient-based optimization procedure used for parameter learning in these models. This research is aimed at addressing the above issues by leveraging domain knowledge. In particular, we focus on tailoring deep networks for visual recognition by exploiting the characteristics of the image data. These modifications tend to regularize deep models and therefore improve their generalization performance. We propose novel ways of incorporating image-specific domain knowledge into deep networks.
As part of this thesis, we show how one can significantly decrease the number of free parameters in fully-connected architectures by exploiting the global characteristics of the image data. For convolutional networks, a new multi-neighborhood architecture is introduced which can capture scale-dependent features. In this architecture, the fine-scale image structures (i.e., appearance features) are captured using a small-sized neighborhood while coarse-scale characteristics (i.e., shape features) are detected by considering a wider area around each pixel. In addition, we propose an effective regularization method for deep networks in which a frequency parameter is devised to specifically treat the issues of gradient-based optimization for training these models. Finally, we introduce a stage-wise training framework for deep networks in which the learning process is broken down into a number of related sub-tasks completed stage-by-stage, where the learned parameters at each stage act as a prior for the next stage. This goal is achieved through "gradual" injection of the information presented in the training data so that in the early stages of training, the "coarse-scale" properties of the data are captured while the "finer-scale" characteristics are learned in later stages. The performance of the proposed methods is assessed on a number of image classification data sets. Our comprehensive empirical analysis demonstrates that these "regularized" networks offer better discrimination and generalization performance compared to their domain-oblivious counterparts.	en
dc.language.iso	en	en
dc.publisher	University of Waterloo	en
dc.subject	Deep Learning	en
dc.subject	Visual Recognition	en
dc.subject	Feature Learning	en
dc.subject	Regularization	en
dc.subject	Stage-wise Training	en
dc.title	Regularizing Deep Models for Visual Recognition	en
dc.type	Doctoral Thesis	en
dc.pending	false
uws-etd.degree.department	Systems Design Engineering	en
uws-etd.degree.discipline	Systems Design Engineering	en
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.degree	Doctor of Philosophy	en
uws.contributor.advisor	Fieguth, Paul
uws.contributor.affiliation1	Faculty of Engineering	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.typeOfResource	Text	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en

Files in this item

Name: Barshan_Tashnizi_Elnaz.pdf
Size: 3.958Mb
Format: PDF
