dc.contributor.author: Jiang, Yanbing
… 17:46:50 (GMT)
… 17:46:50 (GMT)
dc.description.abstract: Source coding and deep learning are two major branches of information processing. Source coding encodes information whose patterns can be summarised into a compact representation, without regard to semantics. Deep learning, by contrast, uses multiple layers of representation with increasing levels of abstraction to learn patterns that cannot be summarised easily. Interestingly, source coding itself has much to contribute to deep learning. The key to deep learning's success is its cascade of non-linear layers, which help the network abstract multi-level features. Source coding, such as image compression, also contains fundamental non-linear operations, including quantisation and rounding. How the non-linearity introduced by compression could further help deep learning is the inspiration for this research, even though common sense suggests that compression usually degrades recognition ability. This work proposes integrating source coding and deep learning to improve accuracy in image classification, one of the most popular tasks in deep learning. For human vision, the ability to classify objects in images deteriorates when the images are compressed, for example by JPEG. From the machine's perspective, however, this is not usually the case: based on our observations, compressed images may be recognised better by a machine. To improve the accuracy of image recognition, this study focuses on improving the pre-processing applied before an image is input to the neural network. We also propose a new Convolutional Neural Network (CNN) topology that takes the original input along with several compressed versions of it. JPEG compression is friendly to humans when images are compressed at higher quality. However, which compression level is machine friendly is uncertain. The topology brings together the information from compressed inputs ranging from low to high quality and lets the machine learn from all of it by itself. We trained the topology with the proposed Block-by-block training method and increased the accuracy of state-of-the-art CNNs for image classification: a 0.374% increase in Top-1 accuracy and a 0.346% increase in Top-5 accuracy for the Inception V3 model, and a 0.39% increase in Top-1 accuracy and a 0.228% increase in Top-5 accuracy for the ResNet-50 V2 model. Moreover, based on visual observations, compression can highlight the contrast of objects and discard interfering information, which helps our topology improve classification accuracy. Furthermore, we believe the accuracy gain could be even greater if our topology were applied to the state-of-the-art EfficientNet (published May 2019). (en)
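The abstract's point that quantisation and rounding are non-linear, and that the network is fed the original input together with several compressed versions, can be illustrated with a toy sketch. This is not the thesis's JPEG pipeline or CNN topology: the uniform quantiser stands in for JPEG compression, and the names `quantise` and `multi_quality_stack`, the pixel values, and the step sizes are all hypothetical choices made for illustration.

```python
def quantise(x, step):
    """Uniform scalar quantiser: snap x to the nearest multiple of step."""
    return round(x / step) * step


def multi_quality_stack(pixels, steps):
    """Return the original signal followed by one quantised copy per step.

    A larger step discards more detail, loosely playing the role of a
    lower JPEG quality setting (an assumption made for this sketch).
    """
    return [list(pixels)] + [[quantise(p, s) for p in pixels] for s in steps]


# Non-linearity: quantising a sum is not the sum of the quantised parts.
a, b, step = 3.0, 4.0, 10.0
quantised_sum = quantise(a + b, step)                    # 7.0 snaps up to 10.0
sum_of_quantised = quantise(a, step) + quantise(b, step)  # 0.0 + 0.0
print(quantised_sum, sum_of_quantised)

# Original input plus two coarser versions, as a stand-in for the
# "original along with its compressed versions" input described above.
stack = multi_quality_stack([13, 47, 130, 201], steps=[8, 32])
for version in stack:
    print(version)
```

In a real setting the quantised copies would be replaced by actual JPEG encodings at different quality levels, and each version would feed its own branch of the network; the sketch only shows the shape of the idea.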
dc.publisher: University of Waterloo (en)
dc.subject: Image Compression (en)
dc.subject: Machine Learning (en)
dc.subject: Deep Learning (en)
dc.subject: GPU Utilization (en)
dc.title: New Convolutional Neural Network Topology with Compressed Information to Enhance Accuracy for Image Classification Task (en)
dc.type: Master Thesis (en)
dc.pending: false
… and Computer Engineering (en)
… and Computer Engineering (en)
… of Waterloo (en)
uws-etd.degree: Master of Applied Science (en)
uws.contributor.advisor: En-Hui, Yang
uws.contributor.affiliation1: Faculty of Engineering (en)

All items in UWSpace are protected by copyright, with all rights reserved.
