
dc.contributor.author: Bidart, Rene
dc.date.accessioned: 2023-08-14 14:21:57 (GMT)
dc.date.available: 2023-08-14 14:21:57 (GMT)
dc.date.issued: 2023-08-14
dc.date.submitted: 2023-06-29
dc.identifier.uri: http://hdl.handle.net/10012/19682
dc.description.abstract: Deep neural networks have transformed a wide variety of domains including natural language processing, image and video processing, and robotics. However, the computational cost of training and inference with these models is high, and the rise of unsupervised pretraining has allowed ever-larger networks to be used to further improve performance. Running these large neural networks in compute-constrained environments such as edge devices is infeasible, and the alternative of performing inference using cloud compute can be exceedingly expensive, with the largest language models needing to be distributed across multiple GPUs. Because of these constraints, reducing model size and improving inference speed have been a major focus of neural network research. A wide variety of techniques have been proposed to improve the efficiency of existing neural networks, including pruning, quantization, and knowledge distillation. In addition, there has been extensive effort to create more efficient networks through hand design or through an automated process called neural architecture search. However, there remain key domains with significant room for improvement, as we demonstrate in this thesis. We aim to improve the efficiency of deep neural networks in terms of inference latency, model size, and latent representation size. We take an alternative approach to previous research and instead investigate redundant representations in neural networks. Across three domains (text classification, image classification, and generative models) we hypothesize that current neural networks contain representational redundancy, and we show that removing this redundancy improves their efficiency. For image classification, we hypothesize that convolution kernels contain redundancy in the form of unnecessary channel-wise flexibility, and we test this by introducing additional weight sharing into the network, preserving or even increasing classification performance while requiring fewer parameters (a minimal sketch of this idea appears after the record below). We show the benefits of this approach on convolution layers on the CIFAR and ImageNet datasets, on both standard models and models explicitly designed to be parameter efficient. For generative models, we show it is possible to reduce the size of the model's latent representation while preserving the quality of the generated images through the unsupervised disentanglement of shape and orientation. To do this, we introduce the affine variational autoencoder, a novel training procedure, and demonstrate its effectiveness on generating two-dimensional images as well as three-dimensional voxel representations of objects. Finally, looking at the transformer model, we note that there is a mismatch between the tasks used for pretraining and the downstream tasks models are fine-tuned on, such as text classification. We hypothesize that this results in redundancy in the form of unnecessary spatial information, and we remove it through the introduction of learned sequence-length bottlenecks. We aim to create task-specific networks given a dataset and performance requirements through the use of a neural architecture search method and learned downsampling. We show that these task-specific networks achieve a superior tradeoff between inference latency and accuracy compared to standard models, without requiring additional pretraining. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.subject: Artificial intelligence [en]
dc.subject: computer vision [en]
dc.subject: nlp [en]
dc.title: Representational Redundancy Reduction Strategies for Efficient Neural Network Architectures for Visual and Language Tasks [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
uws-etd.degree.department: Systems Design Engineering [en]
uws-etd.degree.discipline: Systems Design Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Wong, Alexander
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
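
The abstract's claim that convolution kernels carry unnecessary channel-wise flexibility can be illustrated with a small weight-sharing layer. The sketch below is only a hypothetical PyTorch interpretation of that idea, not the thesis's actual method: the class name SharedKernelConv2d and the specific sharing scheme (one spatial kernel modulated by a learned scalar per output/input channel pair) are assumptions made for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedKernelConv2d(nn.Module):
    """Hypothetical sketch of channel-wise weight sharing: every channel
    pair reuses one spatial kernel, scaled by a single learned scalar.
    Assumed illustration only, not the layer proposed in the thesis."""

    def __init__(self, in_channels, out_channels, kernel_size, padding=0):
        super().__init__()
        # One shared spatial kernel, reused by every (out, in) channel pair.
        self.shared_kernel = nn.Parameter(torch.randn(kernel_size, kernel_size))
        # Per channel-pair scalar: the only channel-wise freedom that remains.
        self.channel_scale = nn.Parameter(torch.randn(out_channels, in_channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(out_channels))
        self.padding = padding

    def forward(self, x):
        # Broadcast (out, in, 1, 1) * (k, k) into a full (out, in, k, k) weight tensor.
        weight = self.channel_scale * self.shared_kernel
        return F.conv2d(x, weight, self.bias, padding=self.padding)

# Usage: same interface as a plain convolution.
layer = SharedKernelConv2d(3, 16, kernel_size=3, padding=1)
out = layer(torch.randn(1, 3, 32, 32))   # -> shape (1, 16, 32, 32)

Under this assumed scheme, a 3x3 layer mapping 3 channels to 16 would use 9 + 48 + 16 = 73 parameters, versus 448 (432 weights plus 16 biases) for a standard nn.Conv2d, which is the kind of parameter reduction from reduced channel-wise flexibility that the abstract describes.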

