Show simple item record

dc.contributor.authorCaterini, Anthony 16:57:11 (GMT) 16:57:11 (GMT)
dc.description.abstractOver the past decade, Deep Neural Networks (DNNs) have become very popular models for processing large amounts of data because of their successful application in a wide variety of fields. These models are layered, often containing parametrized linear and non-linear transformations at each layer in the network. At this point, however, we do not rigorously understand why DNNs are so effective. In this thesis, we explore one way to approach this problem: we develop a generic mathematical framework for representing neural networks, and demonstrate how this framework can be used to represent specific neural network architectures. In chapter 1, we start by exploring mathematical contributions to neural networks. We can rigorously explain some properties of DNNs, but these results fail to fully describe the mechanics of a generic neural network. We also note that most approaches to describing neural networks rely upon breaking down the parameters and inputs into scalars, as opposed to referencing their underlying vector spaces, which adds some awkwardness into their analysis. Our framework strictly operates over these spaces, affording a more natural description of DNNs once the mathematical objects that we use are well-defined and understood. We then develop the generic framework in chapter 3. We are able to describe an algorithm for calculating one step of gradient descent directly over the inner product space in which the parameters are defined. Also, we can represent the error backpropagation step in a concise and compact form. Besides a standard squared loss or cross-entropy loss, we also demonstrate that our framework, including gradient calculation, extends to a more complex loss function involving the first derivative of the network. After developing the generic framework, we apply it to three specific network examples in chapter 4. We start with the Multilayer Perceptron, the simplest type of DNN, and show how to generate a gradient descent step for it. We then represent the Convolutional Neural Network (CNN), which contains more complicated input spaces, parameter spaces, and transformations at each layer. The CNN, however, still fits into the generic framework. The last structure that we consider is the Deep Auto-Encoder, which has parameters that are not completely independent at each layer. We are able to extend the generic framework to handle this case as well. In chapter 5, we use some of the results from the previous chapters to develop a framework for Recurrent Neural Networks (RNNs), the sequence-parsing DNN architecture. The parameters are shared across all layers of the network, and thus we require some additional machinery to describe RNNs. We describe a generic RNN first, and then the specific case of the vanilla RNN. We again compute gradients directly over inner product spaces.en
dc.publisherUniversity of Waterlooen
dc.subjectNeural Networksen
dc.subjectConvolutional Neural Networksen
dc.subjectDeep Neural Networksen
dc.subjectMachine Learningen
dc.subjectRecurrent Neural Networksen
dc.titleA Novel Mathematical Framework for the Analysis of Neural Networksen
dc.typeMaster Thesisen
dc.pendingfalse Mathematicsen Mathematicsen of Waterlooen
uws-etd.degreeMaster of Mathematicsen
uws.contributor.advisorChang, Dong Eui
uws.contributor.affiliation1Faculty of Mathematicsen

Files in this item


This item appears in the following Collection(s)

Show simple item record


University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages