 

Trust Region Methods for Training Neural Networks

dc.contributor.author: Kinross, Colleen
dc.date.accessioned: 2017-11-09T19:32:22Z
dc.date.available: 2017-11-09T19:32:22Z
dc.date.issued: 2017-11-09
dc.date.submitted: 2017-11-07
dc.description.abstract: Artificial feed-forward neural networks (ff-ANNs) are powerful machine learning models for supervised classification problems, and have been applied to tasks ranging from natural language processing to computer vision. ff-ANNs are typically trained using gradient-based approaches, which require only first-order derivatives. In this thesis we explore the benefits and drawbacks of training an ff-ANN with a method that requires second-order derivatives of the objective function, and whether stochastic approximations can reduce the computation time of such a method. We performed a numerical investigation into the behaviour of trust region methods, a class of second-order numerical optimization methods, when used to train ff-ANNs on several datasets. Our study evaluates a classical trust region approach and the effect of adapting it with stochastic variations. We explore three approaches to reducing the computation required by the classical method: stochastic subsampling of training examples, stochastic subsampling of parameters, and combining a gradient-based approach with the classical trust region method. We found that stochastic subsampling can, in some cases, reduce the CPU time required to reach a reasonable solution compared with the classical trust region method, but this was not consistent across all datasets. We also found that combining the classical trust region method with mini-batch gradient descent either matched (within 0.1 s) or decreased the CPU time required to reach a reasonable solution on all datasets. This was achieved by computing the trust region step only when training progress under the gradient approach had stalled.
dc.identifier.uri: http://hdl.handle.net/10012/12621
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: Optimization
dc.subject: Neural Networks
dc.subject: Trust Region Method
dc.subject: Machine Learning
dc.title: Trust Region Methods for Training Neural Networks
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws.contributor.advisor: Li, Yuying
uws.contributor.advisor: Wan, Justin
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
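The hybrid strategy described in the abstract, taking a second-order trust region step only when first-order progress has stalled, can be sketched as follows. This is a minimal illustration on a toy quadratic loss, not the thesis's implementation; the function names, the stall tolerance, and the exact Newton-based subproblem solve are all assumptions made for the sketch.

```python
import numpy as np

# Toy ill-conditioned quadratic standing in for a network loss surface.
A = np.diag([1.0, 100.0])

def loss(w):
    return 0.5 * w @ A @ w

def grad(w):
    return A @ w

def hess(w):
    return A

def trust_region_step(w, radius):
    """Solve min_s g.s + 0.5 s.H.s subject to ||s|| <= radius.
    H is positive definite here, so the Newton step, clipped to the
    trust-region boundary, solves the subproblem exactly."""
    s = np.linalg.solve(hess(w), -grad(w))
    norm = np.linalg.norm(s)
    if norm > radius:
        s *= radius / norm
    return s

def train(w0, lr=1e-4, radius=1.0, stall_tol=1e-4, steps=600):
    w, prev = np.asarray(w0, dtype=float), np.inf
    for _ in range(steps):
        f = loss(w)
        if prev - f < stall_tol:
            # First-order progress has stalled: take a trust-region step.
            w = w + trust_region_step(w, radius)
        else:
            # Otherwise take a cheap gradient step.
            w = w - lr * grad(w)
        prev = f
    return w
```

On this quadratic, plain gradient descent with the same iteration budget leaves a large residual along the flat direction, while the stall-triggered trust region step reaches the minimizer in a single subproblem solve, mirroring the "only compute the expensive step when the cheap one stalls" idea from the abstract.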

Files

Original bundle
Name: Kinross_Colleen.pdf
Size: 1.5 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 6.08 KB
Description: Item-specific license agreed upon to submission