Decay Makes Supervised Predictive Coding Generative
Loading...
Date
2020-08-19
Authors
Sun, Wei
Advisor
Orchard, Jeff
Journal Title
Journal ISSN
Volume Title
Publisher
University of Waterloo
Abstract
Predictive Coding is a hierarchical model of neural computation that approximates backpropagation using only local computations and local learning rules. An important aspect of Predictive Coding is the presence of feedback connections between layers. These feedback connections allow Predictive Coding networks to potentially be generative as well as discriminative. However, Predictive Coding networks trained on supervised classification tasks cannot generate accurate input samples close to the training inputs from the class vectors alone.
This problem arises from the fact that generating inputs from classes requires solving an underdetermined system, which contains an infinite number of solutions. Generating the correct inputs involves reaching a specific solution in that infinite solution space. But by imposing a minimum-norm constraint on the state nodes and the synaptic weights of a Predictive Coding network, the solution space collapses to a unique solution that is close to the training inputs. This minimum-norm constraint can be enforced by adding decay to the Predictive Coding equations.
Decay is implemented in the form of weight decay and activity decay. Analyses done on linear Predictive Coding networks show that applying weight decay during training helps the network learn weights that can generate the correct input samples from the class vectors, while applying activity decay during input generation helps to lower the variance in the network's generated samples. Additionally, weight decay regularizes the values of the network weights, avoiding extreme values, and improves the rate at which the network converges to equilibrium by regularizing the eigenvalues of the Jacobian at the equilibrium.
Experiments on the MNIST dataset of handwritten digits provide evidence that decay makes Predictive Coding networks generative even when the network contains deep layers and uses nonlinear tanh activations. A Predictive Coding network equipped with weight and activity decay successfully generates images resembling MNIST digits from the class vectors alone.
Description
Keywords
neural network, predictive coding, weight decay, regularization, supervised learning, generative