Studying CNN representations through activation dimensionality reduction and visualization
Abstract
The field of explainable artificial intelligence (XAI) aims to explain the decisions of DNNs. Complete DNN explanations accurately reflect the inner workings of the DNN while interpretable explanations are easy for humans to understand. Developing methods for explaining the representations learned by DNNs that are both complete and interpretable is a grand challenge in the field of XAI.
This thesis makes contributions to the field of XAI by proposing and evaluating novel methods for studying DNN representations. During forward propagation, each DNN layer non-linearly transforms the input space in some way that is useful for minimizing a loss function. To understand how DNNs represent their inputs, this work develops methods to examine each DNN layer’s activation space.
The first article contributes an unsupervised framework for identifying and interpreting “tuning dimensions” in the activation space of DNN layers. The method consists of fitting a dimensionality reduction model to a layer’s activations, then visualizing points along the axes defined by each of the reduced dimensions. The method correctly identifies the tuning dimensions of a synthetic Gabor filter bank, and those of the first two layers of InceptionV1 trained on ImageNet.
The second article builds upon the first article with a simple and greatly improved visualization method that enables studying every layer of AlexNet. Through a quantitative comparison, the article demonstrates that the principal component analysis (PCA) basis for activation space offers more complete and more interpretable explanations than the traditional neuron basis. This thesis provides deep learning researchers with tools to better understand the representations learned by DNNs.
Collections
Cite this version of the work
Nolan Simran Dey
(2021).
Studying CNN representations through activation dimensionality reduction and visualization. UWSpace.
http://hdl.handle.net/10012/17608
Other formats