
A Representational Response Analysis Framework For Convolutional Neural Networks


Date

2024-04-25

Authors

Hryniowski, Andrew

Publisher

University of Waterloo

Abstract

Over the past decade, convolutional neural networks (CNNs) have become the de facto machine learning model for image processing due to their inherent ability to capitalize on modern data availability and computational resources. Much of a CNN's capability comes from its modularity and flexibility in model design. As such, practitioners have been able to successfully tackle applications not previously possible with other contemporary methods. The downside to this flexibility is that it makes designing and improving a CNN an arduous task. Designing a CNN is not a straightforward process: model architecture, learning strategy, and data selection and processing must all be precisely tuned before a researcher can produce a model that performs better than chance. Finding the balance needed to achieve state-of-the-art performance can be its own challenge, requiring months or years of effort.

When building a new model, researchers rely on quantitative metrics to guide the development process. Typically, these metrics revolve around performance constraints (e.g., accuracy, recall, precision, robustness) and computational constraints (e.g., number of parameters, number of FLOPs), while the learned internal data-processing behaviour of a CNN is ignored. Some research investigating the internal behaviour of CNNs has been proposed and adopted by a niche group within the broader deep learning community. Because these methods operate on extremely high-dimensional latent embeddings (between one and three orders of magnitude larger than the input data), they are computationally expensive. In addition, many of the most common methods do not share a common root from which downstream metrics can be computed, making the use of multiple metrics prohibitive.

In this work we propose a novel analytic framework, Representational Response Analysis (RRA), that offers a broad range of complementary metrics a researcher can use to study the internal behaviour of a CNN, and whose findings can guide model performance improvements. The RRA framework is built around a common computational kNN-based model of the latent embeddings of a dataset at each layer of a CNN. Using the information contained within these kNN models, we propose three complementary metrics that extract targeted information and provide a researcher with the ability to investigate specific behaviours of a CNN across all of its layers.

We focus our attention on classification CNNs and perform two styles of experiments using the proposed RRA framework. The first set of experiments revolves around better understanding RRA hyper-parameter selection and its impact on the downstream metrics with regard to observed characteristics of a CNN; from these experiments we determine the effects of adjusting specific RRA hyper-parameters and propose general guidelines for selecting them. The second set of experiments investigates the impact of specific CNN design choices. More precisely, we use RRA to investigate the consequences for a CNN's latent representation of training with and without data augmentation, and to understand latent embedding symmetries across different pooled spatial resolutions. For each of these experiments RRA provides novel insights into the internal workings of a CNN.
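To make the framework's common computational root concrete, the sketch below shows one plausible way to build a per-layer kNN model of a dataset's latent embeddings, as the abstract describes. The thesis's actual implementation is not given in this record; the helper names (collect_layer_embeddings, fit_layer_knns), the choice of k, and the use of PyTorch forward hooks with scikit-learn's NearestNeighbors are all illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): capture each convolutional
# layer's flattened activations over a dataset, then fit one kNN model
# per layer. Downstream metrics would query these kNN models.
import torch
import torch.nn as nn
from sklearn.neighbors import NearestNeighbors

K = 10  # neighbourhood size; an RRA-style hyper-parameter, value illustrative

def collect_layer_embeddings(model: nn.Module, loader, device="cpu"):
    """Run the dataset through the model once, capturing the flattened
    activation of every Conv2d layer via forward hooks."""
    embeddings = {}
    hooks = []

    def make_hook(name):
        def hook(_module, _inputs, out):
            # Flatten each sample's feature map (C, H, W) to one vector.
            embeddings.setdefault(name, []).append(
                out.detach().flatten(start_dim=1).cpu())
        return hook

    for name, module in model.named_modules():
        if isinstance(module, nn.Conv2d):
            hooks.append(module.register_forward_hook(make_hook(name)))

    model.eval()
    with torch.no_grad():
        for x, _y in loader:
            model(x.to(device))

    for h in hooks:
        h.remove()
    return {name: torch.cat(chunks) for name, chunks in embeddings.items()}

def fit_layer_knns(embeddings, k=K):
    """Fit one kNN model per layer over that layer's latent embeddings."""
    return {name: NearestNeighbors(n_neighbors=k).fit(e.numpy())
            for name, e in embeddings.items()}
```

Fitting one kNN structure per layer reflects the abstract's point that a single shared representation can feed multiple complementary metrics, rather than recomputing expensive pairwise statistics for each metric separately.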
Using the insights from the pooled spatial resolution experiments, we propose a novel CNN attention-based building block specifically designed to take advantage of key latent properties of a ResNet. We call this building block the Scale Transformed Attention Condenser (STAC) module. We demonstrate that the STAC module not only improves a model's performance across a selection of model-dataset pairs, but does so with an improved performance-to-computational-cost tradeoff compared to other CNN spatial attention modules of similar FLOPs or parameter count.
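The STAC module's exact design is not specified in this record. As a hedged illustration of the general condenser-style spatial attention pattern it builds on (embed at a reduced spatial resolution, expand back, and gate the input features), the following PyTorch block is a sketch only and should not be read as the thesis's actual STAC design; the class name and layer choices are assumptions.

```python
# Illustrative condenser-style spatial attention block, NOT the STAC module:
# condense spatially, embed cheaply, expand, and use the result as a gate.
import torch
import torch.nn as nn

class SpatialAttentionCondenser(nn.Module):
    def __init__(self, channels: int, reduction: int = 2):
        super().__init__()
        self.condense = nn.MaxPool2d(kernel_size=reduction)  # shrink spatially
        self.embed = nn.Sequential(
            # Depthwise conv keeps the embedding cheap in FLOPs/parameters.
            nn.Conv2d(channels, channels, kernel_size=3, padding=1,
                      groups=channels),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.expand = nn.Upsample(scale_factor=reduction, mode="nearest")
        self.gate = nn.Sigmoid()

    def forward(self, x):
        attention = self.gate(self.expand(self.embed(self.condense(x))))
        return x * attention  # attention-weighted features

# Usage: shapes are preserved, so the block can drop into a ResNet stage.
x = torch.randn(1, 64, 32, 32)
block = SpatialAttentionCondenser(64)
print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```

Operating at a reduced resolution is what gives this family of modules its favourable performance-to-computational-cost tradeoff relative to attention applied at full spatial resolution.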

Keywords

convolutional neural network, manifold, latent space, representational response analysis, deep learning, spatial attention, augmentation
