Analysis of Neural Networks with Physics Applications
Advisor
Yevick, David
Publisher
University of Waterloo
Abstract
This thesis investigates core aspects of machine learning, spanning foundational studies
on generalization phenomena in neural networks, novel architectural strategies for enhancing
representation learning and classification performance, and high-accuracy predictive
and inverse modeling of emerging nanoelectronic devices. Together, these studies highlight
the significance of data and model structure, the impact of nonlinearity, and the potential
of interpretable, generalizable machine learning methods for scientific and engineering
applications.
For generalization in neural networks, the thesis focuses on the phenomenon of grokking,
a delayed generalization effect where models initially overfit but eventually learn to generalize
well after extended training. Through a series of interconnected studies, this work
proposes insights and practical tools to diagnose, forecast, and enhance generalization
in modern machine learning systems. The first part of the thesis examines grokking in
modular arithmetic tasks, revealing how dropout-induced variance, embedding similarity,
activation sparsity, and weight entropy evolve across training, and hence introduces
diagnostic metrics to capture phase transitions between memorization and generalization.
Further analysis shows that nonlinearity, network depth, and symmetry in data collectively
modulate grokking behavior, linking model architecture to its capacity for structured generalization.
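Two of the diagnostics named above, activation sparsity and weight entropy, can be illustrated with a minimal sketch. The definitions below are illustrative assumptions (a histogram-based Shannon entropy and a near-zero activation fraction), not the thesis's exact formulations:

```python
import numpy as np

def weight_entropy(weights, bins=50):
    """Shannon entropy of the weight-value histogram (one plausible
    definition; the thesis's exact metric may differ)."""
    hist, _ = np.histogram(weights.ravel(), bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins so log is defined
    return -np.sum(p * np.log(p))

def activation_sparsity(acts, eps=1e-6):
    """Fraction of near-zero activations, e.g. after a ReLU."""
    return np.mean(np.abs(acts) < eps)

# Synthetic weights and ReLU-like activations stand in for a real network.
rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256))
a = np.maximum(rng.normal(size=(1024, 256)), 0.0)
print(f"weight entropy:      {weight_entropy(w):.3f}")
print(f"activation sparsity: {activation_sparsity(a):.3f}")
```

Tracking such scalars across epochs is one way a phase transition from memorization to generalization could be made visible as an abrupt change in the curves.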
Next, the thesis introduces a Branched Variational Autoencoder (BVAE), a hybrid
architecture that integrates generative and discriminative objectives. By shaping latent
representations through a supervised branch, the BVAE achieves improved class separability
and interpretability on benchmark datasets, illustrating the potential of structured
latent shaping for semi-supervised learning.
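The branched design can be sketched as a single forward pass in NumPy: a shared encoder produces a reparameterized latent, which feeds both a generative decoder branch and a supervised classification branch. All dimensions, weights, and loss weightings here are toy assumptions for illustration, not the BVAE's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(1)

def linear(x, w, b):
    return x @ w + b

# Toy dimensions: 8-d input, 3-d latent, 4 classes, 16 samples.
d_in, d_z, n_cls, n = 8, 3, 4, 16
x = rng.normal(size=(n, d_in))
y = rng.integers(0, n_cls, size=n)

# Shared encoder -> (mu, logvar); reparameterized latent z.
W_mu, b_mu = rng.normal(size=(d_in, d_z)), np.zeros(d_z)
W_lv, b_lv = rng.normal(size=(d_in, d_z)), np.zeros(d_z)
mu, logvar = linear(x, W_mu, b_mu), linear(x, W_lv, b_lv)
z = mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

# Branch 1 (generative): decoder reconstructs the input.
W_dec, b_dec = rng.normal(size=(d_z, d_in)), np.zeros(d_in)
recon = linear(z, W_dec, b_dec)
recon_loss = np.mean((recon - x) ** 2)

# Branch 2 (discriminative): classify from the same latent,
# which is what shapes z toward class separability.
W_cls, b_cls = rng.normal(size=(d_z, n_cls)), np.zeros(n_cls)
logits = linear(z, W_cls, b_cls)
logits -= logits.max(axis=1, keepdims=True)  # softmax stability
logp = logits - np.log(np.sum(np.exp(logits), axis=1, keepdims=True))
cls_loss = -np.mean(logp[np.arange(n), y])

# KL term for a unit-Gaussian prior on z.
kl = -0.5 * np.mean(1 + logvar - mu**2 - np.exp(logvar))

# Joint objective; the unit branch weights are placeholders.
total = recon_loss + 1.0 * kl + 1.0 * cls_loss
print(f"total loss: {total:.3f}")
```

Training would backpropagate this joint objective, so the supervised branch regularizes the latent space that the generative branch decodes from.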
Finally, the research extends to scientific machine learning, demonstrating how neural
networks and ensemble models such as Random Forests can accelerate the modeling and inverse design
of Carbon Nanotube Tunnel Field-Effect Transistors (CNT TFETs). By coupling physical
insights with machine learning interpretability techniques, this work bridges the gap
between theoretical ML and real-world scientific applications.
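The forward-modeling step might look like the sketch below: a Random Forest surrogate fit on simulated device data, with feature importances as a first interpretability signal. The parameter names and the target function are hypothetical stand-ins, not actual CNT TFET physics or the thesis's dataset:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)

# Hypothetical device parameters: nanotube diameter (nm) and gate
# voltage (V); the target is a toy proxy for a simulated on-current.
n = 500
diameter = rng.uniform(0.8, 2.0, n)
v_gate = rng.uniform(0.0, 1.0, n)
X = np.column_stack([diameter, v_gate])
y = np.exp(-1.0 / diameter) * v_gate**2 + 0.01 * rng.normal(size=n)

# Fit the surrogate; a trained forest can then replace costly
# device simulation inside an inverse-design search loop.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(f"training R^2: {model.score(X, y):.3f}")

# Feature importances give a first interpretability signal.
for name, imp in zip(["diameter", "v_gate"], model.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

Inverse design would then invert the surrogate, for example by searching parameter space for inputs whose predicted current matches a target specification.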