Analysis of Neural Networks with Physics Applications
| dc.contributor.author | Mohamed, Ahmed | |
| dc.date.accessioned | 2026-03-30T19:49:10Z | |
| dc.date.available | 2026-03-30T19:49:10Z | |
| dc.date.issued | 2026-03-30 | |
| dc.date.submitted | 2026-03-27 | |
| dc.description.abstract | This thesis investigates core aspects of machine learning, spanning foundational studies of generalization phenomena in neural networks, novel architectural strategies for enhancing representation learning and classification performance, and high-accuracy predictive and inverse modeling of emerging nanoelectronic devices. Together, these studies highlight the significance of data and model structure, the impact of nonlinearity, and the potential of interpretable, generalizable machine learning methods for scientific and engineering applications. For generalization in neural networks, the thesis focuses on the phenomenon of grokking, a delayed-generalization effect in which models initially overfit but eventually learn to generalize well after extended training. Through a series of interconnected studies, this work proposes insights and practical tools to diagnose, forecast, and enhance generalization in modern machine learning systems. The first part of the thesis examines grokking in modular arithmetic tasks, revealing how dropout-induced variance, embedding similarity, activation sparsity, and weight entropy evolve across training, and introduces diagnostic metrics that capture phase transitions between memorization and generalization. Further analysis shows that nonlinearity, network depth, and symmetry in the data collectively modulate grokking behavior, linking model architecture to its capacity for structured generalization. Next, the thesis introduces a Branched Variational Autoencoder (BVAE), a hybrid architecture that integrates generative and discriminative objectives. By shaping latent representations through a supervised branch, the BVAE achieves improved class separability and interpretability on benchmark datasets, illustrating the potential of structured latent shaping for semi-supervised learning. Finally, the research extends to scientific machine learning, demonstrating how neural networks and ensemble models such as Random Forests can accelerate the modeling and inverse design of Carbon Nanotube Tunnel Field-Effect Transistors (CNT TFETs). By coupling physical insights with machine learning interpretability techniques, this work bridges the gap between theoretical ML and real-world scientific applications. | en |
| dc.identifier.uri | https://hdl.handle.net/10012/22983 | |
| dc.language.iso | en | |
| dc.pending | false | |
| dc.publisher | University of Waterloo | en |
| dc.title | Analysis of Neural Networks with Physics Applications | |
| dc.type | Doctoral Thesis | |
| uws-etd.degree | Doctor of Philosophy | |
| uws-etd.degree.department | Physics and Astronomy | |
| uws-etd.degree.discipline | Physics | |
| uws-etd.degree.grantor | University of Waterloo | en |
| uws-etd.embargo.terms | 0 | |
| uws.contributor.advisor | Yevick, David | |
| uws.contributor.affiliation1 | Faculty of Science | |
| uws.peerReviewStatus | Unreviewed | en |
| uws.published.city | Waterloo | en |
| uws.published.country | Canada | en |
| uws.published.province | Ontario | en |
| uws.scholarLevel | Graduate | en |
| uws.typeOfResource | Text | en |