UWSpace

UWSpace is the University of Waterloo’s institutional repository, providing a free, secure, long-term home for research produced by faculty, students, and staff.

Depositing Theses/Dissertations or Research to UWSpace

Are you a Graduate Student depositing your thesis to UWSpace? See our Thesis Deposit Help and UWSpace Thesis FAQ pages to learn more.

Are you a Faculty or Staff member depositing research to UWSpace? See our Waterloo Research Deposit Help and Self-Archiving pages to learn more.


Recent Submissions

  • Women’s Health and Wellbeing post COVID: A Case Study From Sub-Saharan Africa
    (University of Waterloo, 2026-01-20) Walwanga, Isaiah
The COVID-19 pandemic swept across the globe, causing hundreds of thousands of deaths, shutting down economies, closing borders, and wreaking havoc on an unparalleled scale. Countries around the world responded by enforcing nonpharmaceutical interventions in an attempt to flatten the curve and control transmission, morbidity, and mortality, as well as to ease pressure on healthcare systems. These interventions, though effective in flattening the transmission curve and easing pressure on healthcare systems, came with a heavy social and economic toll globally. Women and girls suffered greater impacts than men and boys: loss of employment, economic distress, school dropout, intimate partner violence, and domestic violence were all more prevalent among women and girls. Low- and middle-income countries, lacking structural resilience, are unable to recover quickly from such negative shocks, and little is yet known about how vulnerable populations such as women are recovering post-pandemic in terms of health and wellbeing. Drawing on Sen’s capability approach, this thesis aimed to evaluate the differences in health and wellbeing among women in Kenya and Uganda post-pandemic compared to during the pandemic, and the factors determining them. Health and wellbeing were operationalized using a wellbeing scale measuring standard of living, the General Health Questionnaire-12 measuring probable emotional distress, and perceived state of health relative to others of one's own age. 1:1 optimal pair propensity score matching was used to identify socio-demographically comparable participants from two cross-sectional surveys, conducted in 2021 and 2023 in Kisumu, Western Kenya, and in Mukono district, Central Uganda, producing two samples of 405 women in Kenya and 186 women in Uganda at each of the two timepoints.
McNemar's test was then used to compare health and wellbeing between the two timepoints, while generalized estimating equations regression with an exchangeable correlation structure was used to explore the factors associated with differences in health and wellbeing outcomes. The results show that probable emotional distress levels increased from 21.5% to 52.6% in Kenya but decreased from 88.2% to 61.8% in Uganda, and the proportion that reported poor/fair relative health increased from 25.7% to 35.6% in Kenya and from 25.7% to 35.6% in Uganda. Moreover, the proportion of women who perceived the quality of healthcare services in their community as poor/fair increased from 33.3% to 40.2% in Kenya and from 28.5% to 62.4% in Uganda. The accessibility of healthcare services was also increasingly perceived as poor/fair in Kenya (23.2% to 29.4%) and in Uganda (29.0% to 67.7%). This research also found no significant differences in the proportion of women with health insurance post-pandemic relative to during the pandemic in either country: Kenya (23.2% to 27.4%; p=0.196) and Uganda (2.7% to 3.8%; p=0.771). Wellbeing improved in Kenya but worsened in Uganda between the two timepoints, a worrying trend considering that the Ugandan women were much older, heightening their vulnerability. Both during and after the pandemic, women who were older, lived in rental housing, experienced high levels of water, sanitation, and hygiene (WASH) insecurity, or lacked health insurance were more likely to report poor health and wellbeing outcomes. Women in these categories were also more likely to report worsening health and wellbeing outcomes post-pandemic relative to during the pandemic. Thus, recovering better from COVID-19 should involve ambitious plans that rebuild health, social, and economic systems with a stronger focus on marginalized populations such as women and older persons.
This research proposes that propensity score matching can be used to compare outcomes across samples from two repeated cross-sectional studies, helping to eliminate or reduce bias arising from differences in sample selection. To inform policy, it recommends that interventions focus on improving economic conditions, healthcare infrastructure, and water, sanitation, and hygiene access, prioritizing structurally vulnerable populations such as elderly women.
  • Ethotic Heuristics in Artificial Intelligence: A Rhetorical Framework for Guiding Responsible Data Design Praxis in Healthcare and Surveillance
    (University of Waterloo, 2026-01-20) Lubin, Kem-Laurin
This dissertation investigates the convergence of artificial intelligence (AI), human-centered design, and rhetoric across three interconnected essays, centering on design heuristics as its primary analytic and unifying framework and drawing from traditions such as Data Feminism and rhetorical inquiry. It explores three interrelated domains: (1) AI-driven human-computer interaction (HCI) design; (2) the implications of AI-powered design for women’s health privacy, particularly in the post-Roe v. Wade U.S. context; and (3) critical discourse surrounding AI in surveillance technologies. Using a multi-method approach, including rhetorical analysis, Critical Discourse Analysis (CDA), case studies, and stakeholder perspectives, this research interrogates how AI systems construct algorithmic ethopoeic representations that commodify user data. The first essay introduces a set of practical heuristics for HCI designers by integrating principles from Design Thinking, thereby fostering ethical dialogue and strengthening human-centered approaches in the context of rapid AI development. The second essay employs rhetorical analysis to examine the construction of “algorithmic ethopoeia” (the process through which AI systems perform moral characterization through data practices, design choices, and institutional logics) within sensitive socio-technical domains. Algorithmic ethopoeia is a concept central to this dissertation and is defined in more detail on page 3 of the essay. By foregrounding this concept, the essay emphasizes the urgent need for robust protections surrounding personal data integrity and highlights how algorithmic systems actively participate in shaping judgments about identity, risk, and responsibility. This section, grounded in Data Feminism, empowers designers, activists, and policymakers to advocate for more secure and transparent AI applications, particularly in the domain of women’s health privacy.
The final essay employs CDA to critique the discourse surrounding AI-driven surveillance, focusing on predictive policing and facial recognition technologies. Through the analysis of competing narratives and stakeholder perspectives, it reveals ethical dilemmas related to systemic biases and authoritarian practices, arguing for rigorous oversight and regulatory frameworks. Surveillance contextual heuristics are proposed to guide the responsible deployment of AI in public safety while safeguarding civil liberties. Collectively, these investigations underscore the imperative for ethical, context-sensitive, and rigorously informed design heuristics to guide the responsible integration of AI across diverse domains. They advance the discourse on user privacy, regulatory compliance, and human-centered innovation, while simultaneously promoting the development of design practices that are both ethically sound and equitable.
  • Integrating Symbolic Reasoning into Large Language Models
    (University of Waterloo, 2026-01-20) Dhanraj, Varun
Large language models (LLMs) face fundamental challenges in symbolic reasoning, struggling with tasks requiring precise rule-following, logical consistency, and manipulation of structured representations. This thesis introduces a comprehensive neurosymbolic framework that addresses these limitations by integrating Vector Symbolic Algebras (VSAs) directly into the computational flow of transformer-based language models. Our core method encodes LLM hidden states into compositional neurosymbolic vectors, enabling symbolic algorithms to operate within a high-dimensional vector space before decoding results back into the neural network's processing pipeline. We demonstrate that LLMs naturally develop internally separable representations for symbolic concepts, which our linear and transformer-based encoders can extract with high fidelity. On mathematical reasoning tasks, our approach achieves 88.6% lower cross-entropy loss and solves 15.4 times more problems correctly compared to chain-of-thought prompting and LoRA fine-tuning, while preserving performance on non-mathematical tasks through selective intervention. Beyond arithmetic, we extend this framework to three applications. First, we enable language-only models to perform visual question answering by encoding segmented images as queryable VSA representations, achieving 92% accuracy without requiring multimodal architectures. Second, we demonstrate environment navigation where LLMs use spatial semantic pointers to interpret and act upon grid-based worlds according to natural language instructions. Third, we address the context length limitations of LLMs by compressing reasoning histories into VSA representations, maintaining performance on iterative problem-solving tasks while avoiding quadratic scaling costs.
Our results establish VSA-based neurosymbolic integration as a practical approach for augmenting neural language models with symbolic reasoning capabilities, providing both theoretical insights into LLM representations and practical improvements across diverse reasoning tasks. This work contributes to the broader goal of creating AI systems that combine the flexibility of neural networks with the precision and interpretability of symbolic computation. Code and data are available at https://github.com/vdhanraj/Neurosymbolic-LLM.
  • Path Reduction and Coverage Complexity for Fuzzing
    (University of Waterloo, 2026-01-20) Wang, Zekun
Coverage-guided fuzzing is one of the most effective approaches to automated software testing, yet its performance depends critically on the coverage metric that guides input generation. It is widely assumed that finer metrics, especially path coverage, which captures complete control-flow information, should lead to more effective fuzzing. However, practical realizations of path coverage have been limited to restricted forms due to path explosion. In this work, we introduce a path reduction algorithm that bounds loop iterations in execution paths, enabling a practical form of path coverage that preserves essential control-flow information. Despite this advancement, we find that path coverage performs no better than existing metrics such as edge coverage. To understand this phenomenon, we establish the concept of coverage complexity, a quantitative measure of the granularity of coverage metrics. Analogous to Big-O notation in algorithm analysis, coverage complexity classifies metrics into asymptotic complexity classes such as linear, polynomial, and exponential. This framework provides a structured overview of the entire space of coverage metrics and guides the design of new ones. Our complexity analysis and empirical evaluation on the MAGMA benchmark reveal a consistent pattern: metrics within the same complexity class tend to exhibit similar fuzzing performance, and linear-complexity metrics consistently outperform more complex metrics. This suggests a simple but powerful principle: when designing a new coverage metric, the first step is to determine its complexity class, which serves as an early predictor of its potential performance. Since higher-complexity metrics consistently underperform, our results imply that the family of linear metrics may already represent the optimal frontier of coverage-guided fuzzing, offering, for the first time, a structured overview of the landscape of coverage metrics.
  • Efficient Learning for Large Language Models
    (University of Waterloo, 2026-01-20) Rajabzadeh, Hossein
Artificial Intelligence (AI) systems have become indispensable across domains such as healthcare, finance, robotics, and scientific discovery. At the heart of this revolution, Large Language Models (LLMs) have emerged as the central paradigm, demonstrating remarkable reasoning, generalization, and multi-domain adaptability. However, their exponential growth in scale introduces severe computational bottlenecks in training, fine-tuning, and inference, limiting accessibility, sustainability, and real-world deployment. This dissertation advances the efficiency of LLMs across all lifecycle stages by introducing a suite of five frameworks that significantly reduce compute, memory, and latency costs with minimal or no loss in accuracy. First, Quantized Dynamic Low-Rank Adaptation (QDyLoRA) enables memory-efficient fine-tuning across multiple LoRA ranks in a single training pass, achieving performance competitive with QLoRA while reducing GPU memory usage by up to 65% and supporting flexible rank selection at inference time. Second, Sorted-LoRA introduces a stochastic depth–aware fine-tuning framework that co-trains multiple sub-models of varying depths within a single cycle. On LLaMA2–7B, it produces sub-models up to 40% smaller that retain over 98% task accuracy, with the largest variant even surpassing the base model by +0.34%. Third, LoRA-Drop accelerates autoregressive inference by dynamically substituting computationally redundant layers with lightweight low-rank modules during decoding. It delivers up to 2.6× faster decoding and a 50% reduction in KV-cache memory with less than 0.5% degradation in accuracy, offering latency-aware adaptability for real-world deployment. Fourth, EchoAtt exploits redundancy in attention maps by sharing attention matrices among similar layers.
On TinyLLaMA–1.1B, it achieves 15% faster inference, 25% faster training, and a 4% parameter reduction while improving zero-shot accuracy, highlighting that structural compression can enhance rather than degrade model generalization. Finally, ECHO-LLaMA introduces cross-layer Key–Value (KV) and Query–Key (QK) sharing to reduce redundant attention computation. This approach achieves up to 77% higher token-per-second throughput during training, 16% higher Model FLOPs Utilization (MFU), and 7% higher test-time throughput, while preserving language modeling performance. On the mechanical-domain RoboEval benchmark, ECHO-CodeLLaMA-7B boosts average accuracy from 62.15% to 63.01% with only 50% KV sharing, confirming its robustness in domain adaptation. Together, these contributions form a coherent research program on the efficiency of large-scale Transformers. They demonstrate that intelligently exploiting representational redundancy—through quantization, low-rank structure, cross-layer sharing, and adaptive computation—can yield substantial compute savings with minimal trade-offs.
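The first abstract above pairs propensity score matching with McNemar's test for paired binary outcomes. As a minimal pure-Python sketch of those two steps: the thesis uses 1:1 optimal pair matching and GEE regression, so the greedy nearest-neighbour matcher and hand-computed McNemar chi-square statistic below are simplified, hypothetical stand-ins, and all function names and data are illustrative.

```python
def greedy_match(treated, control):
    """Pair each treated unit with the nearest unmatched control by
    propensity score. Greedy 1:1 matching; the thesis's optimal pair
    matching instead minimizes the total within-pair distance."""
    available = dict(enumerate(control))  # index -> score, still unmatched
    pairs = []
    for ti, t in enumerate(treated):
        if not available:
            break
        ci = min(available, key=lambda i: abs(available[i] - t))
        pairs.append((ti, ci))
        del available[ci]  # each control is used at most once
    return pairs

def mcnemar_statistic(before, after):
    """McNemar chi-square for paired binary outcomes:
    (b - c)^2 / (b + c), where b and c count the discordant pairs."""
    b = sum(1 for x, y in zip(before, after) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(before, after) if x == 0 and y == 1)
    return (b - c) ** 2 / (b + c)

# Toy example: a distress indicator for 8 matched women at two timepoints.
before = [1, 0, 0, 1, 0, 0, 1, 0]
after  = [1, 1, 1, 1, 0, 1, 0, 0]
print(mcnemar_statistic(before, after))  # b=1, c=3 -> (1-3)^2/4 = 1.0
```

In practice one would compute propensity scores with a logistic regression on the socio-demographic covariates and compare the statistic against a chi-square distribution with one degree of freedom.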
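The neurosymbolic abstract above builds on Vector Symbolic Algebras. A minimal sketch of one common VSA, Holographic Reduced Representations, where binding is circular convolution and unbinding uses the approximate (involution) inverse; the O(n^2) convolution and the dimensionality here are illustrative, not the thesis's implementation.

```python
import random
from math import sqrt

def rand_vec(n, rng):
    # Random vector with components ~ N(0, 1/n), the standard HRR choice.
    return [rng.gauss(0.0, 1.0 / sqrt(n)) for _ in range(n)]

def bind(a, b):
    """Circular convolution: the HRR binding operator."""
    n = len(a)
    return [sum(a[k] * b[(i - k) % n] for k in range(n)) for i in range(n)]

def unbind(trace, role):
    """Approximate unbinding: convolve with the involution of the role,
    i.e. role reversed except for its first component."""
    inv = [role[0]] + role[1:][::-1]
    return bind(trace, inv)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

rng = random.Random(0)
n = 256
role, filler = rand_vec(n, rng), rand_vec(n, rng)
trace = bind(role, filler)
# Unbinding recovers a noisy copy of the filler, recognisable by similarity.
print(cosine(unbind(trace, role), filler) > 0.5)
```

The recovered vector is only approximately the filler (expected cosine similarity near 0.7), which is why VSA pipelines typically finish with a clean-up step that snaps the noisy result to the nearest known symbol.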
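The fuzzing abstract above introduces a path reduction algorithm that bounds loop iterations. A toy stand-in for that idea, with an illustrative `reduce_path` helper: it only caps consecutive repeats of a single edge, whereas the thesis's algorithm handles general loop structure.

```python
from itertools import groupby

def reduce_path(path, k):
    """Collapse runs of the same edge to at most k occurrences, so that
    paths differing only in how many extra times a loop body repeats map
    to the same reduced path, keeping the set of distinct paths finite."""
    out = []
    for edge, run in groupby(path):
        out.extend([edge] * min(k, sum(1 for _ in run)))
    return out

# Paths that differ only in loop-iteration count collapse together:
print(reduce_path(list("abbbbbc"), 2))                          # ['a', 'b', 'b', 'c']
print(reduce_path(list("abbc"), 2) == reduce_path(list("abbbbc"), 2))  # True
```

A coverage-guided fuzzer would then hash the reduced path into its coverage map instead of the raw path, trading loop-count precision for a bounded feedback space.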
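Several frameworks in the last abstract rest on LoRA-style low-rank adaptation. A back-of-the-envelope sketch of why that saves memory: a rank-r adapter trains two small factors instead of the full weight matrix. The dimensions below are illustrative (a 4096-wide layer, as in 7B-class models), not figures from the thesis.

```python
def lora_params(d_in, d_out, r):
    """Trainable parameters: full fine-tuning updates the whole d_out x d_in
    weight matrix W, while a LoRA adapter trains only the low-rank factors
    A (r x d_in) and B (d_out x r), whose product B @ A updates W."""
    full = d_in * d_out
    lora = r * (d_in + d_out)
    return full, lora

full, lora = lora_params(4096, 4096, 16)
print(full, lora, full // lora)  # 16777216 131072 128
```

At rank 16 the adapter holds 128x fewer trainable parameters than the full matrix, which is the headroom that QDyLoRA-style methods exploit when fitting multiple ranks into one training pass.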