UWSpace

UWSpace is the University of Waterloo’s institutional repository: a free, secure, and long-term home for research produced by faculty, students, and staff.

Depositing Theses/Dissertations or Research to UWSpace

Are you a Graduate Student depositing your thesis to UWSpace? See our Thesis Deposit Help and UWSpace Thesis FAQ pages to learn more.

Are you a Faculty or Staff member depositing research to UWSpace? See our Waterloo Research Deposit Help and Self-Archiving pages to learn more.

Recent Submissions

  • Rushed by Discomfort, Trapped by Immersion: Users’ Experiences and Responses to Privacy Deceptive Design in Commercial VR Applications
    (Association for Computing Machinery, 2026-06-13) Hadan, Hilda; Valiquette, Michaela; Nacke, Lennart; Zhang-Kennedy, Leah
    Commercial Virtual Reality (VR) transforms people’s virtual experiences but introduces deceptive design opportunities that threaten user privacy. Although privacy deceptive patterns on 2D platforms are well-documented, their impacts in VR remain understudied. We surveyed 481 users about their experiences with and responses to privacy deceptive patterns across eight commercial VR scenarios. We found that VR deceptive design can exploit both cognitive vulnerabilities and bodily strain, a phenomenon we define as Ergonomic Susceptibility, and that VR’s sensory-rich experiences can make users more likely to accept invasive data disclosure framed as immersion-preserving. Users recognized manipulation, but their prior non-VR exposure can foster privacy resignation. Our study shows that ergonomics is a critical factor in future privacy-preserving VR design, and urges VR researchers, designers, and policymakers to develop ethical design and privacy management solutions that account for VR’s unique multimodal, immersive, and ergonomic properties, building immersive experiences that respect user privacy and mitigate manipulative data practices.
  • Design, Dynamics, and Control of Upper-Limb Exoskeleton Robots
    (University of Waterloo, 2026-05-13) Wang, Yuntian
    Modern technology has enabled great improvements in the design and control of exoskeletons, which can assist users in various applications, including rehabilitation, muscle fatigue reduction, and power augmentation. However, existing power augmentation exoskeletons still face challenges in user comfort and transparency to the user. To improve power augmentation, an active-passive shoulder exoskeleton was designed in a previous study; it combines the benefits of active and passive actuators and was controlled by an electromyography-based (EMG) method. However, EMG-based control is sensitive to probe placement and unsuitable for factory use, while force/torque sensors add cost and depend on reliable contact. Therefore, we pursue model-based controllers for this active-passive platform, without EMG or force/torque sensors. We first built a high-fidelity skeletal shoulder model in MapleSim to guide our exoskeleton mechanical and controller designs. It was combined with the exoskeleton model to evaluate the proposed methods. To reduce unnecessary fatigue induced by human-exoskeleton misalignment, it is important to understand the moving joint center of the human shoulder complex. The scapular kinematics is especially complex, so we proposed a simplified scapulothoracic model and validated it using bone-pin measurement data. To reduce human effort, a low impedance is required, but the long support chains in shoulder exoskeletons inherently make them prone to vibration. Hence, we proposed a model-based vibration attenuation (VA) method for the exoskeleton in question. Static and dynamic human efforts were separately compensated, and the vibration attenuator was derived from identified structural elasticity. Furthermore, variable impedance can improve user comfort, but existing variable impedance profiles require expert tuning; thus, a new variable impedance law (Var-V) was proposed based on human biomechanics, which requires minimal tuning.
To evaluate the proposed VA method and variable impedance law, we developed: i) a high-fidelity human-exoskeleton model in MapleSim; ii) a new 1-degree-of-freedom (DOF) human-exoskeleton adaptation model in MATLAB (CNS-MTG); and iii) human-in-the-loop (HITL) experiments based on surface electromyography (sEMG). The MapleSim model assumes a perfect human adaptation that is not gradual, but it is more realistic than the 1-DOF adaptation model. The CNS-MTG adaptation model combines human motor learning with muscle torque generator models, so that it has the advantages of both. Two sets of HITL experiments were conducted: one for the VA method with a single participant, and the other for both the VA method and variable impedance laws with ten participants.
  • Adaptive Differential Privacy Budgeting Strategy for Optimizing Synthetic Data Generation and Privacy–Utility Trade-offs
    (University of Waterloo, 2026-05-13) Padalko, Kateryna
    Training generative models under differential privacy (DP) requires injecting calibrated noise into gradient updates, creating an inherent trade-off between privacy protection and data quality. In standard DP-CTGAN, a single discriminator processes all features under a shared privacy budget, so noise injected to protect sensitive demographic attributes equally degrades the learning signal for non-sensitive features; this is an architectural limitation, not a mathematical one. We propose the Dual-Path DP-CTGAN, a discriminator architecture that partitions features into sensitive and non-sensitive paths, each governed by its own DP-SGD mechanism and Rényi DP accountant. Gradient isolation confines privacy noise to its respective path, preserving the learning signal for non-sensitive features without relaxing the formal (ε, δ)-DP guarantee. By the post-processing theorem, the generator inherits the privacy guarantees of both paths without additional composition. We embed this architecture in a Bayesian multi-objective hyperparameter optimisation pipeline that jointly evaluates utility, distributional fidelity, and empirical privacy risk, using Pareto-dominance selection to surface non-dominated configurations. Experiments on the Adult Census Income benchmark demonstrate that Dual-Path at ε = 1 achieves distributional fidelity below the non-private baseline and reduces the downstream utility gap by 79% relative to single-path DP-CTGAN at the same budget, exceeding single-path performance at ε = 5 while maintaining comparable empirical privacy risk. Per-feature analysis confirms that the fidelity gain concentrates in the feature group freed from cross-path noise contamination, providing direct evidence for the gradient isolation mechanism. These results suggest that discriminator architecture, rather than the noise mechanism itself, is the primary bottleneck limiting utility in standard DP-GAN designs.
  • Mitigating Risks to Dependability from Vibe-Coding C for Embedded Systems
    (University of Waterloo, 2026-05-13) Dunne, Murray
    Vibe coding is the process of using a Large Language Model (LLM) to iteratively generate software code. It is popular, with 36% of workers at technology companies reporting adoption of generative artificial intelligence for software engineering in 2024 [1]. At this rate of use, LLM-generated code is quickly becoming part of the embedded systems that comprise our everyday cyber-physical infrastructure. Most of this infrastructure is built on C language code [2]. LLM-generated C code poses threats to dependability, exhibiting faults such as buffer overflows, out-of-bounds writes, integer overflows, and more. In this work, we contribute methods for improving the dependability of these systems in three key parts: providing a real-world benchmark dataset for evaluating LLM-generated C code, protecting LLM code generation from poisoning attacks, and detecting changes in production embedded systems through power side-channel analysis. This work begins with an examination and categorization of weaknesses in LLM-generated C code for embedded systems networking. Our findings suggest that LLMs perform poorly at programming tasks involving direct interactions with memory. Scores on existing LLM-generated C benchmarks do not adequately express this difficulty, as these benchmarks do not include sufficiently real-world C programming challenges. To support future testing of LLMs, we introduce EmbedEvalC, a dataset of C coding challenges to provide a benchmark against which LLMs can be evaluated on real-world tasks. Retrieval Augmented Code Generation (RACG) is an essential tool for vibe coding, but presents new threats to dependability from poisoning attacks. If an attacker can cause a RACG system to retrieve their crafted documents, they can induce the LLM to generate code with weaknesses.
To detect this attack, we introduce canary functions, a process by which specific functions in the codebase are regenerated and re-tested to determine whether the addition of new documents induces new weaknesses. Finally, we consider the black-box setting where a systems integrator seeks to detect unexpected changes in embedded firmware. Such changes will only become more common with the proliferation of vibe coding. We suggest using power side-channel analysis to provide a feedback mechanism to a fuzzer in order to determine if a fuzzing input has caused a new response from the system. We show that responses involving five or more memory-interacting instructions are consistently detectable. In this work, we suggest a collection of techniques to mitigate risks to the dependability of embedded systems posed by LLM-generated C code. Abstract Citations: [1] A. Singla, A. Sukharevsky, L. Yee, M. Chui, and B. Hall, “The state of AI: How organizations are rewiring to capture value,” McKinsey & Company, March 2025. [2] P. Soulier, D. Li, and J. R. Williams, “A Survey of Language-Based Approaches to Cyber-Physical and Embedded System Development,” Tsinghua Science and Technology, vol. 20, no. 2, pp. 130–141, 2015.
  • Efficiently Training Deep Learning Models on Elastic and Heterogeneous Cloud Resources
    (University of Waterloo, 2026-05-12) Guo, Runsheng
    Deep Neural Networks (DNNs) have demonstrated remarkable success across diverse domains, but their training requires substantial computational resources and is typically parallelized across large GPU clusters. However, such clusters are prohibitively expensive for most organizations to own and manage. Hence, organizations often rent clusters on cloud platforms to meet their training needs. While cloud environments offer elastic scalability and heterogeneous hardware options, they also introduce significant challenges for efficient distributed DNN training. Specifically, existing training frameworks lack support for dynamic reconfiguration during training, limiting the exploitation of cloud elasticity. Additionally, most systems assume homogeneous clusters, which rarely reflect the heterogeneous GPU clusters that organizations commonly use due to hardware availability constraints. Furthermore, heterogeneous network conditions in cloud environments create communication bottlenecks that limit the scalability of existing approaches. This thesis presents three systems that collectively address these limitations to enable efficient distributed DNN training on elastic and heterogeneous cloud resources. First, Hydrozoa leverages cloud elasticity through serverless containers, enabling dynamic scaling and configuration changes during training without the traditional pitfalls of serverless computing. By combining data and model parallelism with fine-grained resource provisioning, Hydrozoa achieves cost-effective training while eliminating cluster management overhead. Second, Cephalo addresses heterogeneous GPU clusters by independently balancing compute and memory resources across GPUs with different capabilities.
Unlike existing approaches that tie workload assignment to computational speed, Cephalo separately optimizes compute distribution through proportional batch sizing and memory utilization through intelligent partitioning of training state, activation checkpointing, and gradient accumulation strategies. Third, Zorse tackles heterogeneous network conditions, which are particularly common in heterogeneous clusters, by efficiently combining memory-efficient data parallelism with pipeline parallelism. Through interleaved pipelining, parameter and activation offloading, and heterogeneous pipeline parallelism configurations, Zorse achieves both communication and memory efficiency for training large DNN models across diverse network topologies. The experimental evaluation demonstrates that these systems significantly improve training efficiency and resource utilization compared to existing approaches. Hydrozoa reduces training costs while providing seamless scalability, Cephalo simultaneously achieves high compute and memory utilization in heterogeneous clusters, and Zorse maintains high throughput under varying network conditions. Together, these contributions make distributed DNN training more accessible, cost-effective, and efficient in modern cloud environments, advancing the state of the art in large-scale machine learning infrastructure.