Electrical and Computer Engineering

Permanent URI for this collection: https://uwspace.uwaterloo.ca/handle/10012/9908

This is the collection for the University of Waterloo's Department of Electrical and Computer Engineering.

Research outputs are organized by type (e.g., Master's Thesis, Article, Conference Paper).

Waterloo faculty, students, and staff can contact us or visit the UWSpace guide to learn more about depositing their research.


Recent Submissions

Now showing 1 - 20 of 2057
  • Item
    Low-power and Radiation Hardened TSPC Registers
    (University of Waterloo, 2025-03-19) Maheshwari, Yugal; Sachdev, Manoj
    Battery-operated systems require power- and energy-efficient circuits to extend their battery life. Flip-flops (FFs) are a basic component of digital circuits, and their power consumption and speed significantly impact the overall performance of a digital system. The clock network in a complex System-on-Chip (SoC) consumes a substantial amount of power, and pipelines, often used to enhance system throughput, place an additional burden on it. Arguably, a flip-flop with fewer clock transistors reduces its power burden on the clock network. This research proposes three very low-power Single-edge Triggered (SET) True Single-phase Clock (TSPC) FFs with only two or three clock transistors. Moreover, a scan chain of 256 FFs and an AES-128 encryption engine were designed as benchmarks to further investigate the power savings of the proposed FFs. Additionally, we have also designed three very low-power Dual-edge Triggered (DET) latch-multiplexer-type TSPC FFs with only eight or ten clock transistors, which sample data on both positive and negative clock edges. Furthermore, high-performance computation in Integrated Circuits (ICs) is increasingly needed for space and safety-critical applications. In the radiation-rich space environment, ICs are subjected to high-energy ionizing particles, which can degrade device performance or even cause failure. A Single Event Upset (SEU) occurs in a logic circuit when an ion strikes a device's sensitive node, flipping the output from 0 to 1 or from 1 to 0. ICs in such applications contain storage cells such as FFs, latches, and Static Random Access Memories (SRAMs), all of which are susceptible to SEUs. Although package and process engineering can minimize alpha particles, cosmic neutrons cannot be physically blocked. Therefore, soft-error-tolerant circuit designs are crucial for high-reliability systems.
Traditional Radiation Hardened By Design (RHBD) techniques have some trade-offs between area, speed, power, and energy consumption. Thus, new designs are required to reduce these penalties. This research proposes two high-performance, low-power, low-energy, and low-area RHBD TSPC FFs with only four and five clock transistors suitable for space and safety-critical applications.
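The clock-power argument above can be made concrete with the standard dynamic-power relation P = C·V²·f: each clock transistor presents gate capacitance the clock network must switch every cycle, so fewer clock transistors per FF shrink the clock load proportionally. The sketch below is a back-of-envelope illustration with hypothetical numbers, not figures from the thesis.

```python
# Illustrative (not from the thesis): dynamic power switched by the clock
# network scales with the total gate capacitance the clock must drive,
# P = C_total * Vdd^2 * f. All parameter values below are hypothetical.

def clock_power(n_ffs, clk_transistors_per_ff, c_gate_f, vdd_v, f_hz):
    """Dynamic power (watts) dissipated driving the clock inputs of n_ffs
    flip-flops, each presenting clk_transistors_per_ff gate loads."""
    c_total = n_ffs * clk_transistors_per_ff * c_gate_f
    return c_total * vdd_v**2 * f_hz

# A conventional TSPC FF might expose ~4 clock transistors; the proposed
# SET designs expose only 2-3, halving the clock load in the best case.
p_baseline = clock_power(256, 4, 0.1e-15, 1.0, 1e9)
p_proposed = clock_power(256, 2, 0.1e-15, 1.0, 1e9)
assert p_proposed == p_baseline / 2
```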
  • Item
    Investigating Dual Embodiment in Recurring Tasks with a New Social Robot: Designing the Mirrly Platform
    (University of Waterloo, 2025-03-19) Yamini, Ali; Dautenhahn, Kerstin
    In many contexts, including education, therapy, and everyday tasks, assistive robots have demonstrated considerable promise for augmenting human capabilities and providing supportive interactions. This thesis contributes to this field by designing and building a new tabletop social robot, Mirrly, and by empirically examining how different robotic embodiments affect user engagement and task compliance. In light of advances in human-robot interaction (HRI) and child-robot interaction (CRI), I investigated a comprehensive set of mechanical, electronic, and software requirements. From these requirements, Mirrly was developed: a low-cost, compact platform that can be deployed in schools, therapy centers, or private homes, and that is anthropomorphic enough to support social interactions with people. Following the design and implementation of Mirrly, I conducted a multi-session experiment to determine whether physical embodiment, virtual embodiment (mobile-based), or dual embodiment (both physical and virtual) best promoted compliance with repetitive daily tasks, as relevant, e.g., in clinical applications where patients need to comply with repetitive treatments. According to the results, physical presence is a strong motivator, leading to higher compliance and engagement, whereas dual embodiment specifically enhanced participants' enjoyment (pleasure) of the interaction. Interestingly, individual differences in the participant sample, such as personality traits and self-control, did not have a significant impact on adherence or user satisfaction. At least within the short, relatively simple user tasks studied, these results emphasize the importance of design factors, namely physical tangibility and interactive behaviors. As part of the thesis, a review of relevant HRI and CRI literature is conducted to situate Mirrly's design within current robotics.
Following a detailed description of the methodology used, I present the experimental conditions, measures, and analytical methods for assessing compliance, engagement, and perceived enjoyment. Finally, I discuss the implications of the findings for building more adaptive, child-centered robots, especially in clinical, therapeutic, and educational settings. Several future directions are also proposed, including extending task complexity, integrating advanced sensors for personalized feedback, and conducting longitudinal studies. As part of ongoing efforts in social and assistive robotics, this work introduces a novel robotic design. Moreover, my study demonstrates that a robot with careful engineering, physical embodiment, and adaptability can significantly boost compliance. Consequently, this thesis lays a solid foundation for future developments in CRI, highlighting how embodiment, anthropomorphism, and structured experimental design converge to support recurrent task compliance efficiently.
  • Item
    A Comprehensive Process for Addressing Market Power in Decentralized ADN Electricity Markets
    (University of Waterloo, 2025-03-05) AboAhmed, Yara; Salama, Magdy; Bhattacharya, Kankar
    Electric power systems have transformed globally, with distribution grids evolving into active distribution networks (ADNs), altering their characteristics and operations. Traditional centralized market structures have become inadequate for the complexities of the ADNs, leading to inefficiencies and challenges in reliable operation and energy pricing. ADN electricity markets offer a solution by leveraging smart grid features to integrate distributed energy resources (DERs), allowing non-utility entities, such as producers, consumers and prosumers, to participate directly, enhancing market efficiency, reducing monopoly power, and limiting utility control over prices. However, with the increasing penetration of DERs, there is a growing risk of market concentration and manipulation by entities owning large shares of DERs in ADN electricity markets. This poses a potential threat to market fairness, as some participants may exploit market power, leading to an uneven playing field, reducing the integrity and efficiency of ADN electricity markets. From this standpoint, this thesis investigates and adapts the concept of market power within ADN electricity markets, considering the unique characteristics of the market and the system. The investigation is structured around six central questions: (1) Can non-utility entities exercise market power in ADN electricity markets? (2) Is there a comprehensive framework for accurately monitoring, evaluating, and mitigating market power in decentralized ADN markets? (3) If such a framework exists, can it manage the complexity of monitoring the large number of ADN market participants? (4) If market power manipulation exists, are current investigations adequate, considering the decentralized market structure, the physical characteristics of the system, DER operational constraints, and the interplay between active and reactive power markets? 
(5) What types of decentralized market structures and frameworks—such as fully decentralized, community-based, or network-based peer-to-peer (P2P)—are appropriate for addressing market power in ADN electricity markets? (6) Are traditional market power mitigation methods applicable and effective in the context of ADN electricity markets, considering the decentralized nature of the ADN and the dispersed DERs? The primary objective of this thesis is to develop a fair, decentralized energy trading platform that limits monopoly power and mitigates market power abuse in ADN electricity markets. To achieve this goal, the thesis proposes an innovative, comprehensive process for monitoring, evaluating, and mitigating market power, specially designed for the decentralized structure of ADNs and their market frameworks. This process considers shifts in network configuration as well as the physical and operational characteristics of ADNs and their components. The process begins by monitoring the market power of dominant market participants through the introduction of a zoning concept. These operational zones narrow down the number of market participants within each zone, addressing the challenge of monitoring a large number of market participants with widely distributed DERs and improving the identification and control of potential market power exercisers, thus minimizing their potential market power. The operational zones serve as decentralized interfaces between the zonal market participants and their corresponding zonal market operators, establishing a decentralized platform for energy trading. The second stage of the process focuses on evaluating market power by investigating and analyzing the strategic offering behavior of the potential market power exercisers identified in stage one.
This analysis is conducted within the framework of a community-based P2P decentralized ADN electricity market, considering the physical and operational characteristics of both the system and DERs, along with the coupled active and reactive power markets. A comparative evaluation of market outcomes under competitive and strategic conditions is performed to identify strategic manipulators. In this context, the study also examines the applicability and effectiveness of conventional market power mitigation techniques used for the centralized market and assesses their impact on the strategic offering behavior of identified manipulators. While some traditional market power mitigation techniques may demonstrate efficiency, a new approach is necessary to address the unique decentralization characteristic of ADN electricity markets. A novel market power mitigation technique is proposed in the third stage of the process, targeting the root cause of market power: market concentration. This approach introduces an innovative market zoning concept, dynamically partitioning the system into "Market-Zones" to reduce market concentration while adapting to different system operational conditions, considering the uncertainties in system demand and generation, thereby aligning with the decentralized nature of ADNs and their markets. The proposed innovative zoning approach offers a robust solution for mitigating market power in decentralized ADN electricity markets. Within these Market-Zones, each player can actively engage and participate in the market and obtain the benefit without being overtaken by entities with large market shares. Consequently, the market power of the dominant players is subsided and diluted by utilizing the proposed Market-Zones, establishing a fair energy trading platform.
  • Item
    An Investigation Into the Effectiveness of Latent Variable Models for Domain Adaptation
    (University of Waterloo, 2025-03-04) Zeng, Xuanrui; Nielsen, Christopher
    The proliferation of machine learning with neural networks (NNs) has revolutionized fields such as computer vision and natural language processing. However, their successes often overshadow two important weaknesses of neural networks: (i) their reliance on large amounts of training data and (ii) the assumption of independent and identically distributed (i.i.d.) data. Because of these weaknesses, the vast majority of NNs today are application-specific machines tuned to one task and one data domain. This thesis investigates the effectiveness of a latent variable model for unsupervised domain adaptation, aiming to bridge the gap between two different data distributions while leveraging only labeled data samples from one and unlabeled data samples from the other. A novel generative modeling framework is proposed to address this problem, incorporating recent advances in probabilistic modeling and variational inference techniques from the neural network literature. Empirical results of the proposed approach are promising and indicate adequate transfer of the model's labeling knowledge across disparate data domains without requiring manual re-labeling or domain-specific adjustments. Moreover, the proposed approach has also shown potential for solving the related domain translation problem. Despite these promising results, the existing approach has shown limitations in more complex scenarios of unsupervised domain adaptation, specifically those involving more pronounced differences between domains.
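The variational-inference machinery such generative frameworks rely on typically maximizes the evidence lower bound (ELBO); the standard statement (our gloss, not necessarily the thesis's exact objective) is:

```latex
\log p_\theta(x) \;\ge\;
\underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]}_{\text{reconstruction}}
\;-\;
\underbrace{\mathrm{KL}\!\left(q_\phi(z \mid x)\,\big\|\,p(z)\right)}_{\text{regularization}}
```

In unsupervised domain adaptation, a common design is to encourage the latent variable z to be domain-invariant, so that a classifier trained on z from the labeled source domain transfers to the unlabeled target domain.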
  • Item
    A Compressive-Sensing-Capable CMOS Electrochemical Capacitance Image Sensor with Two-Dimensional Code-Division-Multiplexed Readout
    (University of Waterloo, 2025-03-04) McLachlan, Shane; Levine, Peter
    Electrochemical capacitance imaging is a technique used to observe biological analytes or processes at the surface of an electrode, immersed in an electrolyte, via small changes in capacitance. This technique has various applications in biosensing, such as biomedical diagnostics, neural interfaces, and DNA sensors. Complementary metal-oxide-semiconductor (CMOS) technology is well suited for implementing electrochemical capacitance image sensors, since high-spatial-resolution electrode arrays and readout circuitry can be integrated on the same chip. This thesis presents the design and simulation of a 256 × 256 pixel electrochemical capacitance image sensor fabricated in a 180-nm analog/mixed-signal CMOS process. Our image sensor features a novel two-dimensional code-division-multiplexed (2D CDM) readout architecture that directly outputs analog coefficients of the 2D Walsh transform of the image. To the best of our knowledge, we are the first to implement true 2D CDM readout in the capacitive image sensor space. For passive-pixel sensors, CDM readout yields a signal-to-noise ratio (SNR) increase over traditional time-division-multiplexed (TDM) readout by integrating orthogonal combinations of all pixels for the entire frame time. Use of the 2D Walsh transform enables compressive sensing at the time of array readout, achieved by exploiting the energy compaction property of the Walsh domain. Compressive sensing provides analog lossy image compression that can enable a frame rate increase or a power consumption decrease. In addition, our transform-domain readout architecture removes the layout requirement for pitch-matched column amplifiers, requiring only one larger column circuit for the full array. Potential advantages of this include reductions in both amplifier flicker noise and fixed-pattern noise from transistor mismatch.
Our sensor uses two-transistor switched-capacitor pixels with a 3.2 × 3.2 μm² working electrode and 3.88 μm grid pitch to enable charge-based capacitance measurement. On-chip 256-bit parallel Walsh code generators enable power-efficient orthogonal code generation. Full-chip post-layout analog simulation with a biological capacitance image demonstrates that we can achieve a structural similarity index (SSIM) of 0.875 versus a reference image. SSIM values range from 0 to 1, where 1 indicates complete image similarity.
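As a numerical sketch of the transform the readout computes (illustrative Python on a toy image, not the chip's analog signal chain): the 2D Walsh transform of an image X with Hadamard matrix H is H X Hᵀ, and orthogonality (H Hᵀ = nI) makes the transform exactly invertible, which is what lets the sensor output coefficients instead of pixels without losing information.

```python
# Illustrative sketch (not the chip's circuitry): the 2D Walsh/Hadamard
# transform H X H^T that the 2D CDM readout outputs directly in analog
# form. Energy compaction in this domain is what enables compressive
# sensing: most coefficients can be dropped with modest degradation.
import numpy as np

def hadamard(n):
    """Naturally ordered Hadamard matrix of size n (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])   # Sylvester construction
    return H

n = 8
H = hadamard(n)
x = np.outer(np.arange(n), np.ones(n))    # toy "capacitance image"
coeffs = H @ x @ H.T                      # 2D Walsh transform
x_rec = (H.T @ coeffs @ H) / n**2         # inverse, since H H^T = n I
assert np.allclose(x_rec, x)
```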
  • Item
    Wideband Signal Generation at Millimeter-Wave and Sub-THz Frequencies
    (University of Waterloo, 2025-02-20) Su, Zi Jun; Boumaiza, Slim; Mitran, Patrick
    The rise of sixth-generation (6G) wireless technology has created a need for wideband signal generation at high radio frequencies (RF). However, current digital-to-analog converters (DACs) face limitations, offering either wide bandwidth with low resolution or high resolution with limited bandwidth. This thesis proposes two methods that use multiple DACs to generate multiple narrowband sub-bands of a wideband signal, which are then combined to produce the desired wideband signal. These methods employ distinct digital processing approaches tailored to specific applications, such as instrumentation or real-time Orthogonal Frequency Division Multiplexing (OFDM) signal generation. To address non-idealities in frequency-stitching-based transmitters, a frequency-domain calibration technique using multi-tone signals is introduced. Experiments at X-band (9.6 GHz) and D-band (129.6 GHz) validate these methods, demonstrating up to 8 GHz bandwidth and achieving an error vector magnitude (EVM) as low as 0.3% for a 7.2 GHz 256-QAM OFDM signal. A comparative study of three signal generation approaches—direct Arbitrary Waveform Generator (AWG) generation, baseband in-phase and quadrature (IQ) generation with up-conversion, and frequency stitching—shows EVMs of 1.5%, 0.8%, and 1%, respectively, for an 8 GHz OFDM signal. A novel architecture using phase-coherent IQ-DACs and mixers for each sub-band is also presented. Calibration using non-uniformly interleaved tones corrects IQ imbalances and distortions, enabling the generation of a 256-QAM OFDM signal with 12 GHz bandwidth at D-band (149 GHz) and achieving a peak data rate of 96 Gbps. Calibration improves EVM and normalized mean square error (NMSE) from 82.6% and 23.8% to below 2% and 1%, respectively. Additionally, D-band amplifier linearization with a 4 GHz modulation bandwidth improves adjacent channel power ratio (ACPR) from -27.8/-26 dBc to -42.8/-43.1 dBc and EVM from 8.5% to 1.2%.
Finally, two architectures for sub-band combination are compared. One generates a wideband signal at an intermediate frequency (IF) and up-converts it, while the other up-converts narrowband IF signals and then combines them. The second approach demonstrates superior ACPR at high IF power levels, improving ACPR by up to 8 dB when generating a 1.2 GHz modulated signal at 142.5 GHz. These results highlight the efficacy of the proposed methods for generating and linearizing high-quality wideband signals, supporting advanced applications in the millimeter-wave and sub-THz frequency bands for 6G technologies.
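For reference, the EVM figures quoted throughout follow the standard definition: the RMS error-vector power between measured and ideal constellation symbols, normalized to the RMS ideal-symbol power. A minimal sketch with toy QPSK symbols (our own illustration, not the thesis's measurement chain):

```python
# Hedged sketch of error vector magnitude (EVM): RMS error between
# measured and ideal constellation symbols, normalized to the RMS ideal
# symbol magnitude, in percent. Symbols below are toy values.
import numpy as np

def evm_percent(measured, ideal):
    err = measured - ideal
    return 100 * np.sqrt(np.mean(np.abs(err)**2) / np.mean(np.abs(ideal)**2))

ideal = np.array([1+1j, 1-1j, -1+1j, -1-1j])   # toy QPSK constellation
measured = ideal + 0.01 * (1 + 1j)             # small fixed impairment
print(round(evm_percent(measured, ideal), 3))  # → 1.0
```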
  • Item
    Model Predictive Control for Systems with Partially Unknown Dynamics Under Signal Temporal Logic Specifications
    (University of Waterloo, 2025-02-18) Dai, Zhao Feng; Pant, Yash; Smith, Stephen
    Autonomous systems are seeing increased deployment in real-world applications such as self-driving vehicles, package delivery drones, and warehouse robots. In these applications, such systems are often required to perform complex tasks that involve multiple, possibly inter-dependent steps that must be completed in a specific order or at specific times. One way of mathematically representing such tasks is using temporal logics. Specifically, Signal Temporal Logic (STL), which evaluates real-valued, continuous-time signals, has been used to formally specify behavioral requirements for autonomous systems. This thesis proposes a design for a Model Predictive Controller (MPC) for systems to satisfy STL specifications when the system dynamics are partially unknown, and only a nominal model and past runtime data are available. The proposed approach uses Gaussian Process (GP) regression to learn a stochastic, data-driven model of the unknown dynamics, and manages uncertainty in the STL specification resulting from the stochastic model using Probabilistic Signal Temporal Logic (PrSTL). The learned model and PrSTL specification are then used to formulate a chance-constrained MPC. For systems with high control rates, a modification is discussed for improving the solution speed of the control optimization. In simulation case studies, the proposed controller increases the frequency of satisfying the STL specification compared to controllers that use only the nominal dynamics model. An initial design is also proposed that extends the controller to distributed multi-agent systems, which must make individual decisions to complete a cooperative task.
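The GP-regression step can be sketched as follows: fit the residual between observed behavior and the nominal model's prediction from past runtime data, then query the posterior mean and variance that the chance-constrained MPC would consume. This is a deliberately simplified 1-D illustration with toy data and a hand-rolled RBF kernel, not the thesis's implementation.

```python
# Minimal GP-regression sketch (our simplification, not the thesis's
# code): learn an unknown 1-D residual dynamics term from data, then
# query the posterior mean/variance at a new state.
import numpy as np

def rbf(A, B, ls=1.0):
    """Squared-exponential kernel between 1-D point sets A and B."""
    d = A[:, None] - B[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

# Past runtime data: residual = true dynamics - nominal model (toy: sin).
X = np.linspace(-2, 2, 15)
y = np.sin(X)
noise = 1e-6                               # small observation-noise term

K = rbf(X, X) + noise * np.eye(len(X))
Xq = np.array([0.5])                       # query state
k = rbf(Xq, X)
alpha = np.linalg.solve(K, y)
mean = k @ alpha                           # posterior mean
var = rbf(Xq, Xq) - k @ np.linalg.solve(K, k.T)   # posterior variance
assert abs(mean[0] - np.sin(0.5)) < 0.01   # close to the true residual
```

The posterior variance is what makes the downstream STL constraints probabilistic (PrSTL): the controller can require each predicate to hold with a chosen confidence rather than deterministically.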
  • Item
    Advancing Causal Representation Learning: Enhancing Robustness and Transferability in Real-World Applications
    (University of Waterloo, 2025-02-13) Shirahmad Gale Bagi, Shayan; Crowley, Mark; Czarnecki, Krzysztof
    Conventional supervised learning methods heavily depend on statistical inference, often assuming that data is independently and identically distributed (i.i.d.). However, this assumption rarely holds in real-world scenarios, where environments or domains frequently shift, posing significant challenges to model robustness and generalization. Moreover, statistical models are typically treated as black boxes, with their learned representations remaining opaque and challenging to interpret. My research addresses these issues through a causal learning perspective, aiming to enhance the interpretability and adaptability of machine learning models in dynamic and uncertain environments. I have developed innovative methods for learning causal models that are applicable to a wide range of machine learning tasks, including transfer learning, out-of-distribution generalization, reinforcement learning, and action classification. The first method introduces a generative model tailored to learn causal variables in scenarios where the causal graph is known, such as Human Trajectory Prediction. By incorporating domain knowledge, this approach models the underlying causal mechanisms, leading to improved performance on both synthetic and real-world datasets. The results demonstrate that this generative model outperforms traditional statistical models, particularly in out-of-distribution contexts. The second method targets the more challenging scenario where the causal structure is unknown. I have explored various conditions and assumptions that facilitate the discovery of causal relationships without prior knowledge of the causal graph. This method combines advanced techniques in causal inference and machine learning to uncover the underlying causal graph and variables from observed data.
Evaluations on both real-world and synthetic datasets show that this method not only surpasses existing approaches in causal representation learning but also brings AI systems closer to practical, real-world applications by enhancing reliability and interpretability. Overall, my research contributes significant advancements to the field of causal learning, providing novel solutions that improve model interpretability and robustness. These methods lay a strong foundation for developing AI systems capable of adapting to diverse and evolving real-world conditions, thereby broadening the scope and impact of machine learning across various domains.
  • Item
    A Comprehensive Framework Incorporating Hybrid Deep Learning Model, Vi-Net, for Wildfire Spread Prediction and Optimized Safe Path Planning
    (University of Waterloo, 2025-02-11) Dhindsa, Manavjit Singh; Naik, Kshirasagar
    Forest fires are becoming more prevalent than ever, and their intensity and frequency are expected only to increase owing to climate change and environmental degradation. These fires severely threaten the economy, human lives, and infrastructure. Effective management of wildfires is therefore of utmost importance, and accurately predicting wildfire spread lies at its core. Reliable predictions of fire spread not only provide insight into at-risk regions but also help in planning several mitigation activities, including resource allocation and evacuation planning. This thesis introduces Vi-Net, an innovative hybrid deep learning model that integrates the localized precision of U-Net with the global contextual awareness of Vision Transformers (ViT) to predict next-day wildfire spread with unprecedented accuracy. This study utilizes an extensive multimodal dataset that accumulates data from different sources across the United States from 2012 to 2020, incorporating critical factors such as topographical, meteorological, anthropological (population density), and vegetation indices. These elements are vital for modeling the complex dynamics of wildfire spread. A significant challenge in this domain is class imbalance, as fire points are generally far fewer than non-fire points; in the dataset used in this study, fire regions comprise less than 5% of the total data. To address this issue, advanced loss functions, including the Focal Tversky Loss (FTL), are employed, prioritizing accurate segmentation of fire-prone regions while minimizing false negatives. FTL shifts the focus toward hard-to-predict regions and crucial boundaries, thereby enhancing the model's predictive accuracy and reliability in practical scenarios. Vi-Net addresses the complexities of modeling fire dynamics by synergizing the strengths of U-Net and ViT.
Integrating U-Net and ViT in Vi-Net allows for a comprehensive analysis that ensures high precision and recall, effectively balancing the sensitivity and specificity needed in wildfire predictions. This dual approach allows the model to process detailed local information and extensive contextual data, making it exceptionally capable of identifying and predicting fire spread across diverse landscapes. Experimental results highlight the superiority of Vi-Net over traditional models, achieving an F1 Score of 97.25% and an Intersection over Union (IoU) of 94.15% on the test dataset. These metrics highlight its capability to accurately capture localized fire patches and long-range dependencies while avoiding overprediction. These advancements validate the model's potential to offer more nuanced predictions, capturing the interplay between micro and macro-level environmental dynamics. In addition to predictive modeling, this research extends its practical applicability by integrating the predicted fire masks into an optimized A* algorithm for safe path planning. This step ensures actionable insights for emergency response teams, facilitating efficient evacuation routes and resource allocation while avoiding high-risk fire regions. Qualitative and quantitative analyses confirm the hybrid model's efficacy, with visualizations demonstrating Vi-Net’s ability to preserve spatial detail while capturing broad environmental contexts, and path planning results illustrating the model's robustness and reliability. This research not only sets a new benchmark for wildfire prediction models but also demonstrates the potential of hybrid deep learning systems in environmental science applications. By providing a robust framework for real-time wildfire management, Vi-Net could significantly influence future strategies in disaster response and resource allocation. 
Future enhancements could include integrating real-time data feeds to further improve the adaptability and predictive capabilities of the model, potentially revolutionizing wildfire management practices globally.
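The Focal Tversky Loss mentioned above has a compact standard form: the Tversky index generalizes the Dice coefficient by weighting false negatives (α) and false positives (β) separately, and raising (1 − TI) to a power γ focuses training on hard, poorly segmented regions. The sketch below uses the published FTL formula with illustrative parameter values, not the thesis's tuned settings.

```python
# Hedged sketch of the Focal Tversky Loss (FTL). alpha > beta penalizes
# false negatives more, which suits rare fire pixels; gamma < 1 focuses
# gradient on hard examples. Parameter values here are illustrative.
import numpy as np

def focal_tversky_loss(pred, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """pred, target: arrays of per-pixel fire probabilities/labels in [0,1]."""
    tp = np.sum(pred * target)
    fn = np.sum((1 - pred) * target)
    fp = np.sum(pred * (1 - target))
    ti = (tp + eps) / (tp + alpha * fn + beta * fp + eps)   # Tversky index
    return (1 - ti) ** gamma

target = np.array([0, 0, 1, 1], dtype=float)
perfect = focal_tversky_loss(np.array([0, 0, 1, 1.0]), target)
missed = focal_tversky_loss(np.array([0, 0, 0, 0.0]), target)  # fire missed
assert perfect < 1e-4 < missed
```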
  • Item
    A Graph Neural Network Based Approach for Predicting Wildfire Burned Areas
    (University of Waterloo, 2025-02-10) Das, Ursula; Naik, Kshirasagar
    Wildfires annually cause substantial economic and environmental losses and have a detrimental impact on human lives and health due to the release of harmful byproducts. Moreover, wildfire incidents have exhibited an alarming surge in both frequency and severity in recent years, due to increased urbanization near forested areas coupled with climate change, highlighting the need for advanced technologies to predict wildfire behavior in advance and mitigate its impact. In recent years, the enormous strides in machine learning research, coupled with the increased availability of wildfire data through sources such as remote sensing and the increased availability of computational resources, have fueled the rise of data-driven approaches across all stages of wildfire management. Despite the growing adoption of machine-learning-driven approaches in wildfire mitigation, the primary focus has been on analyzing historical patterns and identifying the causes of wildfires rather than predicting wildfire behavior. The prediction of wildfire behavior over time, such as the burned area, has been largely underexplored. This study aims to address this gap by advancing data-driven methods for predicting wildfire behavior during the active fire stage and aiding resource allocation efforts. This study adopts a Graph Neural Network (GNN)-based framework for predicting the burned area resulting from a wildfire ignition. While CNN-based architectures have been widely employed to model wildfire behavior, including spread prediction, as a semantic segmentation task, these architectures impose specific limitations on geospatial data due to their reliance on fixed-size inputs and local receptive fields. GNNs have shown success in capturing the long-range dependencies and irregular-sized inputs inherent in geospatial data, such as wildfires, making them a viable alternative to CNNs.
To this end, a GNN-based approach is adopted to model wildfire burned area prediction. A framework is developed to represent spatial wildfire data and its influencing factors as homogeneous graphs followed by the development of three distinct GNN models based on different message-passing mechanisms to process the graph-structured data. The results obtained through various experiments illustrate the efficacy of Graph Neural Networks in modeling wildfire behavior. In terms of Precision, most GNN models outperform the segmentation models, with the highest achieving a score of 0.4536. For AUROC, all GNN models demonstrate superior performance, reaching a maximum of 0.9377. Based on AUPRC, the Graph Convolutional Network (GCN) model surpasses all others, including segmentation models, with a top score of 0.4787. These findings underscore the potential of Graph Neural Networks (GNNs) as a powerful tool for wildfire behavior modeling and supporting resource allocation initiatives.
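As a sketch of the message-passing computation such models build on (a generic GCN layer over toy values, not the thesis's exact architecture): each layer propagates features along graph edges via a degree-normalized adjacency matrix, H' = ReLU(D^(-1/2)(A + I)D^(-1/2) H W).

```python
# Illustrative single GCN message-passing layer. Graph, features, and
# weights below are toy values, not wildfire data.
import numpy as np

def gcn_layer(A, H, W):
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt    # symmetric normalization
    return np.maximum(0, A_norm @ H @ W)        # ReLU activation

# 4 grid cells in a chain; 2 input features -> 3 hidden features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.random.default_rng(0).normal(size=(4, 2))
W = np.random.default_rng(1).normal(size=(2, 3))
out = gcn_layer(A, H, W)
assert out.shape == (4, 3) and (out >= 0).all()
```

Because the adjacency matrix can have any size, the same weights W apply to fire events covering arbitrarily many grid cells, which is exactly the flexibility over fixed-input CNNs that the abstract highlights.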
  • Item
    Data-Driven Predictive Control: Equivalence to Model Predictive Control Beyond Deterministic Linear Time-Invariant Systems
    (University of Waterloo, 2025-02-07) Li, Ruiqi; Smith, Stephen L.; Simpson-Porco, John W.
    In recent years, data-driven predictive control (DDPC) has emerged as an active research area, with well-known methods such as Data-enabled Predictive Control (DeePC) and Subspace Predictive Control (SPC) being validated through reliable experimental results. On the theoretical side, it has been established that, for deterministic linear time-invariant (LTI) systems, both DeePC and SPC can generate control actions equivalent to those obtained from Model Predictive Control (MPC). However, similar results do not yet exist for the application of DDPC beyond deterministic LTI systems. The objective of our research is therefore to generalize this theoretical equivalence between model-based and data-driven methods to more general classes of control systems. In this thesis, we present our contributions to DDPC for linear time-varying (LTV) systems and stochastic LTI systems. In our first piece of work, we developed Periodic DeePC (P-DeePC) and Periodic SPC (P-SPC) methods, which generalize DeePC and SPC from LTI systems to linear time-periodic (LTP) systems, a special case of LTV systems. Theoretically, we demonstrate that our P-DeePC and P-SPC methods produce control actions equivalent to those of MPC for deterministic LTP systems, under appropriate tuning conditions. As an intermediate step in our theoretical development, we extended certain aspects of behavioral systems theory from LTI systems to LTP/LTV systems, including extending Willems' fundamental lemma to LTP systems and defining the concepts of order and lag for LTV systems. In our second piece of work, we proposed a control framework for stochastic LTI systems, namely Stochastic Data-Driven Predictive Control (SDDPC). Our SDDPC method theoretically achieves control performance equivalent to model-based Stochastic MPC under idealized conditions of appropriate tuning and noise-free offline data.
This method, which applies to general linear stochastic state-space systems, serves as an alternative to the data-driven method previously proposed by Pan et al., which also achieves theoretical equivalence to Stochastic MPC but is limited to a narrower class of systems. Beyond the theoretical assumption of noise-free offline data, we also evaluated our SDDPC method in simulations with practical noisy offline data. The simulation results demonstrate that our SDDPC method outperforms benchmark methods, achieving lower cumulative tracking cost and a lower rate and magnitude of constraint violation.
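A key ingredient behind DeePC-style methods is Willems' fundamental lemma: a recorded input trajectory can represent all system behaviors only if it is persistently exciting, i.e., its block-Hankel matrix of suitable depth has full row rank. A minimal sketch of that check follows; the function names are illustrative and not taken from the thesis.

```python
import numpy as np

def block_hankel(w, depth):
    """Block-Hankel matrix of the given depth built from a trajectory
    w of shape (T, m): column i stacks samples w[i], ..., w[i+depth-1]."""
    T, m = w.shape
    cols = T - depth + 1
    H = np.zeros((depth * m, cols))
    for i in range(cols):
        H[:, i] = w[i:i + depth].reshape(-1)
    return H

def persistently_exciting(u, order):
    """True if the input trajectory u is persistently exciting of the
    given order, i.e., its depth-`order` Hankel matrix has full row rank."""
    H = block_hankel(u, order)
    return np.linalg.matrix_rank(H) == H.shape[0]
```

A generic (e.g., random) input of sufficient length satisfies the condition, while a constant input does not, which is why offline experiments for DDPC must be sufficiently "rich".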
  • Item
    Automated Generation, Evaluation, and Enhancement of JMH Microbenchmark Suites from Unit Tests
    (University of Waterloo, 2025-02-03) Jangali, Mostafa; Shang, Weiyi
    Ensuring the performance of software systems is a cornerstone of modern software engineering, directly influencing user satisfaction and reliability. Despite its critical role, performance testing remains resource-intensive and difficult to scale, particularly in large projects, due to the complexity of microbenchmark creation and execution. Microbenchmarking frameworks like the Java Microbenchmark Harness (JMH) offer precise performance insights but require significant expertise, limiting their adoption. This thesis addresses these challenges by introducing ju2jmh, a novel framework that automates the transformation of JUnit tests into JMH microbenchmarks, bridging the gap between functional and performance testing. The contributions of this thesis are threefold. First, ju2jmh automates the generation of high-quality JMH microbenchmarks from widely used JUnit test suites, enabling developers to adopt performance microbenchmarking with minimal manual effort. Results demonstrate that the generated microbenchmarks exhibit stability comparable to manually crafted ones and effectively detect real-world performance bugs. Second, the Performance Mutation Testing (PMT) framework is developed to systematically evaluate the robustness of microbenchmarks in detecting artificial performance bugs, achieving competitive mutation scores. Third, a clustering approach is proposed to optimize the execution of microbenchmarks by grouping functionally similar tests based on code coverage information. This strategy reduces execution time by 81.2% to 86.2% across three large-scale projects while preserving accuracy and reliability. Evaluated on three diverse open-source Java projects, the proposed solutions address stability, detection capabilities, and scalability challenges in performance testing workflows. 
The findings highlight the potential of ju2jmh and its associated methodologies to transform performance microbenchmarking practices, providing developers with practical tools to integrate reliable and efficient performance testing into modern software development pipelines. These advancements pave the way for future research into extending automated performance testing across different programming languages and development ecosystems.
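The clustering step described above, grouping functionally similar tests by their code coverage, can be pictured with a small sketch. This is an illustrative greedy grouping using Jaccard similarity over covered-line sets, not the thesis's exact algorithm.

```python
def jaccard(a, b):
    """Jaccard similarity between two coverage sets (ids of covered lines)."""
    return len(a & b) / len(a | b) if a | b else 1.0

def cluster_by_coverage(coverage, threshold=0.8):
    """Greedily group tests whose coverage overlaps the cluster
    representative's by at least `threshold`, so one benchmark run can
    stand in for each group. `coverage` maps test name -> set of line ids."""
    clusters = []
    for test, cov in coverage.items():
        for cluster in clusters:
            if jaccard(cov, coverage[cluster[0]]) >= threshold:
                cluster.append(test)
                break
        else:
            clusters.append([test])
    return clusters
```

Running only one representative per cluster is what yields the large execution-time savings, at the cost of the approximation that clustered tests exercise the same code.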
  • Item
    Software Infrastructure for Isolation and Performance Monitoring in Virtualized Systems
    (University of Waterloo, 2025-02-03) Rahman, Abdur; Pellizzoni, Rodolfo
    Modern multiprocessor System-on-Chip (SoC) architectures host a rich tapestry of heterogeneous components, enabling multiple workloads with differing requirements to run simultaneously on the same hardware platform. However, managing and isolating these concurrently running applications presents significant challenges. Traditional virtualization techniques, even with static partitioning hypervisors, can struggle to ensure robust isolation due to contention in shared system resources such as caches and memory bandwidth. To address this issue, this thesis investigates memory bandwidth contention among cores and explores isolation strategies by implementing MemGuard in the Bao Hypervisor on ARMv8-based systems. This implementation is complemented by cache coloring and DRAM bank partitioning techniques. The results, evaluated using the San Diego Vision Benchmark Suite, quantify the effectiveness of these mechanisms in reducing interference and provide insights into program behavior under varying isolation parameters. Beyond improving isolation, performance monitoring must extend beyond core-level observation to encompass system-wide interactions. To this end, this thesis develops a comprehensive software infrastructure for an Advanced Performance Monitoring Unit (APMU), designed for event-driven monitoring and dynamic runtime reconfiguration. By leveraging an LLVM-based toolchain to support custom instructions and integrating seamlessly with the hypervisor and guest OS layers, the APMU framework enables diverse applications while optimizing memory utilization and execution time. Collectively, the results and infrastructure presented in this work contribute to more predictable, secure, and efficient computing systems, advancing the state of the art in virtualization, performance isolation, and heterogeneous system analysis.
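To illustrate the cache-coloring idea used alongside MemGuard: a page's "color" is given by the physical-address bits that select cache sets above the page offset, so an allocator that hands each partition only pages of its assigned colors keeps the partitions' cache footprints disjoint. A minimal sketch, with illustrative constants (4 KiB pages, power-of-two color count):

```python
PAGE_SHIFT = 12  # 4 KiB pages; illustrative, platform-dependent

def page_color(phys_addr, num_colors):
    """Cache color of a physical page: the low set-index bits that lie
    just above the page offset. num_colors must be a power of two."""
    return (phys_addr >> PAGE_SHIFT) & (num_colors - 1)

def pages_for_color(free_pages, color, num_colors):
    """Filter a free-page list down to one color, as a colored
    allocator inside a hypervisor would when backing a guest."""
    return [p for p in free_pages if page_color(p, num_colors) == color]
```

With 4 colors, consecutive physical pages cycle through colors 0..3, so two guests restricted to disjoint color sets never map to the same cache sets.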
  • Item
    CADC++: Extending CADC with a Paired Weather Domain Adaptation Dataset for 3D Object Detection in Autonomous Driving
    (University of Waterloo, 2025-01-28) Tang, Mei Qi; Czarnecki, Krzysztof
    Lidar sensors enable precise 3D object detection for autonomous driving under clear weather but face significant challenges in snowy conditions due to signal attenuation and backscattering. While prior studies have explored the effects of snowfall on lidar returns, its impact on 3D object detection performance remains underexplored. Conducting such an evaluation objectively requires a dataset with abundant labelled data from both weather conditions, ideally captured in the same driving environment. Current driving datasets with lidar data either do not provide enough labelled data in both snowy and clear weather conditions, or rely on simulation methods to generate data for the under-represented weather domain. Simulations, however, often lack realism, introducing an additional domain shift that impedes accurate evaluation. This thesis presents our work in creating CADC++, a paired weather domain adaptation dataset that extends the existing snowy dataset, CADC, with clear weather data. Our CADC++ clear weather data were recorded on the same roads and around the same days as CADC. We pair each CADC sequence with a clear weather one as closely as possible, both spatially and temporally. Our curated CADC++ achieves object distributions similar to those of CADC, enabling minimal domain shift in environmental factors beyond the presence of snow. Additionally, we propose track-based auto-labelling methods to overcome a limited labelling budget. Our approach, evaluated on the Waymo Open Dataset, achieves balanced performance across stationary and dynamic objects and still surpasses a standard 3D object detector when using as little as 0.5% of human-annotated ground-truth labels.
  • Item
    Impact of Mechanical and Electrical Tilting for Cellular-Connected Drones and Legacy Users
    (University of Waterloo, 2025-01-23) Elleathy, Ahmad; Rosenberg, Catherine
    Drones, also known as Unmanned Aerial Vehicles (UAVs), have lately been employed for a variety of tasks in our daily lives, including surveillance, delivery, and rescue operations. High-performance, dependable two-way communication with cellular networks is necessary to expand UAV applications quickly. Integrating UAVs into current fifth-generation (5G) networks is challenging; one of these challenges arises because ground and aerial users have different channel properties. This thesis investigates how the performance of cellular-connected UAVs and legacy ground users in a cellular network can be improved by changing the antenna tilt angle or type, considering mechanical, electrical, and hybrid tilting. The study considers a single-user Multiple-Input Multiple-Output (SU-MIMO) system featuring a Uniform Linear Array (ULA) or Uniform Planar Array (UPA) antenna system with Third Generation Partnership Project (3GPP) parameters. It illustrates the impact of antenna tilting on user throughput, making it easier to integrate UAVs into 5G and future networks. These conclusions are supported by simulation results, which also show how hybrid tilting may serve as a scalable way to enhance multi-user performance in next-generation networks.
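The effect of electrical tilt can be sketched with the standard uniform-linear-array array factor, in which a progressive phase shift steers the main beam toward the tilt angle (so aerial users above the horizon and ground users below see very different gains). This is a generic textbook formula, not the thesis's 3GPP antenna model; all parameter values are illustrative.

```python
import numpy as np

def array_factor_db(n_elem, d_over_lambda, tilt_deg, theta_deg):
    """Normalized array factor (dB) of an n_elem-element uniform linear
    array electrically steered to tilt_deg, evaluated at angles theta_deg
    (all angles measured from broadside)."""
    theta = np.radians(np.atleast_1d(theta_deg).astype(float))
    tilt = np.radians(tilt_deg)
    # Per-element phase progression: spacing times the scan-angle mismatch.
    psi = 2 * np.pi * d_over_lambda * (np.sin(theta) - np.sin(tilt))
    n = np.arange(n_elem)
    af = np.abs(np.exp(1j * np.outer(psi, n)).sum(axis=1)) / n_elem
    return 20 * np.log10(np.maximum(af, 1e-12))
```

At the tilt angle the phase terms align and the gain is maximal (0 dB normalized); away from it the pattern rolls off, which is what tilting exploits to trade ground-user coverage against aerial-user coverage.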
  • Item
    The Interplay of Information Theory and Deep Learning: Frameworks to Improve Deep Learning Efficiency and Accuracy
    (University of Waterloo, 2025-01-23) Mohajer Hamidi, Shayan; Yang, En-Hui
    The intersection of information theory (IT) and machine learning (ML) represents a promising, yet relatively under-explored, frontier with significant potential for innovation. Despite the clear benefits of combining these fields, progress has been limited by two main challenges: (i) the highly specialized nature of IT and ML, which creates a barrier to cross-disciplinary expertise, and (ii) the computational complexity involved in applying information-theoretic concepts to large-scale ML problems. This dissertation seeks to overcome these challenges and explore the rich possibilities at the intersection of IT and ML. By leveraging powerful tools and concepts from IT, we aim to uncover novel insights and develop innovative ML algorithms. Given that deep neural networks (DNNs) form the backbone of modern ML models, the integration of IT principles into ML requires a focus on optimizing the training and performance of DNNs using information-theoretic frameworks. While DNNs have a broad range of applications, this thesis narrows its focus to two key areas: classification and generative DNNs. The objective is to harness IT principles to enhance the performance of these models. • Classification DNNs. For classification DNNs, this dissertation targets improvements in three critical areas: (i) Improving classification accuracy. The performance of classification DNNs is traditionally measured by classification accuracy, but we argue that conventional error metrics are insufficient for capturing a model’s true performance. By introducing the concepts of conditional mutual information (CMI) and normalized conditional mutual information (NCMI), we propose a new metric for evaluating DNNs. The CMI measures intra-class concentration, while the ratio of CMI to NCMI reflects inter-class separation. 
We then modify the standard loss function in the deep learning (DL) framework to minimize the standard cross-entropy function subject to an NCMI constraint, yielding CMI-constrained deep learning (CMIC-DL). Via extensive experimental results, we show that DNNs trained within CMIC-DL achieve higher classification accuracy than state-of-the-art models trained within the standard DL framework and with other loss functions in the literature. (ii) Enhancing distributed learning accuracy. In the context of distributed learning, particularly federated learning (FL), we tackle the challenge of class imbalance using information-theoretic concepts to improve the accuracy of the shared global model. To this end, we introduce new information-theoretic quantities into FL and propose a modified loss function based on these principles. This leads to the development of a federated learning framework, Fed-IT, which enhances the classification accuracy of models trained in distributed environments. (iii) Reducing model size and training/inference complexity. We introduce coded deep learning (CDL), a novel framework aimed at reducing the computational and storage complexity of classification DNNs. CDL achieves this by compressing model weights and activations through probabilistic quantization. Both forward and backward passes during training are performed using quantized weights and activations, significantly reducing floating-point operations and computational overhead. Furthermore, CDL imposes entropy constraints on weights and activations, ensuring compressibility at every stage of training, which also reduces communication costs in parallel computing environments. This leads to models that are more efficient in both training and inference, with lower storage and computational requirements. • Generative DNNs. For generative DNNs, this dissertation focuses on diffusion models and their application to solving inverse problems. 
Inverse problems are common in fields like medical imaging, signal processing, and physics, where the goal is to recover an underlying cause from corrupted or incomplete observations. These problems are often ill-posed, with multiple possible solutions or high sensitivity to small changes in the data. In this dissertation, we enhance the performance of diffusion models by incorporating probabilistic principles, making them more effective at capturing the posterior distribution of the underlying causes in inverse problems. This approach improves the model’s ability to accurately reconstruct signals and provides more reliable solutions in challenging inverse problem scenarios. Overall, this dissertation demonstrates the powerful synergy between IT and ML, showcasing novel methods that improve the accuracy and efficiency of both classification and generative DNNs. By addressing key challenges in training and optimization, this work lays the foundation for future research at the intersection of these two fields.
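The intra-class-concentration idea behind the CMI term can be conveyed with a simple proxy: average, over samples, the divergence between each sample's output distribution and the mean output distribution of its class. The estimator below is illustrative only and is not the dissertation's exact formulation of CMI.

```python
import numpy as np

def intra_class_concentration(probs, labels):
    """Illustrative proxy for intra-class concentration: mean KL divergence
    between each sample's softmax output and its class's centroid output.
    Smaller values mean outputs within each class are more concentrated.
    probs: (N, C) array of per-sample output distributions."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        p_c = probs[labels == c]
        centroid = p_c.mean(axis=0)
        total += (p_c * np.log(p_c / centroid)).sum()
    return total / len(labels)
```

A classifier whose outputs within each class are identical scores 0; dispersed within-class outputs raise the score, which is the quantity a CMI-style constraint would penalize during training.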
  • Item
    Linear Acceleration Perception on a Moving VR Environment
    (University of Waterloo, 2025-01-22) Zhou, Justin; Wang, David
    Understanding how people perceive linear acceleration is crucial for creating more realistic and immersive virtual environments. This thesis investigates how people perceive linear acceleration on a moving platform while in virtual reality (VR). The objective is to identify the Just Noticeable Difference (JND), which represents the smallest detectable change in stimuli that users can perceive. The study integrates a physically moving platform with a VR environment, employing a staircase method to determine upper and lower perception bounds. By focusing on human sensitivity to acceleration, the research aims to bridge the gap between physical and virtual motion experiences, a key motivator for enhancing VR realism. The results demonstrate that at low accelerations, there are distinguishable upper and lower bounds of acceleration perception. These findings, validated through statistical methods including t-tests, offer insights into how people perceive changes in acceleration. However, unexpected trends, such as increased variability at lower accelerations, suggest further investigation is needed to confirm the applicability of Weber's Law in this context. The research also highlights practical applications, such as space conservation in VR motion systems, by leveraging the lower acceleration JND to shorten track distances without compromising perceived realism. Limitations, including sample size and equipment constraints, are acknowledged, and future work is proposed to explore higher speeds, angular acceleration, and alternative experimental conditions. By advancing our understanding of linear acceleration perception, this study provides a foundation for improving VR systems used in training, entertainment, and rehabilitation, ensuring they balance realism, comfort, and practicality.
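The staircase method mentioned above can be sketched as a simple up-down rule: lower the stimulus after each detection, raise it after each miss, and estimate the threshold from the levels at which the direction reverses. This is a generic psychophysics sketch, not the thesis's exact protocol.

```python
def staircase(detects, start, step, n_reversals=6):
    """Simple 1-up/1-down adaptive staircase. `detects(level)` returns True
    if the observer notices the stimulus at that level. The threshold (JND
    estimate) is the mean stimulus level at the reversal points."""
    level, direction = start, 0
    reversals = []
    while len(reversals) < n_reversals:
        new_dir = -1 if detects(level) else +1   # detected: go down; missed: go up
        if direction and new_dir != direction:   # direction flipped: a reversal
            reversals.append(level)
        direction = new_dir
        level = max(level + new_dir * step, step)  # keep the stimulus positive
    return sum(reversals) / len(reversals)
```

With a deterministic observer the procedure oscillates around the true threshold, and the reversal average lands between the last detected and last missed levels.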
  • Item
    A Low-Cost Technique for improving Angular Scan Range of Phased Array Antennas
    (University of Waterloo, 2025-01-22) Mostafa, Mahmoud; Abdel-Wahab, Wael; Majedi, Hamed
    With the emergence of modern communication technologies, there has been an increasing demand for faster and higher-quality communication, which necessitates higher bit rates and, consequently, greater bandwidth. This shift has driven the adoption of higher operational frequencies, such as millimeter-wave bands. For instance, 5G mobile communications operate in the K band (18–27 GHz) and Ka band (27–40 GHz), while satellite communications often use the Ku band (12–18 GHz) and Ka band. However, as the operational frequency increases, path loss becomes significantly higher, requiring higher-gain antennas to compensate for this loss. A key drawback of high-gain antennas, such as parabolic reflectors, is the difficulty of steering the beam over a wide angular range. Phased array antennas are an excellent solution for transmitting or receiving in this setting, as they offer high gain with the ability to electronically steer the beam by changing the progressive phase shift between the array elements. Designing a high-performance broadband phased array antenna with a wide angular scanning range is challenging, as the antenna parameters are interrelated and require tradeoffs. For example, increasing the distance between elements reduces mutual coupling and increases the effective aperture of the array, thereby enhancing its gain. However, it also causes grating lobes to appear at lower scan angles, thereby limiting the angular scanning range. Additionally, a larger element spacing necessitates a wider electronic phase-shift range, requiring a phase shifter that is more linear with frequency, which complicates the design of the feeding network. The focus of this research is to investigate a low-cost approach to improving the angular scanning range of phased array antennas through the use of a wide-angle impedance matching (WAIM) layer, employing two techniques. 
First, a general analytical method is provided to characterize the array's scan-impedance variation in the presence of nearby reflecting surfaces, such as a ground plane or WAIM layers. Second, the generalized S-matrix (GSM) technique is used to model the array unit cell, together with transmission-line (TL) models for the WAIM layers. The WAIM layer offers a low-cost, scalable solution to increase the angular scanning range of phased array antennas without altering their lattice configuration or feeding network, making it a modular solution that is simpler than other techniques. In this thesis, both main WAIM modeling techniques are investigated and applied to different array examples (slot and dipole arrays). The GSM method is then used to design a fully dielectric WAIM layer.
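The spacing-versus-scan-range tradeoff described above follows the standard grating-lobe condition for a uniformly spaced array, d/lambda <= 1/(1 + sin(theta_max)): larger spacing raises gain but shrinks the grating-lobe-free scan range. A small sketch of that relation:

```python
import math

def max_spacing_over_lambda(scan_deg):
    """Largest element spacing (in wavelengths) that keeps grating lobes
    out of visible space when scanning up to scan_deg from broadside."""
    return 1.0 / (1.0 + math.sin(math.radians(scan_deg)))

def max_scan_deg(d_over_lambda):
    """Widest grating-lobe-free scan angle for a given element spacing."""
    s = 1.0 / d_over_lambda - 1.0
    return math.degrees(math.asin(min(max(s, 0.0), 1.0)))
```

For example, half-wavelength spacing permits scanning the full +/-90 degrees, while spacing of about 0.67 wavelengths limits grating-lobe-free scanning to roughly +/-30 degrees, which is the tension a WAIM layer does not remove but helps exploit by improving the scan-impedance match near wide angles.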
  • Item
    Implementation and Comparative Analysis of Open 5G Standalone Testbeds: A Systematic Approach
    (University of Waterloo, 2025-01-21) Amini, Maryam; Rosenberg, Catherine
    Open-source software and commercial off-the-shelf hardware are finally making their way into the 5G world, resulting in a proliferation of experimental 5G testbeds. Surprisingly, very few studies have been published on the comparative analysis of testbeds with different hardware and software elements. This dissertation is a comprehensive study of the implementation of experimental 5G testbeds and the challenges associated with them. We first introduce a precise nomenclature to characterize a 5G-standalone single-cell testbed based on its constituent elements and main configuration parameters. We then build 36 distinct such testbeds and systematically analyze and compare their performance with an emphasis on element interoperability, as well as the number and type of User Equipment (UE), to address the following questions: 1) How is the performance (in terms of bit rate and latency) impacted by different elements? 2) How does the number of UEs affect these results? 3) What is the impact of the user(s)' location(s) on the performance? 4) What is the impact of the UE type on these results? 5) How far does each testbed's coverage extend? 6) What is the impact of the available computational resources on the performance of each open-source software element? Finally, to illustrate the practical applications of such open experimental testbeds, we present a case study focused on user scheduling. Historically, most research on user scheduling has been conducted using simulations, with a strong emphasis on downlink scheduling. In contrast, our study is fully experimental and targets enhancements to the uplink scheduler within the open-source Radio Access Network platform, srsRAN. We aim to move beyond simulation-based evaluations and explore how improvements in the uplink scheduler translate to real-world performance, specifically by measuring their impact on the user experience in a live experimental testbed.
  • Item
    Learning Design Parameters to Build Application Customizable Network-on-Chips for FPGAs
    (University of Waterloo, 2025-01-20) Malik, Gurshaant Singh; Kapre, Nachiket
    We can exploit the configurability of Field Programmable Gate Arrays (FPGAs) and maximize the performance of communication-intensive FPGA applications by designing specifically customized Networks-on-Chip (NoCs) using Machine Learning (ML). As transistor density growth stalls, NoCs play an increasingly critical role in the deployment of FPGA applications for modern-day use cases. Unlike Application-Specific Integrated Circuits (ASICs), FPGA configurability allows the design of application-aware NoCs that can outperform statically configured NoCs in terms of both performance and efficiency. The conventional NoC design process is typically centered around universally sound, one-size-fits-all design decisions and does not take the underlying application into account. In contrast, we present application-aware designs that learn their NoC parameters by casting the NoC design space as a function of application performance using ML algorithms. The complex and non-obvious relationships between the large search space of NoC parameters and the performance of the underlying FPGA application necessitate a more efficient approach than manual hand-tuning or brute force. Modern ML algorithms have demonstrated a remarkable ability to generalize to complex representations of the world by extracting high-order, non-linear features from complex inputs. In this thesis, we identify 1) NoC topology, 2) flow control, and 3) regulation rate as the key NoC design variables with the strongest influence on application performance, and we leverage two primary ML methodologies: 1) stochastic gradient-free evolutionary learning and 2) gradient-based supervised learning. First, we present NoC designs based on the Butterfly Fat Tree (BFT) topology and lightweight flow control. These BFT-based NoCs can customize their bisection bandwidth to match the application being routed while providing features such as in-order delivery and bounded packet delivery times. 
We present the design of routers with 1) latency-insensitive interfaces, coupled with 2) a deterministic routing policy, and 3) round-robin scheduling at NoC ports. We evaluate our NoC designs under various conditions, showing that they deliver up to 3x lower latency and 6x higher throughput. We also learn the routing policy on a per-switch basis in an application-aware manner using Maximum Likelihood Estimation, decreasing latencies by a further ~1.1--1.7x over the static policy. Second, we overcome the pessimism in the routing analysis of timing-predictable NoCs through the use of "hybrid" application-customized NoCs. HopliteBuf NoCs leverage stall-free FIFOs as a means of flow control under token-bucket regulation; in the worst case, the static analysis can deliver very large FIFO size and latency bounds. Alternatively, HopliteBP uses lightweight backpressure as flow control under similar injection regulation, but it suffers from severely pessimistic static analysis due to the propagation of backpressure to other switches. We show that a hybrid FPGA NoC that seamlessly composes both design styles on a per-switch basis delivers the best of both worlds. We learn the switch configuration, specifically for the application being routed, through a novel evolutionary algorithm based on Maximum Likelihood Estimation (MLE). We demonstrate ~1--6.8x lower routing latencies and ~2--3x improvements in feasibility, while consuming only ~1--1.5x more FPGA resources. Third, we further improve the routability of a workload on the hybrid Buf-BP Hoplite NoC by learning to tune regulation rates for each traffic trace. We model the regulation space as a multivariate Gaussian distribution and capture critical dependencies between its parameters using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). 
We also propose nested learning, which learns switch configurations and regulation rates in tandem, further lowering cost-constrained latency by ~1.5x and accelerating rates by ~3.1x. Finally, we propose a Graph Neural Network (GNN) based framework to accurately predict NoC performance in sub-second latencies for a variety of FPGA NoC designs and applications. Application-aware NoC design can involve thousands of incremental updates to the NoC design space, with each step requiring performance evaluation of the NoC configuration using slow and expensive conventional tooling. This presents a bottleneck to the adoption of application-aware FPGA NoC design. Instead of spending up to tens of wall-clock minutes simulating the NoC design at each step, we present a GNN-based framework that encodes any FPGA NoC and any FPGA application as graphs. We create a dataset of over 1.5 million samples to train GNNs to predict NoC routing latencies. GNNs accelerate benchmarking run-times by up to ~148x (~506x on GPU) while preserving accuracies as high as 97.2%. Through this work, we observe that application-aware NoCs designed using ML algorithms such as MLE and CMA-ES can decrease routing latency by ~2.5--10.2x, increase workload feasibility by ~2--3x, and increase injection rates by up to ~3.1x. By leveraging GNNs trained using supervised learning, we can accelerate the design time of such NoCs by up to ~4.3x.
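The evolutionary tuning of regulation rates can be conveyed with a much-simplified Gaussian evolution strategy. This is a stand-in for CMA-ES (no covariance adaptation) driven by a toy cost function; the names and parameter values are illustrative, not the thesis's implementation.

```python
import random

def evolve_rates(cost, dim, iters=200, pop=16, sigma=0.1, seed=0):
    """Minimal elitist Gaussian evolution strategy: sample candidate
    regulation-rate vectors (clamped to (0, 1]) around the current best,
    keep any improvement, and shrink the search radius over time."""
    rng = random.Random(seed)
    best = [0.5] * dim                 # start with mid-range injection rates
    best_cost = cost(best)
    for _ in range(iters):
        for _ in range(pop):
            cand = [min(max(b + rng.gauss(0.0, sigma), 0.01), 1.0)
                    for b in best]
            c = cost(cand)
            if c < best_cost:
                best, best_cost = cand, c
        sigma *= 0.97                  # anneal the mutation step size
    return best, best_cost
```

In the thesis's setting the cost would come from the NoC's static routing analysis (latency bounds, feasibility); here any black-box cost over rate vectors works, which is exactly what makes gradient-free strategies a fit for this design space.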