Systems Design Engineering

This is the collection for the University of Waterloo's Department of Systems Design Engineering.

Research outputs are organized by type (e.g., Master Thesis, Article, Conference Paper).

Waterloo faculty, students, and staff can contact us or visit the UWSpace guide to learn more about depositing their research.

Recent Submissions

Now showing 1 - 20 of 736
  • Item
    Planning Renewable Electricity Using Life-Cycle Analysis
    (University of Waterloo, 2024-07-16) Ali, Mir Sadek
    It has been predicted that by the mid-21st century worldwide energy demand will grow to two to three times its current level. Expanding the global electric power generation capacity will be problematic using the three predominant methods, namely, nuclear fission, fossil fuels and hydropower. There are few suitable sites left for new large-scale hydropower dams. Both fossil fuels and nuclear fission have widespread environmental consequences, and the fuel supply for these two technologies is a non-renewable resource. Renewable energy system (RES) technologies have been proposed as the means of expanding energy markets in a sustainable manner. A formative step in deploying RES will be the design of a standardized methodology for determining policy and planning decisions to initiate market and government support for these nascent technologies. This thesis outlines the design of a RES planning model based on the life-cycle analysis (LCA) methodology. The proposed model will integrate a climatologically-based renewable energy optimization and simulation (REOS) model into the LCA. Goal-attainment algorithms will be used to find feasible installed capacities for power generation which will meet a prescribed load demand and simultaneously attempt to meet desired policy targets. The policy targets here will be the per-kilowatt hour price of power, life-cycle air-borne CO2 emissions, and the land requirements of the system. An analysis of the performance of RES technologies in two Canadian cities that already have mature electricity utilities is done to demonstrate the methodology.
  • Item
    Toward Automated Detection of Landfast Ice Polynyas in C-Band Synthetic Aperture Radar Imagery with Convolutional Neural Networks
    (University of Waterloo, 2024-07-12) Brubacher, Neil
    Landfast ice polynyas - areas of open water surrounded by ice - are important features in many Northern coastal communities, and their automated detection from spaceborne synthetic aperture radar (SAR) imagery is positioned to support on-ice travel safety under changing Arctic sea ice and climate conditions. The characteristically small spatial scales and sparse distribution of landfast ice polynyas present key challenges to their detection, and limit the suitability of established methods developed for SAR-based sea ice and open water classification at broader spatial scales. This thesis explores the development of deep learning-based object detection networks for landfast ice polynya detection in dual-polarized C-band SAR imagery, having three main contributions. The first is a characterization of landfast ice polynya signatures and separability in SAR imagery based on datasets of polynyas mapped over several seasons near the communities of Sanikiluaq, NU, and Nain, NL. Results from this analysis highlight the challenging and variable nature of polynya signatures in dual-polarized backscatter intensity, motivating the use of convolutional neural networks (CNNs) to capture relevant textural, geometric and contextual polynya features. The second contribution is the development and evaluation of CNN-based object detection networks for polynya detection, drawing on advancements in the natural-scene small object detection field to address the challenging size and sparsity characteristics of polynyas. A simplified detection network architecture optimized for polynya detection in terms of feature representation capacity, feature map resolution, and training loss balancing is found to reliably detect polynyas with sufficient size and local contrast, and demonstrates good generalization to regions not seen in training. 
The third contribution is an assessment of detection model generalizability between imagery produced by Sentinel-1 (S1) and Radarsat Constellation Mission (RCM) SAR sensors, illustrating the ability for models trained only on S1 imagery to effectively extract and classify polynya features in RCM despite differences in resolution and noise characteristics. Across regions and sensors, missed polynyas are found to have smaller sizes and weaker signatures than detected polynyas, while false predictions are often caused by boundary areas between smooth and rough landfast ice. These represent fundamental limits to polynya / landfast ice separability in the medium-resolution, dual-polarized C-band SAR imagery used in this thesis, motivating future research into multi-temporal, multi-frequency, and/or higher-resolution SAR imagery for polynya detection. Ongoing and future progress in the development of robust landfast ice hazard detection systems is positioned to support community sea ice safety and monitoring.
  • Item
    Dynamic Alert Design Based on Driver’s Cognitive State for Take-over Request in Automated Vehicles
    (University of Waterloo, 2024-07-03) Umpaipant, Wachirawit
    This thesis investigates the effectiveness of dynamic alert systems tailored to drivers' cognitive states in automated driving environments, focusing on enhancing takeover readiness during critical transitions. Utilizing a large-scale immersive driving simulation, the study evaluated drivers' response times and physiological measures when reacting to various alert intensities and the presence of a secondary typing task. The experiment revealed that dynamic alerts significantly improved response times and takeover performance, especially in high-distraction scenarios. Drivers responded more effectively when alerts were adjusted to their cognitive load, with strong alerts resulting in the fastest reaction times under distracted conditions. On average, dynamic alerts reduced response times by approximately 1.75 seconds compared to static alerts. Additionally, higher lateral accelerations were observed under strong alerts, indicating more decisive maneuvering. Self-rated attention-capturing scores were notably higher with dynamic alerts, particularly under strong alert conditions and in the presence of secondary tasks. The ANOVA results showed significant improvements in attention capturing and overall alert effectiveness when dynamic alerts were employed, demonstrating the robust design’s ability to capture attention and enhance driver responsiveness. The study confirmed that adaptive alert designs, which adjust based on the driver's cognitive state, can markedly enhance overall driving experience and safety. Participants reported higher levels of confidence with dynamic alerts, especially in scenarios involving secondary tasks. Despite the strong alerts, annoyance levels remained low, indicating that dynamic alerts are effective without causing undue stress. 
These results underscore the potential of using adaptive systems to improve safety and efficiency in automated driving, advocating for a more nuanced approach to system alerts that considers the variable cognitive states of drivers. Future research should validate these findings with on-road studies, explore a broader range of alert modalities, and refine physiological monitoring techniques to further enhance adaptive alert systems.
  • Item
    Practical Application of Machine Learning to Water Pipe Failure Prediction
    (University of Waterloo, 2024-06-24) Laven, Kevin
    As water networks age, many utilities are faced with rising water main break rates and insufficient replacement funds. Machine learning is a promising tool to support efficient water pipe replacement decisions. This thesis explores the practical application of machine learning for water pipe failure prediction using a dataset of over 10 million pipe-year records from four countries. Analysis of predictive factors shows that length, age, diameter, material, and failure history are each significant. Two novel relationships with break rate are observed: with respect to diameter, an inverse linear relationship, and with respect to age, a peak at around 40 years followed by a decline lasting several decades. A method is presented for predicting both probability of failure and the expected number of failures in a given pipe and time period. By inferring units, encoding categorical features, and normalizing for different utility practices, it is proposed that a single model can generalize across utilities, geographies, and time periods without any utility-specific data cleansing. The model is trained and tested on a leave-one-utility-out basis, with training data from time periods strictly prior to test data. The resulting Area Under the Curve for the Receiver Operating Characteristic of over 0.85 and Cumulative Lift at 10% of over 5.0 demonstrate the practical applicability of the model, matching the performance of models trained and tested on each utility’s own data. Within this model, a method of cross-encoding categorical features with numerical features is introduced to enable integration of data sets from diverse contributors. The applicability of these performance metrics and model outputs to common utility water main replacement decision making processes is also shown.
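The two headline metrics in this abstract can be illustrated with a minimal sketch. It assumes one score and one binary failure label per pipe, and it assumes lift is computed over pipe counts; the thesis may define lift differently (for example, weighted by pipe length). The function names are illustrative and do not come from the thesis.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one failed and one non-failed pipe")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cumulative_lift(scores, labels, fraction=0.10):
    """Share of failures captured in the top-scored fraction, divided by that fraction.

    Assumption: the fraction is taken over the number of pipes, not pipe length.
    """
    total_failures = sum(labels)
    if total_failures == 0:
        raise ValueError("no failures in the evaluation set")
    k = max(1, int(round(fraction * len(scores))))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    captured = sum(y for _, y in ranked[:k])
    return (captured / total_failures) / (k / len(scores))


# Toy data (made up): ten pipes, two of which failed.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
toy_lift = cumulative_lift(toy_scores, toy_labels, fraction=0.10)
```

Under this definition, a lift of 5.0 at 10% means the top-ranked tenth of pipes contains five times as many failures as a random tenth would.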
  • Item
    Modal Interaction in Electrostatic MEMS Mirrors
    (University of Waterloo, 2024-05-31) Rahmanian, Sasan
    The aim of this work is to introduce nonlinear modal interactions as a novel actuation mechanism for electrostatic MEMS-based scanning micromirrors. Modal interactions refer to the engagement of two or more modes of vibration in a system, creating a bridge to channel vibration energy from a directly excited mode to one or more of the coupled modes. In chapter two, this report carries out a comprehensive literature review of the different types of mode coupling in nonlinear resonators. First, internal resonance in general nonlinear oscillators is addressed. Second, we limit our focus to mode coupling in electrostatic MEMS. As an initial test-bed, we examine in chapter three the modulation equations governing a system of two nonlinearly coupled 1-DOF oscillators involved in a 2:1 parametric modal interaction. Simulations show that as the excitation frequency varies in the vicinity of the directly excited higher-frequency oscillator, the amplitude of its motion saturates. Meanwhile, the lower-frequency oscillator undergoes large motions under the influence of a parametric ‘energy pump’. The fourth chapter reports on nonlinear modal interaction in a MEMS made of an electrostatically actuated curved beam. We characterize the first few in-plane and out-of-plane bending modes of the beam. Thermal noise excitation is utilized to extract the out-of-plane natural frequencies, whereas the in-plane natural frequencies are captured using pulse excitation. Then, the frequency response of the MEMS is examined in the neighborhood of the first symmetric and second symmetric in-plane modes. Characterization results disclose a 2:1 ratio between the second symmetric and the first anti-symmetric in-plane modes. We show that this anti-symmetric mode can be effectively excited via the energy channel between it and the second symmetric mode when the latter is driven directly by external electrostatic forcing.
In the fifth chapter, we establish bending-torsional equations of motion for a symmetric electrostatic MEMS actuator that can capture the 2:1 modal interaction between its in-plane bending and out-of-plane rotational motions. Our approach demonstrates that incorporating the linear slopes into the cross-sectional shear strains gives rise to quadratic couplings between the bending and torsional motions whose existence depends on non-vanishing first moments of area of the microbeam's cross-section. To account for imperfections in microdevice fabrication, we assume a minuscule offset between the centroids of the as-fabricated and as-designed cross-sections of the microbeams. An energy approach is used to derive the equations of motion (EoM). The static response of the MEMS actuator and its tuned eigenmodes are examined in this chapter. Chapter six reports the frequency- and voltage-displacement behaviors of the mirror, addressing the 2:1 and 3:1 flexural-torsional internal resonances experimentally and numerically. The numerical simulation results indicate that the in-plane motion, which is the directly excited mode, saturates upon the initiation of a 2:1 energy pathway between the bending and torsional motions. Through suitable tuning of the AC frequency, the amplitude of the in-plane motion is minimized, while the amplitude of the torsional motion, an indirectly excited mode, is maximized. The numerical simulation results demonstrate that the actuator's torsional motion, when subjected to a 1:2:1 electro-flexural-torsional modal interaction, is triggered by applying a maximum voltage of 10 V, resulting in a rotational angle of about 15 degrees. Further, prolific frequency combs are generated as a result of secondary Hopf bifurcations along the large-amplitude response branches, capturing quasi-periodicity in the MEMS dynamics.
The experimental results demonstrate that the mirror's dynamics exhibit a 3:1 flexural-torsional modal interaction that provides an efficient out-of-plane rotation drive through in-plane excitation. The present study is a platform for the implementation of a novel actuation mechanism of MEMS scanning micromirrors using parametric modal interaction. Concluding remarks and proposed future work are presented in chapter seven.
  • Item
    Implementing Fairness in Real-World Healthcare Machine Learning through Datasheet for Database
    (University of Waterloo, 2024-05-28) Murugan, Anand
    Healthcare Machine Learning (HML) models are revolutionizing the healthcare industry, promising improved patient outcomes and enhanced public health. However, it is essential to ensure fairness, i.e., models delivering equitable performance to all individuals, irrespective of their inherent or acquired characteristics. This requires a thorough examination of the data used and the specific applications of these models. This study conducted a six-year systematic survey of models trained on the Medical Information Mart for Intensive Care (MIMIC) clinical research database (CRD) – one of the most popular and widely used HML databases – to explore the link between data and fairness in HML. The results were striking: for the popular MIMIC IV – ICU mortality task, a naive baseline outperformed the state-of-the-art (SOTA) model in prediction performance while also demonstrating greater fairness across subgroups (though still somewhat unfair). These findings demonstrate the urgent need to integrate fairness into healthcare machine learning models and a greater need to include practitioners in HML modeling. To achieve this, we propose a data-centric approach to fairness through our ‘Datasheet for MIMIC IV v2.0 CRD’, modeled after the recent works recommending datasheets for datasets. Given that MIMIC is large and complex, this datasheet will assist practitioners in identifying data anomalies and task-specific feature-target relationships during modeling, thereby fostering the development of equitable HML models.
  • Item
    Deep Graph Neural Networks for Spatiotemporal Forecasting of Sub-Seasonal Sea Ice: A Case Study in Hudson Bay
    (University of Waterloo, 2024-05-27) Gousseau, Zacharie
    This thesis introduces GraphSIFNet, a novel graph-based deep learning framework for spatiotemporal sea ice forecasting. GraphSIFNet employs a Graph Long-Short Term Memory (GCLSTM) module within a sequence-to-sequence architecture to predict daily sea ice concentration (SIC) and sea ice presence (SIP) in Hudson Bay over a 90-day time horizon. The use of graph networks allows the domain to be discretized into arbitrarily specified meshes. This study demonstrates the model's ability to forecast over an irregular mesh with higher spatial resolution near shorelines, and lower resolution otherwise. Utilizing atmospheric data from ERA5 and oceanographic data from GLORYS12, the model is trained to model complex spatial relationships pertinent to sea ice dynamics. Results demonstrate the model's superior skill over a linear combination of persistence and climatology as a statistical baseline. The model showed skill particularly in short- to medium-term (up to 35 days) SIC forecasts, with a noted reduction in root mean squared error by up to 10% over the statistical baseline during the break-up season, and up to 5% in the freeze-up season. Long-term (up to 90 days) SIP forecasts also showed significant improvements over the baseline, with increases in accuracy of around 10% even at a lead time of 90 days. Variable importance analysis via feature ablation was conducted which highlighted current sea ice concentration and thickness as critical predictors. Thickness was shown to be important at longer lead times during the melting season suggesting its importance as an indicator of ice longevity, while concentration was shown to be more critical at shorter lead times which suggests it may act as an indicator of immediate ice integrity. The thesis lays the groundwork for future exploration into dynamic mesh-based forecasting, the use of more complex graph structures, and mesh-based forecasting of climate phenomena beyond sea ice.
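As a rough illustration of the statistical baseline and the error comparison described in this abstract, the sketch below blends a persistence forecast with a climatology forecast and reports the relative reduction in root mean squared error. The blending weight, the function names and the toy values are assumptions for illustration; the abstract does not say how the thesis fits the combination.

```python
import math


def blended_baseline(persistence, climatology, weight):
    """Linear combination of persistence and climatology forecasts.

    `weight` is a hypothetical blending coefficient in [0, 1]; in practice it
    would presumably depend on lead time and be fitted to data.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [weight * p + (1.0 - weight) * c
            for p, c in zip(persistence, climatology)]


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def relative_rmse_reduction(model_rmse, baseline_rmse):
    """Fractional RMSE improvement of a model over the baseline."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# Toy sea ice concentrations (made up) at two mesh nodes.
persistence_sic = [1.0, 0.5]   # last observed concentration carried forward
climatology_sic = [0.0, 0.5]   # long-term mean for the target date
observed_sic = [1.0, 0.5]
baseline_sic = blended_baseline(persistence_sic, climatology_sic, weight=0.5)
baseline_error = rmse(baseline_sic, observed_sic)
```

A reduction of 0.10 from `relative_rmse_reduction` corresponds to the "up to 10%" figure quoted for the break-up season.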
  • Item
    Comparing 2-level and 3-level graded collision warning systems under distracted driving conditions
    (University of Waterloo, 2024-05-16) Shariatmadari, Khatereh
    This study delves into a comprehensive exploration of driver performance by comparing the effects of a 3-level graded collision warning system with those of a 2-level graded system. Employing a within-between-subject design, the experiment seeks to unravel the impact of graded warning levels (2-stage and 3-stage) on driving performance in both normal and critical driving conditions. Forty participants were recruited to undergo precise testing within a controlled driving simulator environment. The experimental setup involves dividing participants into two groups, each exposed to distinct collision warning paradigms. The first group experiences a two-level graded warning system, while the second group encounters a three-level graded warning system, structured based on Time to Collision (TTC) metrics. Each participant drove eight scenarios, including four normal and four critical scenarios. This strategic design allows for a comprehensive evaluation of the influence of warning system intricacies on various facets of driving behavior. The study encompasses an array of dependent variables, including eye-tracking data, wristband-derived physiological metrics, driver response times, and the incidence of collisions. This multifaceted approach ensures a holistic understanding of the drivers’ reactions under different collision warning paradigms. Results indicated that the 3-level graded system significantly reduced response times and collision frequencies compared to the 2-level system across both normal and critical driving conditions. Additionally, the 3-level system demonstrated better mitigation of driver distraction. While driving conditions did not significantly affect eye-tracking data, the warning level had a significant impact, with the 3-level system showing superior results. However, neither warning level nor driving condition significantly affected physiological data, including Electrodermal Activity (EDA), Heart Rate (HR) and Heart Rate Variability (HRV). 
Subjective evaluations highlighted the impact of collision warnings on driver performance, particularly in high-speed scenarios. Moreover, auditory warning modalities were preferred by a majority of participants. These findings provide valuable insights for the development of advanced collision warning systems, emphasizing the importance of multi-level warnings and preferred warning modalities in enhancing driver safety and reducing collision risks in diverse driving environments.
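A graded warning scheme keyed to Time to Collision (TTC) can be sketched as follows. The threshold values are invented for illustration and are not the ones used in the study; the function names are likewise hypothetical.

```python
import math


def time_to_collision(gap_m, closing_speed_mps):
    """TTC in seconds: gap divided by closing speed; infinite if not closing."""
    if closing_speed_mps <= 0.0:
        return math.inf
    return gap_m / closing_speed_mps


def warning_level(ttc_s, thresholds_s):
    """Return 0 (no warning) up to len(thresholds_s) (most urgent).

    Each threshold that the TTC has dropped to or below raises the level by one.
    """
    return sum(1 for threshold in thresholds_s if ttc_s <= threshold)


# Hypothetical thresholds in seconds, chosen only to show the structure.
TWO_LEVEL_THRESHOLDS = (4.0, 2.0)
THREE_LEVEL_THRESHOLDS = (6.0, 4.0, 2.0)

ttc = time_to_collision(gap_m=30.0, closing_speed_mps=10.0)   # 3.0 s
two_level = warning_level(ttc, TWO_LEVEL_THRESHOLDS)
three_level = warning_level(ttc, THREE_LEVEL_THRESHOLDS)
```

With these made-up thresholds, the three-level scheme begins warning earlier (at 6 s rather than 4 s), which is the kind of difference the experiment compares.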
  • Item
    Applications of Strongly Coupled Electrostatic NEMS
    (University of Waterloo, 2024-04-30) Mouharrar, Hamza
    This work explores potential applications of electrostatic nanoelectromechanical systems (NEMS) in inertial sensing and Frequency Comb (FC) generation. NEMS inertial sensors exhibit exceptional sensitivity with low power consumption, making them ideal for portable gas sensors. We equip a novel ZnO NEMS with Metal-Organic Frameworks (MOFs) to ensure selectivity to volatile organic compounds (VOCs), resulting in a sensor with sensitivity ranging from 0.33 to 0.71 Hz/ppm and limits of detection from 4 to 9 ppb. This high sensitivity is attributed to the high porosity and large surface area of MAF-6. These findings pave the way for the development of MOF-coated NEMS sensors, promising advances in the field of gas sensing. We also present a novel low-power technique for generating FCs using modal interactions in electrostatic NEMS. Experimental results show a broadband FC spectrum with a coherent phase. The proposed technique is flexible, enabling the generation of multiple frequency combs and fine-tuning of their Free Spectral Range (FSR). Additionally, we show an innovative approach that leverages internal resonances within a NEMS-phononic cavity to generate soliton frequency combs with over 3000 spectral lines, offering a breakthrough for quantum computing and metrology. The soliton generator can seamlessly be integrated into portable devices, aligning with contemporary miniaturized technology.
  • Item
    On Landmarks for Introducing 3D SLAM Structure to VPR
    (University of Waterloo, 2024-04-29) Bradley, Matthew
    Simultaneous Localization and Mapping (SLAM) is a critical foundation to a wide variety of robotic applications. Visual SLAM systems rely on Visual Place Recognition (VPR) for map maintenance and loop-closing so their quality suffers when VPR performance is impacted. In most VPR systems images are described compactly and stored for later comparison, with matches indicating that a scene has been seen before and has been revisited. Changes in illumination are a common difficulty for VPR image descriptors based on vocabularies of local features. Global descriptors which incorporate high-level structure are more robust to illumination, but are often sensitive to changes in viewpoint. There is an overall focus in VPR on describing single images despite the fact that SLAM systems recover 3D structure from the environment, and that this structure is both illumination invariant and remains the same regardless of vantage point. Work leveraging SLAM-recovered structure in the form of 3D points, in conjunction with LiDAR scan descriptors, has demonstrated superior VPR performance under harsh illumination compared with SoTA visual vocabulary descriptors. However, performance in general is not as high. A significant observed limitation was difficulty matching pseudo-LiDAR scans with significantly differing sub-regions. This is due to an assumption by the LiDAR descriptors used, that the entire volume of two corresponding scans should match. This does not fit well with the inherent sparsity of accumulated pointclouds from traversal by visual SLAM, due to differences in route, incomplete coverage, and the inherent sparsity of SLAM feature tracking in general. What is needed is an approach based on matching sub-regions which are common between pseudo-scans, in other words an approach performing place recognition based on landmarks. 
Here we explore generation of landmarks from accumulated SLAM structure through various clustering-based techniques, as well as the application of SoTA Grassmannian Graph-based association to match them. We present the challenges and successes of this approach to introducing 3D structure into VPR and propose various avenues of exploration to address the challenges faced. One of the foremost challenges is that pointclouds derived from SLAM are very sparse and uneven, making reliable and repeatable clustering difficult to achieve. We make significant improvement in landmark quality by using semantic labeling to provide better separation before clustering. While this has a noticeable impact on the number of outlier landmarks, we also find that there is an extreme sensitivity to outliers in the association method used. This sensitivity persists across data sets and seems inherent to this method of association. This precludes effective place recognition at this time, however in future work we expect this will be alleviated through the use of landmark descriptors for more effective outlier rejection. Descriptors can also provide putative associations which can be beneficial to landmark matching. We also propose various other enhancements to help improve landmark generation and association of landmarks for place recognition. It is our firm expectation that incorporation of 3D structure from SLAM systems into underlying VPR will be mutually beneficial, with VPR systems gaining additional descriptive capability which is fully invariant to illumination but more stable than viewpoint-sensitive 2D image structure.
  • Item
    Zero-Shot Monocular Motion Segmentation: A Fusion of Deep Learning and Geometric Approaches
    (University of Waterloo, 2024-04-29) Huang, Yuxiang
    Identifying and segmenting moving objects from a moving monocular camera is difficult when there is unknown camera motion, different types of object motions and complex scene structures. Deep learning methods achieve impressive results for generic motion segmentation, but they require massive training data and do not generalize well to novel scenes and objects. Conversely, recent geometric methods show promising results by fusing different geometric models together, but they require manually corrected point trajectories and cannot generate a coherent segmentation mask. This work proposes an innovative zero-shot motion segmentation approach that seamlessly combines the strengths of deep learning and geometric methods. The proposed method first generates object proposals for every video frame by using state-of-the-art foundation models, and then extracts different object-specific motion cues. Finally, the method uses multi-view spectral clustering to synergistically fuse different motion cues together to cluster objects into distinct motion groups, resulting in a coherent segmentation. The key contributions of this work are as follows: 1) Proposing the first zero-shot motion segmentation pipeline that performs dense motion segmentation on different scenes and object classes without any training. 2) This work is the first to combine epipolar geometry and optical flow-based motion models for motion segmentation. Multi-view spectral clustering is used to effectively combine different motion models to achieve good motion segmentation results in complex scenes. Through extensive experimentation and comparative analysis, we validate the efficacy of the proposed method. Despite not being trained on any data, the method is able to achieve competitive results on real-world datasets, some of which are even better than those of the state-of-the-art motion segmentation methods trained in a supervised manner.
This work not only contributes to the advancement of monocular motion segmentation, but also shows that combining different geometric motion models and motion cues is very important in analyzing the motions of objects.
  • Item
    Navigating Unsignalized Intersections: Deep RL-Based Decision-Making and Control Framework for Autonomous Vehicles with Pedestrian Integration
    (University of Waterloo, 2024-04-25) Sana, Faizan
    Unprotected left turns at unsignalized intersections, alongside pedestrians and adversarial vehicles, pose significant challenges for Autonomous Vehicles (AVs). These challenges stem from the absence of traffic signals or signs, the dynamic nature of the environment shaped by human interactions at crosswalks, and the variability in intersection layouts. This thesis delves into addressing these challenges through the application of a hierarchical Deep Reinforcement Learning (DRL) approach, where the DRL policy governs high-level decision-making (or planning), and low-level Proportional-Integral-Derivative (PID) controllers handle actuation. To evaluate and train DRL policies, it was necessary to create a simulation environment within a high-fidelity environment with realistic behaviors and dynamic vehicle models. To the best of our knowledge, this research marks a pioneering effort in simulating pedestrian interactions within a high-fidelity environment, coexisting alongside adversarial vehicles within the CARLA simulation platform. We have dedicated extensive efforts to the development of this simulation, enabling straightforward customization of parameters such as the number of pedestrians, adversarial vehicles, and reward functions amongst others. This is made available open-source at https://github.com/faizansana/intersection-carla-gym. The study evaluates five distinct model-free DRL algorithms, namely Deep Q-Learning (DQN), Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), Recurrent PPO, and Soft Actor Critic (SAC). The primary focus of this work is to conduct a comprehensive comparative analysis of these DRL algorithms within a hierarchical framework to enhance AV decision-making in complex and uncontrolled intersection scenarios. The training code, with its versatile software architecture, is made available at https://github.com/faizansana/intersection-driving.
Our findings reveal that Recurrent PPO, coupled with a discretized action space, outperforms the other algorithms, displaying the highest success rate and the lowest accident rate for executing unprotected left turns in chaotic intersections. This outcome underscores the potential of Recurrent PPO to navigate such complex traffic scenarios effectively. The thesis concludes by discussing potential extensions of the proposed hierarchical DRL system and outlining promising avenues for future research in the field of autonomous vehicle navigation at challenging and dynamic intersections.
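The hierarchical split described in this abstract, a high-level policy choosing a discrete action and a low-level PID controller handling actuation, can be sketched as below. This is not the thesis code: the action-to-speed table, the gains and the function names are hypothetical, and the DRL policy is represented only by the integer action it would output.

```python
class PID:
    """Textbook discrete-time PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        """Return the control output for the current error and time step."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0   # no derivative term on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


# Hypothetical discretized action space: action index -> target speed (m/s).
TARGET_SPEEDS = {0: 0.0, 1: 3.0, 2: 6.0}


def low_level_command(action, current_speed, pid, dt=0.05):
    """Map a high-level discrete action to a throttle/brake command in [-1, 1]."""
    error = TARGET_SPEEDS[action] - current_speed
    command = pid.step(error, dt)
    return max(-1.0, min(1.0, command))


# The policy (not shown) picks action 2 while the vehicle travels at 5 m/s.
throttle = low_level_command(2, 5.0, PID(kp=0.5, ki=0.0, kd=0.0))
# The policy picks action 0 (stop) at the same speed; the output saturates.
brake = low_level_command(0, 5.0, PID(kp=0.5, ki=0.0, kd=0.0))
```

Keeping the action space discrete, as in this sketch, matches the configuration the abstract reports as performing best with Recurrent PPO.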
  • Item
    A Representational Response Analysis Framework For Convolutional Neural Networks
    (University of Waterloo, 2024-04-25) Hryniowski, Andrew
    Over the past decade, convolutional neural networks (CNNs) have become the de facto machine learning model for image processing due to their inherent ability to capitalize on modern data availability and computational resources. Much of a CNN's capability comes from its modularity and flexibility in model design. As such, practitioners have been able to successfully tackle applications not previously possible with other contemporary methods. The downside to this flexibility is that it makes designing and improving upon a CNN's performance an arduous task. Designing a CNN is not a straightforward process. Model architecture design, learning strategies, and data selection and processing must all be precisely tuned for a researcher to produce even a non-random performing model. Finding the correct balance to achieve state-of-the-art performance can be its own challenge requiring months or years of effort. When building a new model, researchers will rely on quantitative metrics to guide the development process. Typically, these metrics revolve around model performance characteristics (e.g., accuracy, recall, precision, robustness) and computational constraints (e.g., number of parameters, number of FLOPS), while the learned internal data processing behaviour of a CNN is ignored. Some research investigating the internal behaviour of CNNs has been proposed and adopted by a niche group within the broader deep learning community. Because these methods operate on extremely high dimensional latent embeddings (between one and three orders of magnitude larger than the input data) they are computationally expensive to compute. In addition, many of the most common methods do not share a common root from which downstream metrics can be computed, thus making the use of multiple metrics prohibitive.
In this work we propose a novel analytic framework that offers a broad range of complementary metrics that can be used by a researcher to study the internal behaviour of a CNN, and whose findings can be used to guide model performance improvements. We call the proposed framework Representational Response Analysis (RRA). The RRA framework is built around a common computational kNN-based model of the latent embeddings of a dataset at each layer in a CNN. Using the information contained within these kNNs, we propose three complementary metrics that extract targeted information and provide a researcher with the ability to investigate specific behaviours of a CNN across all of its layers. For this work we focus our attention on classification-based CNNs and perform two styles of experiments using the proposed RRA framework. The first set of experiments revolves around better understanding RRA hyper-parameter selection and the impacts on the downstream metrics with regard to observed characteristics of a CNN. From this first set of experiments we determine the effects of adjusting specific RRA hyper-parameters, and we propose general guidelines for selecting these hyper-parameters. The second set of experiments investigates the impact of specific CNN design choices. To be more precise, we use RRA to investigate the consequences on a CNN's latent representation when training with and without data augmentations, and to understand the latent embedding symmetries across different pooled spatial resolutions. For each of these experiments RRA provides novel insights into the internal workings of a CNN. Using the insights from the pooled spatial resolution experiments we propose a novel CNN attention-based building block that is specifically designed to take advantage of key latent properties of a ResNet. We call the proposed building block the Scale Transformed Attention Condenser (STAC) module.
We demonstrate that the proposed STAC module not only improves a model's performance across a selection of model-dataset pairs, but that it does so with an improved performance-to-computational-cost tradeoff when compared to other CNN spatial attention-based modules of similar FLOPS or number of parameters.
  • Item
    Evaluating the Usefulness of Synthetic Data in Healthcare: Applications in Predictive Modeling and Privacy Protection
    (University of Waterloo, 2024-04-24) Basri, Mohammad Ahmed
    The advent of data-driven approaches in healthcare has opened new horizons for patient care, disease management, and medical research. However, one of the significant challenges is the availability of large-scale, high-quality datasets. Accessing health data that contains sensitive information requires lengthy approval processes and stringent restrictions. Synthetic data effectively addresses this dilemma by replicating the statistical properties of real datasets, offering a viable solution. Due to privacy concerns and regulatory restrictions associated with health data, there is a growing need for highly realistic synthetic health data, particularly in health data science initiatives. While significant advancements have been achieved in establishing recognized evaluation methods for synthetic data models, there remains a notable gap in understanding the optimal approaches to enhance the quality and usefulness of synthetic data. This thesis aims to bridge this gap by conducting a systematic evaluation of objective functions for hyperparameter tuning of synthetic data generation and studying the efficacy of synthetic data in predictive models. We evaluate synthetic data using three criteria: Fidelity, assessing how well it mirrors real-world data statistically; Utility, measuring its effectiveness for machine learning applications; and Privacy, evaluating the risk of re-identification. We examine the usefulness of synthetic data for the hyperparameter optimization process of predictive models, particularly in scenarios where access to real data is constrained. We found a notable correlation between model performance accuracy using real data and synthetic data, suggesting that parameters optimized with synthetic data are applicable to real data for optimal results. Our study confirms the feasibility of using synthetic data on external computing resources to optimize models, effectively addressing healthcare's computing constraints.
  • Item
    Evaluation of Passive Microwave-Based Sea Ice Edge and Marginal Ice Zone
    (University of Waterloo, 2024-04-24) Soleymani, Armina
    Sea ice is a vital factor in polar navigation, numerical weather prediction models, and climate change studies. It significantly influences the global climate, northern communities, and Earth’s ecosystems. The sea ice edge and marginal ice zone are important areas for monitoring, as they affect ship navigation, human activities, and marine habitats. Passive microwave instruments offer valuable tools for monitoring the Earth’s surface, regardless of solar illumination. This advantage is particularly prominent in polar regions, where harsh climate conditions, restricted accessibility, and polar darkness pose challenges to data collection. This thesis is dedicated to the analysis of the sea ice edge and the marginal ice zone obtained from passive microwave algorithms, with the aim of enhancing our understanding of these influential regions. The first research compares the sea ice edge derived from three passive microwave algorithms against Canadian Ice Service charts over the Eastern Canadian Arctic. It also introduces a novel measurement for edge displacement error. The findings demonstrate differences in the performance of various algorithms across different seasons. During the freeze-up period, there is an increase in edge displacement error values, attributed to thin ice conditions. In April, the study observed the widest range of edge displacement error values, which were linked to fluctuations in wind speed and air temperature. The second study focuses on a 40-year trend analysis of the Arctic marginal ice zone using the Bootstrap sea ice product, employing two definitions: one based on sea ice concentration and the other based on sea ice concentration anomaly. Comparative analysis shows consistent trends in marginal ice zone fraction, with the anomaly-based definition exhibiting higher values during transitional periods. 
Furthermore, change point detection analysis highlights an increase in marginal ice zone fraction after 2005 for the concentration-based definition and after 2007 for the anomaly-based definition, suggesting the influence of climate change on sea ice concentration and mobility. In the third investigation, two sea ice products, passive microwave and synthetic aperture radar, are utilized to delineate the marginal ice zone in the Greenland Sea using two distinct definitions. The anomaly-based definition reveals a broader spatial marginal ice zone region, capturing the variability in sea ice concentration resulting from ice growth. This definition also maintains consistency across both sea ice products. Additionally, the study underscores the consistency of synthetic aperture radar in detecting the marginal ice zone (regardless of the definition) and its reduced sensitivity to the sea ice concentration anomaly standard deviation threshold, compared to passive microwave data.
  • Item
    Preference and Performance-Based Adaptive Task Planning in Human-Robot Collaboration
    (University of Waterloo, 2024-04-23) Noormohammadi-Asl, Ali
    This thesis delves into a central challenge in human-robot collaboration (HRC): the adaptive task planning of robots to enhance team performance, fluency, and the human agent's perception of both the robot and the collaboration. This thesis tackles the challenge of proactive task planning and allocation in collaborative scenarios, involving a single human and a single robot working together to accomplish a task. Recognizing the existing gaps in the literature, our focus revolves around balancing human agents' leading/following preferences and their performance, with the aim of fostering collaboration while maintaining a high level of human perception of the robot. After an in-depth review of related work, we initiate our exploration with an online user study, in a simulation environment using a manipulator robot. This study is designed to evaluate the impact of the robot's planning strategy on participants' perception of the robot and collaboration. This study incorporates three distinct planning strategies: prioritizing the human's objectives, prioritizing the robot's objectives, and achieving a balance between both agents' objectives. The results guide our assessment of how the balancing strategy, in particular, can uphold both team performance and a high level of participants' perception of collaboration and the robot, in comparison to the other strategies. However, a limitation arises as the study employs fixed strategies, randomly assigned to participants, irrespective of their preference and performance. Building upon the results of the first user study, we address the limitations identified in the initial study by enabling the robot to estimate the human agent's leading/following preference. 
However, the human agent's preference is not the sole factor influencing the robot's decision-making process; the human agent's performance is also crucial for adjusting the team's overall performance, particularly in cases of the human agent's poor or suboptimal performance. Consequently, the robot also estimates the human agent's performance. Furthermore, the robot needs to be capable of updating the task state based on both agents' actions and mistakes. With an updated understanding of the human agent's performance, leading/following preference, and task state, the robot updates its plan for task allocation and scheduling to minimize collaboration costs. Next, we evaluate the adaptability of the task planning framework and algorithm in a simulation environment, demonstrating its effectiveness across various human performance and preference scenarios. Yet, recognizing the unique challenges posed by human participants, the complete evaluation of the algorithm's effectiveness requires real-world scenarios, considering uncertainties inherent in human behavior and decision-making. Subsequently, we tackle the challenges of implementing the task planning framework on a real robot, a mobile manipulator robot, within a carefully designed collaborative scenario. Providing details on the experimental setup and methodology, a system evaluation study highlights the robot's ability to adapt based on human behavior. Finally, we conduct a user study involving 48 participants, evaluating results from multiple perspectives, including participants' perception of the robot, tasks, and collaboration, participants' actions and performance, and the robot's actions and performance. Results from the study affirm the success of the task planning framework in achieving its objectives: enhancing team fluency by considering the human agents' preferences and performance while maintaining a high level of participants' perception of the robot and the human-robot collaboration. 
This thesis also explores participants' leading/following preferences in collaboration, revealing a dominant preference to lead the robot. This finding can assist robotics and autonomous systems designers in considering this factor in their designs. Additionally, we evaluated the influence of participants' leadership and followership styles on their collaboration, warranting further and more in-depth future studies. In summary, this thesis contributes a proactive task planning framework that takes into account both human leading/following preferences and performance, signifying an advancement in the field of human-robot collaboration. The validation through user studies offers valuable insights, laying the groundwork for future research and applications in the continually evolving domain of human-robot collaboration.
  • Item
    Adversarial Machine Learning and Defenses for Automated and Connected Vehicles
    (University of Waterloo, 2024-04-18) Zhang, Dayu
    This thesis delves into the realm of adversarial machine learning within the context of Connected and Automated Vehicles (CAVs), presenting a comprehensive study on the vulnerabilities and defense mechanisms against adversarial attacks in two critical areas: object detection and decision-making systems. The research first introduces a novel adversarial patch generation technique targeting the YOLOv5 object detection algorithm. It presents a comprehensive study of the different transformations and parameters that change the effectiveness of the patch. The patch is then implemented within the CARLA simulation environment to assess robustness under varied real-world conditions, such as changing weather and lighting. With all the transformations applied during generation, the patch reduces the confidence of YOLOv5 in detecting the stop sign by 70% compared to the original stop sign when the lighting conditions are good. However, when the lighting conditions are sub-optimal, for example in rainy weather, the patch reduces the confidence by only 38% because the patch itself is harder to detect. Overall, the optimized patch still shows a greater effect on detection evasion than a random noise patch under all environmental conditions. This part of the research showcases a novel way of generating adversarial patches and a new approach to testing the patches in an open-source simulator, CARLA, for better autonomous vehicle testing against adversarial attacks in the future. Simultaneously, this thesis investigates the susceptibility of Deep Reinforcement Learning (DRL) algorithms, in particular Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG), to black-box adversarial attacks executed through zeroth-order optimization methods such as ZO-SignSGD in a lane-changing scenario. The research first trains the policies with finely tuned hyper-parameters in the lane-changing environment, achieving high performance.
With a good policy as a base, the black-box attack successfully fools both algorithms by optimally changing the state value to force the policy to go straight while keeping the perturbation small compared to the original. While under attack, both DQN and DDPG are unable to perform, achieving average rewards of 108 and 45 compared to their original performance of 310 and 232, respectively. A preliminary study on the effect of adversarial defense is also performed, which shows resistance against the attack and achieves a slight increase in average reward. This part of the research uncovers significant vulnerabilities, demonstrating substantial performance degradation in DRL when used in the decision making of an autonomous vehicle. Finally, the study underscores the importance of enhancing the security and resilience of machine learning algorithms embedded in CAV systems. Through a dual focus on offensive and defensive strategies, including the exploration of adversarial training, this work contributes to the foundational understanding of adversarial threats in autonomous driving and advocates for the integration of robust defense mechanisms to ensure the safety and reliability of future autonomous transportation systems.
  • Item
    Enhancing Clinical Support for Breast Cancer with Deep Learning Models using Optimized Synthetic Correlated Diffusion Imaging
    (University of Waterloo, 2024-04-17) Tai, Chi-en, Amy
    Breast cancer is the second most common type of cancer in women in Canada and the United States, representing over 25% of all new female cancer cases. The prevalence of breast cancer continues to grow, affecting about 300,000 females in the United States in 2023. However, there are different levels of severity of breast cancer requiring different treatment strategies, and hence, grading breast cancer and estimating treatment prognosis have become vital clinical tasks. Recently, a new form of magnetic resonance imaging (MRI) called synthetic correlated diffusion imaging (CDIs) was introduced to address physical hardware limitations and showed considerable promise for clinical decision support for cancers such as prostate cancer when compared to current gold-standard MRI techniques. However, the efficacy of CDIs for other forms of cancer such as breast cancer has not been as well explored. This thesis explores and designs novel deep learning architectures for enhancing performance on two breast cancer clinical tasks (pathologic complete response prediction and Scarff-Bloom-Richardson grade classification) with optimized CDIs. A volumetric convolutional neural network is leveraged to learn volumetric deep radiomic features from a pre-treatment cohort, constructing a predictor based on the learned features for grade and post-treatment response prediction. The optimization of parameters for computing CDIs for breast cancer is also conducted through improving tumour delineation. The proposed approach was evaluated using the ACRIN-6698 study and compared against current gold-standard MRI modalities. For grade prediction, using optimized CDIs achieved a leave-one-out cross-validation accuracy of 95.79%, which is over 16% above the next best gold-standard MRI modality and over 6% above using the unoptimized CDIs.
Additionally, using optimized CDIs for post-treatment response prediction resulted in a leave-one-out cross-validation accuracy of 93.28%, which is over 8.5% above the next best gold-standard MRI modality and over 5.5% above using the unoptimized CDIs. The proposed approach demonstrates how using optimized CDIs can be used to enhance the performance of breast cancer clinical tasks, indicating its potential as a valuable tool for oncologists to enhance patient treatment within the breast cancer domain and beyond.
  • Item
    Task-Parameterized Transformer for Learning Gripper Trajectory from Demonstrations
    (University of Waterloo, 2024-02-26) Chen, Yinghan
    The goal of learning from demonstration or imitation learning is to teach the model to generalize across unseen tasks based on available demonstrations. This ability can be important for the stable performance of a robot in a chaotic environment such as a kitchen when compared to a more structured setting such as a factory assembly line. By leaving the task learning up to the algorithm, human teleoperators can dictate the action of robots without any programming knowledge and improve overall productivity in various settings. Due to the difficulty of manually collecting gripper trajectories in large quantities, a successful application of learning from demonstrations would have to be able to learn from a sparse number of examples while still providing a high degree of predicted trajectory accuracy. Inspired by the development of transformer models for large language model tasks such as sentence translation and text generation, I seek to modify the model for trajectory prediction. While there have been previous works that managed to train end-to-end models capable of taking images and contexts and then generating control output, those works rely on a massive quantity of demonstrations and detailed annotations. To facilitate the training process for a sparse number of demonstrations, we created a training pipeline that includes a DeepLabCut model for object position prediction, followed by the Task-Parameterized Transformer model for learning the demonstrated trajectories, and supplemented with data augmentations that allow the model to overcome the constraint of a limited dataset. The resulting model is capable of outputting the predicted end effector gripper trajectory and pose at each time step with better accuracy than previous works in trajectory prediction.
  • Item
    A Deep-Learning Framework for Detecting and Predicting Clinical Events Using Continuous, Multimodal Physiological Signals
    (University of Waterloo, 2024-02-20) Ross-Howe, Sara Anne
    There are an estimated 313 million surgeries performed worldwide each year. Even with significant clinical and technical advances in perioperative research, many patients experience a major complication during the first 30 days following surgery. In recent years, there has been significant advancement in wearable technology and digital health platforms to support remote patient monitoring. Research into machine learning models for predicting adverse clinical events has predominantly focused on utilizing static, derived vital metrics extracted from Electronic Health Record (EHR) systems. However, many limitations have been identified with developing machine learning models from EHR data. Deep learning offers a solution to these challenges by analyzing the physiological signals directly to organize and automatically extract progressive layers of features. Preliminary research in deep learning has just begun for biomedical signal analysis and has been predominantly limited to processing individual biometric channels and signal modalities. This dissertation presents a novel deep learning framework, called the BiometricNet, for processing continuous, multimodal physiological signals for the detection and prediction of adverse clinical events. In the initial signal pre-processing stage of the BiometricNet, an integrated Generative Multiscale Wavelet De-Noising Autoencoder (Ψ-GANDAE) is utilized to remove common noise patterns encountered during ambulatory signal collection. Signal segmentation and feature extraction are performed in the second processing phase of the BiometricNet, where ResNet and BiLSTM architectures leverage post-processing attention layers (RBLAN) to support multiple signal sensor formats (e.g., electrocardiograms, photoplethysmograms, respiration, temperature, and arterial blood pressure), and collation of multiple synchronized channels.
In the final stage of the BiometricNet, detection and prediction of adverse health events is achieved by a Siamese Neural Network (SNN), which produces a risk score based on the comparison of a baseline signal with samples taken through the patient event timeline. Important considerations addressed in the proposed framework include optimizing the use of available data, given the lack of large publicly available labelled datasets in healthcare, and addressing the imbalanced nature of clinical data, where there is a chronic scarcity of samples representing target adverse event conditions. The BiometricNet is flexible and adaptive to support application to a wide array of adverse clinical events, and it is demonstrated on continuous non-invasive blood pressure estimation from synchronized ECG and PPG signals, and myocardial infarction prediction from ECG Lead I, II, and III channels. Lastly, this research honours important considerations regarding AI ethics and network interpretability and transparency to facilitate regulatory approval processes and garner confidence from clinical practitioners.