UWSpace
UWSpace is the University of Waterloo’s institutional repository, providing a free, secure, and long-term home for research produced by faculty, students, and staff.
Depositing Theses/Dissertations or Research to UWSpace
Are you a Graduate Student depositing your thesis to UWSpace? See our Thesis Deposit Help and UWSpace Thesis FAQ pages to learn more.
Are you a Faculty or Staff member depositing research to UWSpace? See our Waterloo Research Deposit Help and Self-Archiving pages to learn more.

Communities in UWSpace
Select a community to browse its collections.
- The University of Waterloo institution-wide UWSpace community.
Recent Submissions
Numerical Investigation of Infrasound Generated by Clear-Air Turbulence (University of Waterloo, 2026-04-20). Drapeau, Christopher.

Clear-air turbulence (CAT) represents a major hazard to the aviation industry because it occurs without visible indicators and cannot be detected by pilots or onboard instruments. This study investigates whether CAT generates infrasonic emissions that could potentially be used for remote detection. High-resolution large-eddy simulations were performed using the Weather Research and Forecasting model to produce atmospheric environments associated with two CAT encounters: a mountain-wave event over Wyoming in 2020 and a shear-driven event over Illinois in 2023. Acoustic source terms were computed from the simulated flow fields using a hybrid acoustic analogy framework to estimate the acoustic pressure at distant observer locations. When acoustic sources were computed using velocity fluctuations relative to the mean flow, the CAT region in the Wyoming case produced acoustic emissions approximately 22–29 dB stronger than the background turbulence, revealing a clear acoustic energy enhancement associated with CAT. However, further investigation incorporating the mean-flow and thermodynamic sources demonstrated that the background acoustic field increased substantially due to amplification by the strong terrain-driven mean flow (Wyoming) and underlying convective processes (Illinois). These cases represent particularly energetic atmospheric conditions that reduce the apparent contrast of the CAT signal, producing only modest overall sound pressure level increases relative to the background. Importantly, the thermodynamic sources related to potential temperature remained the principal contribution within the CAT regions. This suggests that sharp potential temperature gradients and overturning motions, rather than turbulence-generated sound alone, play the primary role in CAT-related acoustic emissions.
These results demonstrate that CAT can generate measurable acoustic emissions even in realistic and complex atmospheric environments, supporting further investigation of infrasound as a potential remote-detection tool.

Robust Hardware-Assisted Malware Detection (University of Waterloo, 2026-04-20). Propp, Eli.

Malware detection using hardware performance counters (HPCs) offers a promising, low-overhead approach for monitoring program behaviour, as shown in prior work. However, a fundamental architectural constraint (only a limited number of hardware events can be monitored concurrently) creates a significant bottleneck, leading to detection blind spots. Prior work has primarily focused on optimizing machine learning models for a single, statically chosen event set, or on an ensemble of models over the same feature set. We argue that robustness requires diversifying not only the models but also the underlying feature sets (i.e., the monitored hardware events) in order to capture a broader spectrum of program behaviour. This observation motivates the following research question: can detection performance be improved by trading temporal granularity for broader coverage, via the strategic scheduling of different feature sets over time? To answer this question, this thesis proposes Hydra, a novel detection mechanism that partitions execution traces into time slices and learns an effective, stochastic schedule of feature sets and corresponding classifiers for deployment. By cycling through complementary feature sets, Hydra mitigates the limitations of a fixed monitoring perspective. Experimental evaluation shows that Hydra significantly outperforms state-of-the-art single-feature-set baselines, achieving at least a 19.32% improvement in F1 score and a 60.23% reduction in false positive rate.
These results underscore the importance of feature-set diversity and establish strategic multi-feature-set scheduling as an effective principle for robust, hardware-assisted malware detection.

Statistical Methods for Mitigating Bias from Confounding and Measurement Error with Complex Exposures (University of Waterloo, 2026-04-20). Wang, Xiaoya.

This thesis is concerned with methods for mitigating bias from confounding and measurement error with a semi-continuous exposure. The work is primarily motivated by the analysis of six longitudinal cohort studies investigating the effect of prenatal alcohol exposure (PAE) on childhood cognition. Prenatal alcohol exposure is reported as the average number of ounces of alcohol consumed each week during pregnancy; this is a semi-continuous variable with a point mass at zero, corresponding to expectant mothers who do not consume alcohol during their pregnancy. Throughout the following chapters, we develop novel approaches to estimate causal effects for semi-continuous exposures of this sort and propose new strategies for addressing measurement error and misclassification. These methods are designed to enhance the validity, accuracy, and reliability of causal estimates in epidemiological studies and other applications involving semi-continuous exposures.

Chapter 1 introduces the potential outcomes framework for causal inference and reviews key approaches for addressing confounding, including propensity score methods and related estimation strategies. It also summarizes core concepts in measurement error and misclassification, and introduces the motivating study.

In Chapter 2, we extend methods for causal inference with binary treatment indicators to handle semi-continuous exposure variables. The exposure distribution is semi-continuous, with a mass at zero (representing the unexposed sub-population) and a sub-density characterizing variation in the level of exposure among those exposed. We first propose a potential outcomes framework for a setting with a semi-continuous exposure, then develop a two-stage estimation procedure. In the first stage, the causal effect of the exposure level is assessed among exposed individuals using propensity score regression adjustment. In the second stage, the causal effect of the binary "exposure status" is evaluated using inverse probability weighted (IPW) and augmented inverse probability weighted (AIPW) estimating functions. We derive the large-sample properties of the estimators resulting from the various methods of analysis and construct joint confidence regions for the causal effects. Simulation studies confirm good finite-sample performance of the proposed estimators. We apply these new approaches to analyze data from the Detroit prenatal alcohol study.

In Chapter 3, we address the challenge of causal inference regarding drinking status and the effect of dose for multiple outcomes representing domains of cognitive function. A two-stage estimating equation approach is proposed for multiple outcomes, with large-sample properties derived for the resulting estimators. Homogeneity tests are developed to assess whether the causal effects of exposure status and the dose-response effects are the same across multiple outcomes. A global homogeneity test is also developed to assess whether the effect of exposure status (exposed/not exposed) and the dose-response effect of the continuous exposure level are each equal across all domains. The methods of estimation and testing are rigorously evaluated in simulation studies and applied to a motivating study on the effects of prenatal alcohol exposure on childhood cognition as measured by executive function (EF), academic achievement in math, and learning and memory (LM).

In Chapter 4, we develop likelihood-based methods to correct for the effect of a semi-continuous exposure subject to both misclassification of exposure status and measurement error in the exposure level.
Motivated by repeated maternal self-reports of alcohol use collected during pregnancy, we specify a two-part measurement error model in which the binary indicator of any exposure may be misclassified and the log-transformed dose among the exposed is measured with error. Treating the true exposure components as latent, we derive two estimation strategies: a two-stage approach that estimates the exposure error process using the replicate data and then corrects the outcome model, and a joint approach that simultaneously estimates all model components using an EM algorithm. We establish large-sample inference for the estimators, extend the framework to multi-cohort studies with formal homogeneity tests to guide evidence synthesis across cohorts, and evaluate performance in simulation studies. The proposed methods are illustrated using data from two Pittsburgh prenatal alcohol cohorts, yielding corrected effect estimates and tests that inform whether pooling across cohorts is appropriate. Finally, Chapter 5 summarizes the contributions of this thesis and outlines directions for future research.

Equilibrium Passive Sampling of Per- and Polyfluoroalkyl Substances (PFAS): Design, Validation, Performance Evaluation, and Cross-Environment Application (University of Waterloo, 2026-04-20). Medon, Blessing.

Per- and polyfluoroalkyl substances (PFAS) are persistent contaminants, some of which are mobile in aqueous systems, making quantification of their freely dissolved concentrations important for evaluating transport, partitioning, and exposure potential. Conventional grab sampling provides a snapshot in time and may disturb solid–water equilibria. Equilibrium-based passive sampling offers an alternative approach to estimate the freely dissolved fraction, but its performance for PFAS across different water chemistry conditions requires further evaluation.

The overarching goal of this research was to develop and validate equilibrium-based passive sampling for quantifying freely dissolved PFAS in aqueous systems. The work addressed four questions:
1. Can the concentration of freely dissolved PFAS be estimated using an equilibrium sampler with a non-sorptive receiving phase?
2. How do sampler materials and matrix salinity influence PFAS adsorption, diffusion, and equilibration?
3. How do PFAS physicochemical properties affect their diffusion across sampler membranes?
4. Can performance reference compounds (PRCs) be used to estimate PFAS uptake by equilibrium passive samplers?

The uptake of PFAS by a peeper sampler was evaluated through laboratory and field experiments to assess its suitability for monitoring anionic PFAS in surface water and sediment porewater (Chapter 3). Results indicated that PFAS uptake was driven by diffusion through a polycarbonate membrane used as the sampling window, and concentrations measured by the sampler were generally comparable (±30%) to those in grab samples. Material screening experiments further indicated that peepers made of polycarbonate membranes and high-density polyethylene (HDPE) containers are suitable for freshwater deployment, whereas silver membranes and stainless steel containers may be better suited to saline water, where PFAS adsorption to sampler surfaces is more pronounced (Chapters 4 and 5). Performance reference compounds (PRCs), including isotopically labelled PFAS, were also evaluated as tracers of native PFAS uptake and equilibration. Their release kinetics were comparable to those of the native PFAS, supporting their use to confirm equilibrium and quantify non-equilibrium under both freshwater and saline conditions. However, under saline conditions, stronger interactions between PFAS and silver membranes were observed, leading to slower uptake, particularly for longer-chain compounds (C ≥ 9) (Chapters 4 and 5).
A regenerated-cellulose dialysis bag (RCDB) sampler (Chapter 5) was also evaluated as an equilibrium sampler for PFAS in both freshwater and synthetic seawater. Equilibrium was reached for C4–C9 compounds within 5–10 days, and the measured concentrations were within ±20% of those in grab samples. Overall, the results indicate that equilibrium passive sampling can be applied to quantify freely dissolved PFAS across a range of aqueous environments. The roles of sampler material, salinity, and PFAS physicochemical properties in controlling uptake and equilibration were clarified, supporting further application of equilibrium sampling for PFAS monitoring.

Multi-Layer OTN Simulation and LLM-Driven Root Cause Analysis: From Alarm Propagation to Reinforcement-Optimized Diagnostic Agents (University of Waterloo, 2026-04-20). Wang, Shihang.

Optical Transport Networks (OTNs) generate massive alarm storms when faults occur, as alarms propagate along service paths and across functional blocks, obscuring the underlying root cause. Since operators primarily observe electrical-layer OTU/ODU alarms and telemetry metrics (BBE, BBER, ES), practical failure localization requires understanding how alarms are shaped and propagated by termination, adaptation, and supervisory functions. This thesis presents an end-to-end framework for OTN fault simulation and LLM-based root-cause analysis. The electrical-layer-centric simulator decomposes each network element into typed functional boards (Tributary, XCON, Line, OA, OD, OM, FIU) connected by directed dependency edges along the service function chain. An 81-rule engine drives alarm propagation following ITU-T G.798 semantics, including AIS/BDI signaling and layer-selective regenerator boundary behavior, while mapping seven failure types to distinct temporal metric profiles (step, ramp, step-recovery, burst) so that alarm flows and metric shapes are jointly available and causally aligned.
A multi-failure cascade engine extends this to concurrent failure scenarios via BFS-driven cascade resolution. Building on the simulator, a two-stage LLM training pipeline combines Supervised Fine-Tuning (SFT) on Qwen 2.5-7B with LoRA adapters and Group Relative Policy Optimization (GRPO) on Qwen 2.5-3B using composite reward functions. A ReAct agent framework wraps the fine-tuned model with five diagnostic tools; a category-based triage layer routes queries to specialist prompts (Fiber, XCON, or Line), narrowing the search space from seven candidates to at most three. Evaluation on 147 test examples per split shows that the triage-augmented agent achieves 96.6% event accuracy on in-distribution data and 97.3% on out-of-distribution data, with perfect board-level localization and end-to-end scores of 81.5% (IID) and 93.5% (OOD).
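The BFS-driven alarm cascade described in the OTN abstract above can be illustrated with a minimal sketch. This is not the thesis simulator: the board topology, alarm names, and the single-root-alarm convention below are invented for illustration only.

```python
"""Illustrative sketch: BFS-style alarm propagation over a directed
board-dependency graph. Topology and alarm names are assumptions."""
from collections import deque

# Hypothetical directed dependency edges along a service function chain:
# a fault on a board raises alarms on every board downstream of it.
DOWNSTREAM = {
    "Tributary": ["XCON"],
    "XCON": ["Line"],
    "Line": ["OM"],
    "OM": ["FIU"],
    "FIU": [],
}

def cascade(root_faults):
    """Return {board: alarm} reachable from the seeded faults via BFS.
    Fault sites report a primary alarm; downstream boards report AIS."""
    alarms, queue = {}, deque()
    for board in root_faults:
        alarms[board] = "LOS"        # primary alarm at the fault site
        queue.append(board)
    while queue:
        board = queue.popleft()
        for nxt in DOWNSTREAM[board]:
            if nxt not in alarms:    # visit each board at most once
                alarms[nxt] = "AIS"  # propagated maintenance signal
                queue.append(nxt)
    return alarms

print(cascade({"XCON"}))
# Boards upstream of the fault (here, Tributary) stay silent; every board
# downstream of XCON carries a propagated AIS.
```

A root-cause analyzer then inverts this picture: the board whose alarm cannot be explained by any upstream neighbour is the candidate fault site.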
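The Hydra abstract above trades temporal granularity for coverage by scheduling different HPC feature sets across time slices. The sketch below shows the idea only; the event names, the weighted-sampling policy, and the majority-vote aggregation are assumptions, not the thesis design.

```python
"""Illustrative sketch: time-sliced scheduling of multiple HPC feature
sets with per-set classifiers. All names and policies are assumptions."""
import random

# Hypothetical feature sets: each tuple lists events that could be
# monitored concurrently (hardware limits the counters per slice).
FEATURE_SETS = [
    ("branch-misses", "cache-misses", "instructions"),
    ("dTLB-load-misses", "LLC-loads", "cycles"),
    ("iTLB-load-misses", "page-faults", "context-switches"),
]

def schedule(num_slices, weights, rng):
    """Pick one feature-set index per time slice, stochastically."""
    return [rng.choices(range(len(FEATURE_SETS)), weights=weights)[0]
            for _ in range(num_slices)]

def classify(trace, sched, classifiers):
    """Apply the classifier matched to each slice's feature set and
    flag the trace as malicious on a majority of per-slice votes."""
    votes = [classifiers[fs_idx](trace[t]) for t, fs_idx in enumerate(sched)]
    return sum(votes) > len(votes) / 2

rng = random.Random(0)
sched = schedule(6, weights=[0.5, 0.3, 0.2], rng=rng)
# Dummy classifiers: a slice is "suspicious" if its counts sum past a bound.
clfs = [lambda counts: sum(counts) > 10] * len(FEATURE_SETS)
trace = [[1, 2, 3]] * 6          # six slices of fake counter readings
print(sched, classify(trace, sched, clfs))
```

Cycling the schedule this way means no single evasion strategy can hide from every monitored event set, which is the robustness argument the abstract makes.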
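The second-stage IPW contrast mentioned in the semi-continuous exposure abstract above has a compact textbook form that may help situate the method. This is a generic Horvitz-Thompson sketch with an invented toy dataset and assumed-known propensity scores, not the thesis estimator.

```python
"""Illustrative sketch: IPW estimate of E[Y(1)] - E[Y(0)] for a binary
exposure status. Toy data and known propensities are assumptions."""

def ipw_effect(y, z, e):
    """Weight exposed outcomes by 1/e and unexposed ones by 1/(1-e),
    where e is the propensity P(Z=1 | covariates), then difference."""
    n = len(y)
    mean_exposed = sum(zi * yi / ei for yi, zi, ei in zip(y, z, e)) / n
    mean_unexposed = sum((1 - zi) * yi / (1 - ei)
                         for yi, zi, ei in zip(y, z, e)) / n
    return mean_exposed - mean_unexposed

# Toy data: outcome y, exposure indicator z, propensity e.
y = [3.0, 2.0, 5.0, 4.0]
z = [1, 0, 1, 0]
e = [0.5, 0.5, 0.5, 0.5]
print(ipw_effect(y, z, e))
```

The AIPW variant in the abstract augments these weighted terms with an outcome-regression correction, which keeps the estimator consistent if either the propensity or the outcome model is correctly specified.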