Computer Science
Permanent URI for this collection: https://uwspace.uwaterloo.ca/handle/10012/9930
This is the collection for the University of Waterloo's Cheriton School of Computer Science.
Research outputs are organized by type (e.g., Master Thesis, Article, Conference Paper).
Waterloo faculty, students, and staff can contact us or visit the UWSpace guide to learn more about depositing their research.
Recent Submissions
Item 2D Surface-Only Liquid-Solid Coupling (University of Waterloo, 2025-01-06) Kim, Clara
Liquid simulations typically involve solving the fluid equations over the simulated liquid domain via some volumetric discretization of the domain. Consequently, the majority of schemes which aim to simulate the interaction between liquids and freely moving solids are built assuming a volumetrically discretized liquid model. However, storing and manipulating a surface mesh rather than a volumetric discretization, on top of allowing us to avoid storing interior volume data, has the potential to reduce the number of unknowns in the systems necessary to resolve fluid flow. Motivated by this potential, we present a method for simulating the two-way coupled interactions between solid rigid bodies and an inviscid liquid, where the liquid domain is represented and simulated entirely by its surface. Our work builds off of the surface-only liquids method first proposed by Da et al. [2016]. We are concerned with the 2D version of the liquid-solid coupling problem. As such, the liquid surface is represented as a series of point vertices connected by line segment edges, with the velocity data stored at the vertices. The surface-only liquid simulation method integrates outside forces, such as forces caused by scripted solids in contact with the liquid, by performing a boundary element method (BEM) solve for fluid pressures using surface tension and solid velocities to set boundary conditions. We perform liquid-solid coupling in a single unified solve by modifying this force integration step to account for solids with dynamic velocities by modeling the momentum exchange that occurs at the solid-liquid interface. We show several examples demonstrating our method’s ability to handle liquid-rigid body dynamics, as well as validate our method against analytical solutions derived using the fluid mechanics concept of added mass.
We also demonstrate our method’s ability to support multiple solid rigid bodies interacting with each other through the liquid domain without need for direct contact between the solids. We hope that our work encourages further investigations into the surface-only liquids framework in the future, allowing for the simulation of an even wider range of interesting liquid phenomena using only a surface discretization of the simulated domain.
Item Harnessing Generalist LLMs for Diverse Objective and Subjective NLP Tasks (University of Waterloo, 2024-12-17) Sahu, Gaurav
Recent advances in natural language processing (NLP), particularly in the subspace of large language modeling, have led to a major paradigm shift. Large language models (LLMs), like the GPT and LLaMA family of models, are trained on a massive Internet corpus covering data from a gamut of diverse domains. In addition, the billions of parameters in these models also invoke emergent capabilities in them, leading to strong improvements across diverse NLP tasks without much task-specific tuning; however, effectively harnessing the knowledge of these generalist models for real-world data still remains a major challenge as the LLMs can produce inconsistent, biased, and unsatisfactory outputs. In this thesis, we propose task-specific strategies for effectively leveraging LLMs for a number of challenging NLP tasks, such as (low-resource) text classification, text summarization, modeling artistic preferences of creative individuals, and automated data analysis. Our results suggest that LLMs can serve as excellent data generators and data labelers for well-defined single-step tasks like classification and summarization, crucially in data-scarce settings, where models trained on LLM-generated data achieved performance competitive with oracle models trained on much larger labeled training data.
On the other hand, for more subjective tasks like modeling artistic preferences among creative individuals, we demonstrate that while LLMs might not be able to discern between the likes and dislikes of artists, they can be effective in extracting key linguistic and poetic properties from text that can later be employed to infer artistic preferences among different individuals. Lastly, we also evaluate the effectiveness of LLMs in multi-step tasks that require the LLM to perform multiple tasks in tandem without compromising performance for individual tasks. Overall, our work draws critical insights into the strengths and shortcomings of LLMs for a wide range of subjective and objective NLP tasks and includes meaningful suggestions for the research community to harness LLMs for those tasks effectively.
Item Developer-Applied Accelerations in Continuous Integration: A Detection Approach and Catalog of Patterns (University of Waterloo, 2024-12-16) Yin, Mingyang
Continuous Integration (CI) provides a feedback loop for the change sets that developers produce. It is crucial that CI processes change sets quickly to provide timely feedback to developers and enable teams to release software updates rapidly. Prior work has made several advances in proposing automated approaches to speed up CI builds. While these approaches have been broadly adopted, CI platforms are flexible enough to enable teams to produce custom strategies to optimize or omit unnecessary or redundant tasks (i.e., developer-applied accelerations). Exploring developer-applied accelerations and identifying recurrent patterns within them may enable broader reuse and can inform recommendations to enhance software development efficiency. In this thesis, we set out to detect and catalog developer-applied CI accelerations. First, we propose clustering, rule-based, and ensemble approaches to detect developer-applied accelerations in a dataset of 2,896 CircleCI build jobs, which achieve an F1-score of up to 0.64.
We then conduct a qualitative analysis of the detected developer-applied accelerations to create a detailed catalog of 14 patterns spanning four categories of purposes, 16 patterns spanning five categories of mechanisms, and three categories of magnitudes, from which we infer actionable implications for both the consumers and the providers of CI platforms. Developers can leverage our identified patterns to audit their CI pipelines for inefficiencies, such as redundant invocations of costly external services and rebuilds triggered by minor corrections. Additionally, developers can use our identified patterns to create templates that detect non-impactful changes to specific files, such as .yml and .json.
Item Optimizing Automated Bug Localization for Practical Use (University of Waterloo, 2024-12-13) Chakraborty, Partha
A considerable share of resources and developers' efforts is focused on addressing software bugs. Identifying the root causes of these bugs within the codebase is crucial for their resolution. Automated tools for bug localization aim to assist in this process. However, their effectiveness is often limited, leading to low adoption rates. This low adoption rate indicates the disparity between research goals and developers' expectations, emphasizing the need for improvements in bug localization tools. This thesis explores and addresses the challenges faced by developers and tool-builders in implementing practical bug localization tools. Our research focuses on understanding developers' expectations and enhancing the tools' overall effectiveness. Initially, we conduct a mixed-method empirical study to understand developers' expectations. The study reveals that while developers are willing to use bug localization tools, they have concerns related to accuracy and potential leakage of intellectual property. We found that only 27.5% of developers are familiar with these tools.
The study indicates that developers need more reliable performance, better integration, flexibility, transparency, and contextual understanding to increase adoption and effectiveness. We also examine performance issues in bug localization tools, particularly with their base, the embedding model. We found that key factors such as pre-training strategies, data familiarity, and input sequence length in embedding techniques significantly affect performance. Our findings show that using project-specific data and pre-training methods like ELECTRA can improve model performance by 25.9%. Additionally, we explore the use of reinforcement learning (RL) in bug localization and propose an RL agent called RLocator. RLocator learns from developer feedback, making it suitable for low-data environments. We also propose BLAZE, an efficient bug localization technique for cross-project and cross-language settings. By using dynamic chunking, a technique that dynamically adjusts the size of the input data to the model, and hard example learning, BLAZE achieves up to a 144% improvement in Mean Average Precision (MAP) compared to previous tools. In conclusion, our findings highlight the shortcomings in the adaptability and efficiency of current tools. We advocate for highly adaptable cross-language, cross-project bug localizers to enhance adoption rates among developers. By leveraging our observations, curated datasets, and proposed methods, tool builders can create more user-friendly bug localization tools for software developers, inspiring a new wave of innovation in this field.
Item QAVSA: Question Answering Using Vector Symbolic Algebras (University of Waterloo, 2024-11-29) Laube, Ryan
With the advancement of large pretrained language models (PLMs), many question answering (QA) benchmarks have been developed in order to evaluate the capabilities of these models.
Augmenting PLMs with external knowledge in the form of Knowledge Graphs (KGs) has been a popular method to improve their question-answering capabilities, and a common method to incorporate KGs is to use Graph Neural Networks (GNNs). As an alternative to GNNs for augmenting PLMs, we propose a novel graph reasoning module using Vector Symbolic Algebra (VSA) graph representations and a k-layer MLP. We demonstrate that our VSA-based model performs as well as QA-GNN, a model combining a PLM and a GNN-module, on 3 multiple-choice question answering (MCQA) datasets. Our model has a simpler architecture than QA-GNN, converges 37% faster during training, and has constant memory requirements as the size of the knowledge graphs increases. Furthermore, a novel method to analyze the VSA-based outputs of QAVSA is presented.
Item Statistical Foundations for Learning on Graphs (University of Waterloo, 2024-11-27) Baranwal, Aseem
Graph Neural Networks are one of the most popular architectures used to solve classification problems on data where entities have attribute information accompanied by relational information. Among them, Graph Convolutional Networks and Graph Attention Networks are two of the most popular GNN architectures. In this thesis, I present a statistical framework for understanding node classification on feature-rich relational data. First, I use the framework to study the generalization error and the effects of existing neural network architectures, namely, graph convolutions and graph attention on the Contextual Stochastic Block Model in the regime where the average degree of a node is at least order log squared n in the number of nodes n. Second, I propose a notion of asymptotic local optimality for node classification tasks and design a GNN architecture that is provably optimal in this notion, for the sparse regime, i.e., average degree O(1).
In the first part, I present a rigorous theoretical understanding of the effects of graph convolutions in neural networks through the node classification problem of a non-linearly separable Gaussian mixture model coupled with a stochastic block model. First, I identify two quantities corresponding to the signal from the two sources of information: the graph, and the node features, followed by a result that shows that a single graph convolution expands the regime of the distance between the means where multi-layer networks can classify the data by a factor of up to one over square root of the expected degree of a node. Second, I show that with a slightly stronger graph density, two graph convolutions improve this factor to up to 1/sqrt{n}, where n is the number of nodes in the graph. This set of results provides both theoretical and empirical insights into the performance of graph convolutions placed in different combinations among the layers of a neural network, concluding that the performance is mutually similar for all combinations of the placement. In the second part, the analysis of graph attention is provided, where the main result states that in a well-defined ``hard'' regime, every attention mechanism fails to distinguish the intra-class edges from the inter-class edges. In addition, if the signal in the node attributes is sufficiently weak, graph attention convolution cannot perfectly classify the nodes even if the intra-class edges are separable from the inter-class edges. In the third part, I study the node classification problem on feature-decorated graphs in the sparse setting, i.e., when the expected degree of a node is O(1) in the number of nodes, in the fixed-dimensional asymptotic regime, i.e., the dimension of the feature data is fixed while the number of nodes is large. Such graphs are typically known to be locally tree-like. 
Here, I introduce a notion of Bayes optimality for node classification tasks, called asymptotic local Bayes optimality, and compute the optimal classifier according to this criterion for a fairly general statistical data model with arbitrary distributions of the node features and edge connectivity. The optimal classifier is implementable using a message-passing graph neural network architecture. This is followed by a result that precisely computes the generalization error of this optimal classifier, and compares its performance statistically against existing learning methods on a well-studied data model with naturally identifiable signal-to-noise ratios (SNRs). We find that the optimal message-passing architecture interpolates between a standard MLP in the regime of low graph signal and a typical graph convolutional layer in the regime of high graph signal. Furthermore, I provide a corresponding non-asymptotic result that demonstrates the practical potential of the asymptotically optimal classifier.
Item Reinforcement Learning for Solving Financial Problems (University of Waterloo, 2024-11-26) Wang, Lufan
This thesis explores the application of reinforcement learning (RL) to address two important financial problems: risk management and optimal trade execution. In risk management, we aim to balance returns with associated risks. To achieve this, we propose an enhanced RL model that integrates a dynamic Conditional Value at Risk (CVaR) measure. By leveraging distorted probability measures, CVaR allows the RL agent to emphasize worst-case scenarios, ensuring that potential losses are accounted for while optimizing long-term returns. Our method substantially reduces the model’s training time by efficiently reusing computation results, significantly lowering computational overhead. Furthermore, it optimizes the balance between exploration and exploitation. This approach leads to more robust decision-making in uncertain environments and a better overall return.
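To illustrate the risk measure named in this abstract, here is a minimal Python sketch of a static, empirical CVaR computed over a sample of losses. It is not the dynamic CVaR formulation or the RL model from the thesis. The function name `empirical_cvar` and the confidence-level parameter `alpha` are assumptions made for this example only.

```python
import math


def empirical_cvar(losses, alpha=0.95):
    """Average of the worst (1 - alpha) fraction of the sampled losses.

    Illustrative static estimator only (hypothetical helper, not from the
    thesis). Larger values in `losses` mean worse outcomes.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)  # worst losses first
    # Round before ceil so floating-point noise such as 1.9999999999999996
    # or 5.000000000000004 does not change the tail size.
    tail_size = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:tail_size]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # Mean of the two largest losses in the sample.
    print(empirical_cvar(sample, alpha=0.8))
```

With `alpha = 0` the tail is the whole sample, so the estimate reduces to the plain mean. Raising `alpha` shifts the weight toward the worst outcomes, which is the emphasis on worst-case scenarios that the abstract describes.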
For optimal trade execution, we formulate a flexible RL-based framework capable of dynamically adjusting to changing market conditions. Our model not only replicates the results of the Almgren-Chriss model in linear environments but also demonstrates superior performance in more complex, nonlinear scenarios where traditional methods like Almgren-Chriss face challenges.
Item Safe Memory Reclamation Techniques (University of Waterloo, 2024-11-14) Singh, Ajay
This dissertation presents three paradigms to address the challenge of concurrent memory reclamation, manifesting as use-after-free errors that arise in concurrent data structures using non-blocking techniques. Each paradigm aligns with one of our three objectives for practical and safe memory reclamation algorithms. Objective 1: Design memory reclamation algorithms that are fast, have a bounded memory footprint, and are easy to use, requiring neither intrusive changes to data structures nor specific architecture or compiler support. These algorithms should also deliver consistent performance across various workloads and be applicable to a wide range of data structures. To achieve this, we introduce the neutralization paradigm with the NBR (Neutralization-Based Reclamation) algorithm and its enhanced version, NBR+ (Optimized Neutralization-Based Reclamation). These algorithms use POSIX signals and a lightweight handshaking mechanism to facilitate safe memory reclamation among threads. By relying solely on atomic reads and writes, they achieve bounded garbage and high performance with minimal overhead compared to existing algorithms. They are straightforward to implement, similar in reasoning and programming effort to two-phased locking, and compatible with numerous data structures. Objective 2: Eliminate the asymmetric synchronization overhead in existing reclamation algorithms, which often incur costly memory fences while eagerly publishing reservations, as seen in algorithms like hazard pointers and hazard eras.
We propose the reactive synchronization paradigm, implemented through deferred memory reclamation and POSIX signals. This mechanism enables threads to privately track memory references (or reservations) and share this information on demand, using the publish-on-ping algorithm. This approach serves as a drop-in replacement for hazard pointers and hazard eras and includes a variant (EpochPOP) that combines epochs with the robustness of hazard pointers to approach the performance of epoch-based reclamation. Objective 3: Completely eliminate the batching common in current reclamation algorithms to allow immediate memory reclamation, similar to sequential data structures, while maintaining high performance. We introduce Conditional Access, a hardware-software co-design paradigm implemented in a graphite multi-core simulator. This paradigm leverages cache coherence to enable efficient detection of potential use-after-free errors without explicit shared-memory communication or additional coherence traffic. Conditional Access provides programmers with hardware instructions for immediate memory reclamation with minimal overhead in optimistic data structures. To validate our claims, we designed and conducted extensive benchmark tests to evaluate all proposed algorithms on high-end machines under various scenarios. 
We paired these algorithms with several real-world concurrent data structures, representing various memory access patterns, and compared their time and space efficiency against numerous state-of-the-art memory reclamation algorithms, demonstrating significant improvements.
Item The Role of Modularization in Minimizing Vulnerability Propagation and Enhancing SCA Precision (University of Waterloo, 2024-10-24) Abdollahpour, Mohammad Mahdi
In today’s software development landscape, the use of third-party libraries is near-ubiquitous; leveraging third-party libraries can significantly accelerate development, allowing teams to implement complex functionalities without reinventing the wheel. However, one significant cost of reusing code is security vulnerabilities. Vulnerabilities in third-party libraries have allowed attackers to breach databases, conduct identity theft, steal sensitive user data, and launch mass phishing campaigns. Notorious examples of vulnerabilities in libraries from the past few years include log4shell, solarwinds, event-stream, lodash, and equifax. Existing software composition analysis (SCA) tools track the propagation of vulnerabilities from libraries through dependencies to downstream clients and alert those clients. Due to their design, many existing tools are highly imprecise: they create alerts for clients even when the flagged vulnerabilities are not exploitable. Library developers occasionally release new versions of their software with refactorings that improve modularity. In this work, we explore the impacts of modularity improvements on vulnerability detection. In addition to generally improving the nonfunctional properties of the code, refactoring also has several security-related beneficial side effects: (1) it improves the precision of existing (fast and stable) SCAs; and (2) it protects from vulnerabilities that are exploitable when the vulnerable code is present and not even reachable, as in gadget chain attacks.
Our primary contribution is thus to quantify, using a novel simulation-based counterfactual vulnerability analysis, two main ways that improved modularity can boost security. We propose a modularization method using a DAG partitioning algorithm, and statically measure properties of systems that we (synthetically) modularize. In our experiments, we find that modularization can improve precision of Software Composition Analysis (SCA) tools to 71%, up from 35%. Furthermore, migrating to modularized libraries results in 78% of clients no longer being vulnerable to attacks referencing inactive dependencies. We further verify that the results of our modularization reflect the structures that are already implicit in the projects (but for which no modularity boundaries are enforced).
Item A Security Analysis of the Multi-User Ecosystem in Android Framework (University of Waterloo, 2024-10-23) Khan, Muhammad Shahpar Nafees
The Android framework’s multi-user ecosystem introduces significant security challenges, particularly in the enforcement of user-specific access control checks. While previous research has highlighted flaws in Android’s access control mechanism, these efforts often overlook the complexities introduced by vendor customization and the unique demands of a multi-user environment. In this thesis, we conduct a systematic analysis of the Android Open Source Project (AOSP), identifying key patterns regulating multi-user access control implementations. We use these patterns to develop MVP, a static analysis tool that examines vendor ROMs for missing user-specific access control checks in custom ROMs. For example, our analysis reveals that Android’s multi-user environment is susceptible to cross-user attacks; sensitive data can be shared between profiles, and non-privileged users can manipulate privileged system settings.
These findings underscore the need for rigorous enforcement of access control mechanisms to mitigate security risks in Android’s multi-user environment.
Item Query Complexity of Recursively Composed Functions (University of Waterloo, 2024-10-21) Al-Dhalaan, Bandar
In this work, we explore two well-studied notions of randomized query complexity: bounded-error randomized ($\R(f)$) and zero-error randomized ($\R_0(f)$). These have their natural analogues from the classical model of computation, $\R$ corresponding to BPP or "Monte Carlo" algorithms and $\R_0$ to ZPP or "Las Vegas" algorithms. For a query complexity measure $M$, one can define the composition limit of $M$ on $f$ by $M^*(f) = \lim_{k \to \infty} \sqrt[k]{M(f^k)}$. The composition limit is a useful way to understand the asymptotic complexity of a function with respect to a specific measure (e.g., if $M(f) = O(1)M(g)$, then $M^*(f) = M^*(g)$). We show that under the composition limit, Las Vegas algorithms can be reduced to Monte Carlo algorithms in the query complexity world. Specifically, $\R_0^*(f) = \max(\C^*(f), \R^*(f))$ for all possibly-partial boolean functions $f$. This has wide-reaching implications for the classical query complexity of boolean functions that are still open. For example, this result implies that any bounded-error algorithm for recursive 3-majority can be converted into a zero-error algorithm with no additional cost (i.e., $\R^*(\text{3-MAJ}) = \R_0^*(\text{3-MAJ})$). Furthermore, we explore one possible generalization of the recursive 3-majority problem itself, by analyzing 3-majority as a special case of a combinatorial game we call Denial Nim.
Item Automated Generation of Dynamic Occlusion-Caused Collisions (University of Waterloo, 2024-10-17) Dykhne, Eli-Henry
Dynamic occlusions (occlusions caused by other moving road objects) pose some of the most difficult challenges for autonomous driving systems (ADS).
Validating the robustness of ADSs to safety critical dynamic occlusions is a difficult task due to the rarity of such scenarios in recorded driving logs. We provide a novel typology of dynamic occlusion scenarios involving vehicles, as well as a framework for ADS safety validation in the presence of dynamic occlusions. Our framework allows for the generation of a diverse set of dynamic occlusion-caused collisions (OCCs) across a wide variety of intersections. We provide results demonstrating that our technique achieves higher generation efficiency and diversity of OCCs than prior works, while being applicable to a wide range of intersections. In this work, we present our generation technique and provide a detailed analysis of the variety and quality of our generated scenarios, as measured by typology coverage, scenario severity, and robustness to faster reaction times.
Item Evaluating Container-based and WebAssembly-based Serverless Platforms (University of Waterloo, 2024-10-04) Monum, Abdul
Serverless computing, often also referred to as Function-as-a-Service (FaaS), allows developers to write scalable event-driven applications while the cloud provider manages the burden of provisioning and maintaining compute resources. Serverless computing is enabled using virtualized sandboxes like containers or lightweight virtual machines that form the execution units for FaaS applications. However, applications suffer from expensive startup latency (cold starts) due to the compulsory overhead of creating a sandbox and initializing the application code and its dependencies. FaaS platforms keep function executors warm in memory to avoid this latency, which incurs additional memory overhead on the system. Recently, WebAssembly (Wasm) has emerged as a promising alternative for FaaS applications with its lightweight sandboxing, providing negligible startup delays and reduced memory footprint.
However, Wasm applications experience slower execution speeds compared to native execution. This thesis presents a performance evaluation of WebAssembly-based serverless computing in comparison with container-based serverless platforms using analytical performance models and their experimental evaluation. The performance model for container-based serverless platforms is taken from existing literature, reflecting the behavior of commercial platforms like AWS Lambda, IBM Cloud Functions, and Azure Functions. For WebAssembly-based serverless platforms, this thesis proposes a new performance model based on queueing systems. These models are verified experimentally using open-source platforms: Apache OpenWhisk for containers and Spin for WebAssembly. A suite of representative serverless applications is used to validate the models. The comparison of the performance models with experimental results highlights the trade-offs between container-based and WebAssembly-based serverless platforms, providing insights into their respective efficiencies in handling serverless workloads.
Item Inherent Limitations of Dimensions for Characterizing Learnability (University of Waterloo, 2024-09-24) Lechner, Tosca Kerstin
The fundamental theorem of statistical learning establishes the equivalence between various notions of both agnostic and realizable Probably Approximately Correct (PAC) learnability for binary classification with the finiteness of the VC dimension. Motivated by these foundational results, this work explores whether similar characterizations of learnability and sample complexity can be defined through dimensions in other learning models. We introduce the concept of a scale-invariant dimension and demonstrate that for a range of learning tasks, including distribution learning, any such dimension fails to capture learnability.
Additionally, we define Computable PAC (CPAC) learning and show that not all classes are properly computably PAC learnable, highlighting a significant limitation in classical PAC frameworks. Our analysis further reveals that, unlike binary classification, realizable learning does not always imply agnostic learning in settings such as distribution learning and CPAC learning. Finally, we address learning under conditions of data corruption, whether from adversaries or self-manipulating agents, and show that the extent of prior knowledge about manipulation capabilities can significantly affect learnability. We address various ways of overcoming uncertainty in manipulation capabilities by either learning manipulation capabilities from distribution shifts or further oracle access or by allowing abstentions. We also show that this uncertainty in manipulation capabilities can sometimes be overcome without additional oracle access in the realizable case while the agnostic case requires such resources. These findings further underscore that in order to characterize agnostic learnability it is not always sufficient to understand the realizable case.
Item BugLLM: Explainable Bug Localization through LLMs (University of Waterloo, 2024-09-24) Subramanian, Vikram Nachiappan
Bug localization is the process of identifying the files in a codebase that contain a bug based on a bug report. This thesis presents BugLLM, a novel zero-shot bug localization method leveraging Large Language Models (LLMs) and semantic search techniques. BugLLM comprises two main phases: ingestion and inference. In the ingestion phase, the codebase is chunked using an Abstract Syntax Tree (AST) parser, embedded using OpenAI's Ada V2 model and indexed in a Milvus vector database for efficient querying. In the inference phase, a query is built from the bug report using an LLM to filter out non-technical details.
This refined query is then used to search the vector database, retrieving semantically similar code chunks. These chunks undergo further filtering using another LLM query to establish their relevance to the bug, ensuring only the most pertinent chunks are considered. Our method was evaluated on a dataset that includes bugs from six large Java projects. The evaluation metrics used include top-5 accuracy, where BugLLM achieved a top-5 accuracy ranging from 44.7% to 61.1%. BugLLM's performance was competitive, often surpassing traditional methods, and demonstrated efficiency with no training required. To further aid developers, BugLLM also generates explanations for why specific files are relevant to a bug. The motivation behind this is twofold: helping developers understand why a file is important to fixing a bug and increasing transparency about how our tool works. Our methodology employs Chain-of-Thought prompting to generate detailed explanations from LLMs. These explanations are evaluated based on technical accuracy, groundedness, and informativeness. We find that the explanations generated by BugLLM are largely accurate and grounded in the actual content and context of the code, with minimal hallucination. The explanations were also found to be informative, providing valuable insights to developers. The mean scores (out of 5) for technical accuracy, groundedness, and informativeness were 3.9, 4.5, and 4.3, respectively, across different prompting techniques.
Item Studying Practical Challenges of Automated Code Review Suggestions (University of Waterloo, 2024-09-24) Kazemi, Farshad
Code review is a critical step in software development, focusing on systematic source code inspection. It identifies potential defects and enhances code quality, maintainability, and knowledge sharing among developers. Despite its benefits, it is time-consuming and error-prone. Therefore, approaches such as Code Reviewer Recommendation (CRR) have been proposed to streamline the process.
However, when deployed in real-world scenarios, they often fail to account for various complexities, making them impractical or even harmful. This thesis aims to identify and address challenges at various stages of the code review process: the validity of recommendations, the quality of the recommended reviewers, and the necessity and usefulness of CRR approaches given emerging alternative automation. We approach these challenges in three empirical studies presented in three chapters of this thesis. First, we empirically explore the validity of the recommended reviewers by measuring the rate of stale reviewers, i.e., those who no longer contribute to the project. We observe that stale recommendations make up a considerable portion of the suggestions provided by CRR approaches: up to 33.33% of the recommendations, with a median share of 8.30%. Based on our analysis, we suggest separating the reviewer contribution recency from the other factors used by the CRR objective function. The proposed filter reduces the staleness of recommendations, with a Staleness Reduction Ratio (SRR) between 21.44% and 92.39%. While the first study assesses the validity of the recommendations, it does not measure their quality or potential unintended impacts. Therefore, we next probe the potential unintended consequences of assigning recommended reviewers. To this end, we study the impact of assigning recommended reviewers without considering the safety of the submitted changeset. We observe that existing approaches tend to improve one or two quantities of interest while degrading others. We devise an enhanced approach, Risk Aware Recommender (RAR), which increases project safety by predicting changeset bug proneness. Given the evolving landscape of automation in code review, our final study examines whether human reviewers and, hence, recommendation tools are still beneficial to the review process. 
To this end, we focus on the behaviour of Review Comment Generators (RCGs), models trained to automate code review tasks, as a potential way to replace humans in the code review process. Our quantitative and qualitative study of the RCG-generated interrogative comments shows that RCG-generated and human-submitted comments differ in mood, i.e., whether the comment is declarative or interrogative. Our qualitative analysis of sampled comments demonstrates that RCG-generated interrogative comments suffer from limitations in the RCG capacity to communicate. Our observations show that neither task-specific RCGs nor LLM-based ones can fully replace humans in asking questions. Therefore, practitioners can still benefit from using code review tools. In conclusion, our findings highlight the need for further support of human participants in the code review process. Thus, we advocate for the improvement of code review tools and approaches, particularly code review recommendation approaches. Furthermore, tool builders can use our observations and proposed methods to address two critical aspects of existing CRR approaches.

Item Impossibility of Two-Round MPC with the Black-Box Use of Additive Homomorphic Encryption (University of Waterloo, 2024-09-24) Ghadirli, Ali
Minimizing the number of rounds of Multiparty Computation (MPC) protocols in the presence of an arbitrary number of semi-honest adversaries has recently attracted attention from researchers. Garg et al. proved that two-round semi-honest MPC is impossible from black-box use of two-round oblivious transfer (OT). Earlier, Garg and Srinivasan, and Benhamouda and Lin, showed constructions of two-round MPC with non-black-box use of the underlying two-round OT. 
Constructions of cryptographic protocols that make black-box use of cryptographic primitives tend to be more efficient than non-black-box constructions: they treat the underlying primitives as oracles, which simplifies protocol design and analysis. Reducing the number of rounds lets parties send their first messages and go offline until all the other parties have sent their second-round messages, and then compute the output. Our main result is an impossibility result: we show that two-round MPC based on black-box use of additive homomorphic encryption is impossible. This result is stronger than the previous result by Garg et al., mainly because OT can be constructed using additive homomorphic encryption.

Item Enhancing Spatial Query Efficiency Through Dead Space Indexing in Minimum Bounding Boxes (University of Waterloo, 2024-09-24) Chen, Ying
This thesis introduces the Grid Minimum Bounding Box (GMB) as an enhancement to the traditional Minimum Bounding Box (MBB), designed to mitigate the negative impact of dead space and improve query efficiency. The GMB achieves this by utilizing a low-cost grid bitmap to index dead space within the MBB, enabling the filtering of queries that intersect only with dead space. This filtering reduces false positives in intersection tests and minimizes unnecessary disk I/O to leaf nodes. A key advantage of GMB is that it is developed as an augmentation technique, enabling seamless integration with any R-tree variant without altering the core indexing architecture. The effectiveness of this technique is validated through a comprehensive set of experiments on both real-world and synthetic datasets, demonstrating significant improvements in query performance across various datasets and query types. 
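As a rough illustration of the dead-space filtering idea described in the GMB abstract, the following minimal sketch keeps a uniform n x n occupancy bitmap alongside a bounding box and rejects queries that touch only empty cells. It is not the thesis's algorithm: the abstract describes a data-driven construction and a compressed bitmap, neither of which is reproduced here, and the names `Rect`, `GridMBB`, and `may_contain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Rect") -> bool:
        # Closed-interval overlap test on both axes.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax
                and self.ymin <= other.ymax and other.ymin <= self.ymax)


class GridMBB:
    """Hypothetical MBB plus an n x n occupancy bitmap (uniform grid assumed)."""

    def __init__(self, objects: list[Rect], n: int = 4):
        self.n = n
        self.mbb = Rect(min(o.xmin for o in objects), min(o.ymin for o in objects),
                        max(o.xmax for o in objects), max(o.ymax for o in objects))
        self.bits = 0  # bit (row * n + col) is set when that cell holds data
        for o in objects:
            for cell in self._cells(o):
                self.bits |= 1 << cell

    def _clamp(self, v: float) -> int:
        # Clip a cell coordinate to the grid; this also clips queries to the MBB.
        return int(max(0, min(self.n - 1, v)))

    def _cells(self, r: Rect):
        """Yield the index of every grid cell that r may overlap (conservative)."""
        w = (self.mbb.xmax - self.mbb.xmin) / self.n or 1.0  # guard zero extent
        h = (self.mbb.ymax - self.mbb.ymin) / self.n or 1.0
        c0 = self._clamp((r.xmin - self.mbb.xmin) // w)
        c1 = self._clamp((r.xmax - self.mbb.xmin) // w)
        r0 = self._clamp((r.ymin - self.mbb.ymin) // h)
        r1 = self._clamp((r.ymax - self.mbb.ymin) // h)
        for row in range(r0, r1 + 1):
            for col in range(c0, c1 + 1):
                yield row * self.n + col

    def may_contain(self, query: Rect) -> bool:
        """False means the query hits only dead space, so the node can be skipped."""
        if not self.mbb.intersects(query):
            return False
        return any((self.bits >> cell) & 1 for cell in self._cells(query))


# Two small objects in opposite corners leave the middle of the MBB empty.
node = GridMBB([Rect(0, 0, 1, 1), Rect(9, 9, 10, 10)], n=4)
center_query = Rect(4, 4, 6, 6)
plain_mbb_hit = node.mbb.intersects(center_query)   # True: a false positive
grid_hit = node.may_contain(center_query)           # False: filtered out
```

In this sketch a cell is marked occupied whenever an object might overlap it, so the filter can only produce false positives, never false negatives; a plain MBB test would have sent the centre query down to the leaf node.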
The key contributions of this thesis include the development of a data-driven algorithm for constructing GMBs, the design of a grid bitmap compression technique, the implementation of an efficient maintenance system for dynamic GMBs, and the enhancement of search operations through GMB-based intersection tests. These contributions collectively establish GMB as a robust solution to the challenges presented by dead space within MBBs across different R-tree variants.

Item A Longitudinal Analysis Of Replicas in the Wild Wild Android (University of Waterloo, 2024-09-24) Abbas Zaidi, Syeda Mashal
In this thesis, we report and study a phenomenon that contributes to Android API sprawl. We observe that OEM developers introduce private APIs that are composed by copy-paste-editing full or partial code from AOSP and other OEM APIs; we call such APIs Replicas. To quantify the prevalence of Replicas in the wildly fragmented Android ecosystem, we perform the first large-scale (security) measurement study, aiming at detecting and evaluating Replicas across 342 ROMs, manufactured by 10 vendors and spanning 7 versions. Our study is motivated by the intuition that Replicas contribute to the production of bloated custom Android codebases, add to the complexity of the Android access control mechanism and update process, and hence may lead to access control vulnerabilities. Our study is facilitated by RepFinder, a tool we develop. It infers the core functionality of an API and detects syntactically and semantically similar APIs using static program paths. RepFinder reveals that Replicas are commonly introduced by OEMs and, more importantly, that they unnecessarily introduce security enforcement anomalies. Specifically, RepFinder reports an average of 141 Replicas per studied ROM, accounting for 9% to 17% of custom APIs, where 37% (on average) are identified as under-protected. 
Our study thus points to the urgent need to debloat Replicas.

Item On Classifying the outcomes of Legal Motions (University of Waterloo, 2024-09-23) Cardoso, Oluwaseun
Conflict is inherent to the human condition, and socially acceptable methods of resolving conflict typically begin with dialogue, compromise, or negotiation. When these efforts fail, the legal process, often culminating in the courtroom, becomes the final recourse. Legal practitioners strive to position themselves advantageously by predicting the outcomes of legal disputes, increasingly relying on predictive tools to navigate the complexities of the courtroom. This thesis investigates the feasibility of predicting the outcomes of legal motion disputes using supervised machine learning methods. While previous research has predominantly utilized expertly hand-crafted features for judicial predictions, this study explores the use of written arguments, known as briefs, as the only basis for prediction. We trained 36 classifiers to predict the outcomes of legal motions and compared their performance to that of a baseline model. The best-performing classifier achieved an accuracy of 62% on the test dataset. However, statistical analysis reveals that the performance of the top 10 classifiers is not statistically different from that of the baseline model. These findings suggest that, among the top-performing classifiers, there is no conclusively dominant approach for predicting legal motion outcomes using briefs. The thesis also offers theoretical considerations to explain these results.