Shim, Kyu Min
2024-08-21, 2024-08-21, 2024-08-21, 2024-08-12
https://hdl.handle.net/10012/20832
Variance reduction is an important area of research in the realm of online controlled experiments, also known as A/B tests. Reducing outcome variability in an A/B test improves the test’s statistical power and improves the efficiency of the experimentation process. Many variance reduction techniques already exist, and they typically utilize data collected prior to the experiment (pre-experiment data) to reveal complex relationships between the outcome of interest and covariates. These insights are then applied to the data collected during the experiment (in-experiment data) to reduce the outcome variability in the A/B test. However, such methods are heavily reliant on the assumption that pre- and in-experiment data are highly correlated. This is questionable in online settings where trends change quickly due to heterogeneity in user behavior, the rapid development of technology, and the competitive landscape. In these settings, we cannot ignore that fluctuations in other factors may degrade the correlation between pre- and in-experiment data. We propose a two-stage framework for treatment effect estimation that adjusts for differences between pre- and in-experiment data, thereby producing treatment effect estimators with smaller variance than those associated with other variance reduction methods. Inference is conducted by modeling and estimating the counterfactual outcome of each unit and performing a pairwise comparison. This method of inference is shown to be asymptotically unbiased, with an asymptotic variance that scales with the model’s predictive accuracy. We compare the variance reduction capabilities of the proposed method with several alternatives through simulation studies using both simulated data and real-world data.
In doing so, we demonstrate that the proposed method’s variance reduction capabilities are at least as good as (and in some cases orders of magnitude better than) those of existing methods.
en
Variance Reduction with Model-based Counterfactual Estimation
Master Thesis
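The general idea in the abstract, estimating each unit's counterfactual outcome with a model and then comparing observed and imputed outcomes unit by unit, can be illustrated with a minimal Python sketch. This is not the thesis's two-stage framework, whose details are not given in this record. It is a generic regression-imputation estimator on simulated data, and every name, parameter value, and the data-generating process (a single pre-experiment covariate `x`, a linear outcome, a constant effect of 0.5) is an assumption made for illustration only.

```python
import numpy as np


def difference_in_means(y, t):
    """Unadjusted A/B estimate: mean(treated) - mean(control)."""
    return y[t == 1].mean() - y[t == 0].mean()


def fit_linear(x, y):
    """Least-squares fit of y on [1, x]; returns (intercept, slope)."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def imputation_estimate(y, t, x):
    """Impute each unit's unobserved arm with a per-arm linear model,
    then average the unit-level (treated - control) differences."""
    a0, b0 = fit_linear(x[t == 0], y[t == 0])  # control outcome model
    a1, b1 = fit_linear(x[t == 1], y[t == 1])  # treated outcome model
    y1 = np.where(t == 1, y, a1 + b1 * x)  # observed if treated, else imputed
    y0 = np.where(t == 0, y, a0 + b0 * x)  # observed if control, else imputed
    return (y1 - y0).mean()


def simulate(n_reps=500, n=400, effect=0.5, seed=0):
    """Repeat a hypothetical experiment and collect both estimates."""
    rng = np.random.default_rng(seed)
    naive, adjusted = [], []
    for _ in range(n_reps):
        x = rng.normal(size=n)  # assumed pre-experiment covariate
        t = rng.permutation(np.repeat([0, 1], n // 2))  # balanced assignment
        y = 2.0 * x + effect * t + rng.normal(scale=0.5, size=n)
        naive.append(difference_in_means(y, t))
        adjusted.append(imputation_estimate(y, t, x))
    return np.array(naive), np.array(adjusted)


naive, adjusted = simulate()
print(f"difference in means: mean={naive.mean():.3f}, var={naive.var():.5f}")
print(f"model-based pairs:   mean={adjusted.mean():.3f}, var={adjusted.var():.5f}")
```

In this toy setting the covariate explains most of the outcome variance, so the model-based estimate should vary far less across repetitions than the plain difference in means. When the covariate is only weakly related to the in-experiment outcome, which is the situation the abstract is concerned with, the gain from this simple adjustment shrinks accordingly.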