Using Domain Adaptation to Improve Water Quality Modeling with Scarce Data

Date

2025-01-06

Advisor

Layton, Anita

Publisher

University of Waterloo

Abstract

Water quality (WQ) modeling is important not only for the conservation of ecosystems but also for the welfare of modern human society. However, collecting enough high-quality data to train WQ prediction models is difficult. Unlike in hydrology, current WQ data collection methods are constrained by cost, spatial coverage, and temporal sparsity. This thesis explores using Domain Adaptation (DA) to overcome the data scarcity problem. By treating the different WQ measuring locations as different domains, high-resolution data from other locations can be used to better model a target location that has sparse data. The chosen DA method is inspired by domain-invariant (DI) representation learning. The model itself consists of (1) an f submodel representing the DI portion, and (2) one g submodel per domain representing the domain-variant portion. Within the context of this thesis, the main findings are as follows:

1. DA can be successfully applied to WQ modeling.
2. The optimal model size differs between the full DA method and pretraining alone.
3. A station’s basin was not a good measure of similarity.
4. Once the number of domains was already high, adding further domains did not improve model performance.
5. Simply adding the outputs of f and g (i.e., f(x) + g(x)) did not perform as well as passing the output of f through g (i.e., g(f(x))).

These findings support the effectiveness of DA in WQ modeling and highlight several considerations that affect final performance. Furthermore, they are relevant not only to this particular DA method but also to DA in general.
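The two ways of combining f and g compared in the abstract can be illustrated with a toy sketch. Everything concrete below is an assumption made for illustration and is not taken from the thesis: the submodels are plain linear layers with hand-picked weights, the station names `station_a` and `station_b` are hypothetical, and the helper names (`linear`, `predict_composed`, `predict_additive`) are invented here. The sketch only shows the wiring: one shared f, one g per domain, and the difference between g(f(x)) and f(x) + g(x).

```python
from typing import Callable, Dict, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def linear(weights: List[List[float]], bias: List[float]) -> Layer:
    """Build a toy linear layer y = W x + b (hypothetical stand-in for a submodel)."""
    def layer(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return layer


# Shared, domain-invariant submodel f: 2 input features -> 2 hidden features.
f: Layer = linear([[1.0, 0.0], [0.0, 2.0]], [0.0, 1.0])

# One domain-variant submodel g per station (station names are made up).
g_by_station: Dict[str, Layer] = {
    "station_a": linear([[1.0, 1.0]], [0.0]),
    "station_b": linear([[0.5, -1.0]], [2.0]),
}


def predict_composed(station: str, x: Vector) -> Vector:
    """g(f(x)): the station-specific g consumes the shared representation."""
    return g_by_station[station](f(x))


# For the additive variant, f and g must produce outputs of the same shape,
# so this sketch assumes a separate scalar-output shared model.
f_scalar: Layer = linear([[1.0, 2.0]], [1.0])


def predict_additive(station: str, x: Vector) -> Vector:
    """f(x) + g(x): shared and station-specific outputs are summed."""
    return [a + b for a, b in zip(f_scalar(x), g_by_station[station](x))]


x = [3.0, 4.0]
print(predict_composed("station_a", x))  # [12.0]
print(predict_additive("station_a", x))  # [19.0]
```

In the composed form, each g operates on the representation produced by f, while in the additive form g sees only the raw input and contributes a separate term. The sketch makes no claim about which performs better; that comparison is the thesis's own finding.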

Keywords

water quality, modeling, domain adaptation, machine learning, time series
