Using Domain Adaptation to Improve Water Quality Modeling with Scarce Data
Date
2025-01-06
Authors
Advisor
Layton, Anita
Publisher
University of Waterloo
Abstract
Water quality (WQ) modeling is important not only for the conservation of ecosystems
but also for the welfare of modern human society. However, collecting enough high-quality
data to train WQ prediction models is difficult. Unlike hydrological measurements, current
WQ data collection methods are constrained by cost, spatial coverage, and temporal sparsity.
This thesis explores using domain adaptation (DA) to overcome this data scarcity
problem. By treating the different WQ measuring locations as different domains,
high-resolution data from other locations can be used to better model a target location
that has sparse data. The chosen DA method is inspired by domain-invariant (DI)
representation learning. The model consists of (1) one f submodel, shared across domains,
representing the DI portion, and (2) one g submodel per domain representing the
domain-variant portion.
Within the context of this thesis, the main findings are as follows:
1. DA can be successfully applied in the context of WQ modeling.
2. The optimal model size differs between the full DA method and pretraining
alone.
3. A station's basin was not a good measure of similarity between stations.
4. Once the number of domains was already high, adding further domains did not
improve model performance.
5. Simply adding the outputs of f and g (i.e. f(x) + g(x)) did not perform as well as
passing the output of f through g (i.e. g(f(x))).
These findings support the effectiveness of DA in WQ modeling and identify several
considerations that affect final performance. Furthermore, these findings are
relevant not only to this particular DA method but also to DA in general.
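
To make the two compositions in finding 5 concrete, the following is a minimal NumPy sketch of one shared f submodel and one g submodel per domain. It is not the thesis implementation: the layer sizes, the tanh activation, the single-layer submodels, and all names (make_linear, predict_composed, predict_additive) are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
N_FEATURES, N_HIDDEN, N_DOMAINS = 4, 8, 3


def make_linear(n_in, n_out):
    """Return a randomly initialised affine layer as a (weights, bias) pair."""
    return rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out)


def affine(layer, x):
    """Apply an affine layer to a batch of row vectors."""
    weights, bias = layer
    return x @ weights + bias


# Composed variant g(f(x)): f maps inputs to a shared hidden representation,
# and each domain (measuring location) has its own g that maps it to a prediction.
f_shared = make_linear(N_FEATURES, N_HIDDEN)
g_heads = [make_linear(N_HIDDEN, 1) for _ in range(N_DOMAINS)]

# Additive variant f(x) + g(x): f and every g map the inputs directly to a
# prediction, and the two predictions are summed.
f_direct = make_linear(N_FEATURES, 1)
g_direct = [make_linear(N_FEATURES, 1) for _ in range(N_DOMAINS)]


def predict_composed(x, domain):
    """g(f(x)): pass the shared representation through the domain's own g."""
    hidden = np.tanh(affine(f_shared, x))
    return affine(g_heads[domain], hidden)


def predict_additive(x, domain):
    """f(x) + g(x): sum the shared output and the domain's own output."""
    return affine(f_direct, x) + affine(g_direct[domain], x)


x = rng.normal(size=(5, N_FEATURES))  # five samples of hypothetical WQ inputs
y_composed = predict_composed(x, domain=0)
y_additive = predict_additive(x, domain=0)
```

In both variants the parameters of f are shared by every domain, while only the selected domain's g is used for a given prediction, so data from data-rich locations can shape f even when the target location has few samples.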
Keywords
water quality, modeling, domain adaptation, machine learning, time series