Addressing Data Scarcity in Domain Generalization for Computer Vision Applications in Image Classification

dc.contributor.author: Kaai, Kimathi
dc.date.accessioned: 2024-08-30T17:14:36Z
dc.date.available: 2024-08-30T17:14:36Z
dc.date.issued: 2024-08-30
dc.date.submitted: 2024-08-23
dc.description.abstract: Domain generalization (DG) for image classification is a crucial task in machine learning that focuses on transferring domain-invariant knowledge from multiple source domains to an unseen target domain. Traditional DG methods assume that classes of interest are present across multiple domains (domain-shared), which helps mitigate spurious correlations between domain and class. However, in real-world scenarios, data scarcity often leads to classes being present in only a single domain (domain-linked), resulting in poor generalization performance. This thesis introduces the domain-linked DG task and proposes FOND, a "Fairness-inspired cONtrastive learning objective for Domain-linked domain generalization," which leverages domain-shared classes to learn domain-invariant representations for domain-linked classes. FOND is designed to enhance generalization by minimizing the impact of task-irrelevant, domain-specific features. The theoretical analysis in this thesis extends existing domain adaptation error bounds to the domain-linked DG task, providing insight into the factors that influence generalization performance. A key finding is that domain-shared classes typically have more samples and therefore learn domain-invariant features more effectively than domain-linked classes; this analysis informs the design of FOND so that it addresses the unique challenges of domain-linked DG. Experiments across multiple datasets and experimental settings evaluate the effectiveness of current methodologies, and the proposed method achieves state-of-the-art performance on domain-linked DG tasks with minimal trade-offs in the performance of domain-shared classes. Results highlight the impact of shared-class settings, total class size, and inter-domain variation on the generalizability of domain-linked classes, and visualizations of the learned representations further illustrate the robustness of FOND in capturing domain-invariant features. In summary, this thesis motivates future DG research on domain-linked classes by (1) theoretically and experimentally analyzing the factors that impact domain-linked class representation learning, (2) demonstrating the ineffectiveness of current state-of-the-art DG approaches in this setting, and (3) proposing an algorithm that learns generalizable representations for domain-linked classes by transferring useful representations from domain-shared ones.
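The abstract describes FOND only at a high level. For orientation, the following is a minimal sketch of how a fairness-inspired, class-weighted supervised contrastive objective for this setting could look; the function name, the weighting scheme, and the linked_boost parameter are illustrative assumptions, not the thesis's actual formulation, which is given in the full text and the linked repository (https://github.com/criticalml-uw/fond).

# Minimal sketch (not the official FOND implementation): a supervised
# contrastive objective in which anchors from domain-linked classes are
# up-weighted, so alignment learned from domain-shared classes is pulled
# toward domain-linked ones. The weighting scheme and `linked_boost`
# hyperparameter are illustrative assumptions.
import torch
import torch.nn.functional as F


def fairness_weighted_supcon(z, labels, linked_classes, temperature=0.1, linked_boost=2.0):
    """Supervised contrastive loss that up-weights domain-linked anchors.

    z:              (N, D) embeddings from the feature encoder.
    labels:         (N,) integer class labels.
    linked_classes: set of class ids observed in only one source domain.
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                      # pairwise cosine similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)

    # Positives: other samples with the same class label (anchor excluded).
    pos_mask = labels.unsqueeze(0).eq(labels.unsqueeze(1)) & ~self_mask

    # Log-probability of each pair under a softmax over the anchor's non-self pairs.
    sim = sim.masked_fill(self_mask, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Mean positive log-probability per anchor; anchors with no positives are skipped.
    pos_counts = pos_mask.sum(dim=1)
    valid = pos_counts > 0
    sum_pos = torch.where(pos_mask, log_prob, torch.zeros_like(log_prob)).sum(dim=1)
    mean_log_prob_pos = sum_pos[valid] / pos_counts[valid]

    # Fairness-inspired re-weighting: domain-linked anchors count more in the loss.
    is_linked = torch.tensor(
        [int(c) in linked_classes for c in labels.tolist()], device=z.device
    )[valid]
    weights = torch.where(
        is_linked,
        torch.full_like(mean_log_prob_pos, linked_boost),
        torch.ones_like(mean_log_prob_pos),
    )
    return -(weights * mean_log_prob_pos).sum() / weights.sum()


# Example usage with random embeddings (4 samples, 2 classes; class 1 is domain-linked).
if __name__ == "__main__":
    z = torch.randn(4, 128)
    labels = torch.tensor([0, 0, 1, 1])
    print(fairness_weighted_supcon(z, labels, linked_classes={1}).item())

Up-weighting domain-linked anchors is one simple way to bias a contrastive objective toward the under-represented classes; the thesis develops and theoretically analyzes its own objective for this purpose.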
dc.identifier.uri: https://hdl.handle.net/10012/20932
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: https://github.com/criticalml-uw/fond
dc.subject: machine learning
dc.subject: computer vision
dc.subject: domain generalization
dc.subject: contrastive learning
dc.subject: image classification
dc.subject: error bounds
dc.title: Addressing Data Scarcity in Domain Generalization for Computer Vision Applications in Image Classification
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Systems Design Engineering
uws-etd.degree.discipline: Systems Design Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Wong, Alexander
uws.contributor.advisor: Rambhatla, Sirisha
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Kaai_Kimathi.pdf
Size: 2.12 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Description: Item-specific license agreed upon to submission