Representation Learning for Image Search in Histopathology

Shafique, Abubakr

Files

Shafique_Abubakr.pdf (73.29 MB)

Date

2024-01-26

Authors

Shafique, Abubakr

Advisor

Tizhoosh, Hamid

Publisher

University of Waterloo

Abstract

Advances in Machine Learning (ML) have shown significant promise in complementing the work of healthcare professionals. However, widespread acceptance and trust in clinical applications require state-of-the-art algorithms with superior accuracy and performance. Digital Pathology (DP) and Whole Slide Image (WSI) technologies offer a new pathway for image-based diagnosis in histopathology. DP makes it possible to explore vast archives of medical images using Content-based Image Retrieval (CBIR). By giving pathologists access to information from previously diagnosed cases, CBIR can serve as a virtual second opinion and help physicians make confident diagnoses. The representation of WSIs plays a pivotal role in various domains, notably pathology and medicine. However, the task is particularly challenging because of the vast dimensions of WSIs, which make comprehensive processing difficult within the constraints of existing hardware. As with many other big-data problems, processing and searching large repositories of gigapixel WSIs calls for a fundamental computer science strategy known as "Divide and Conquer", which is used to break WSIs down into smaller, meaningful patches. Accurate representation of these patches is vital, especially in medical image analysis for tasks like search and matching. In this thesis, I address these challenges by dividing WSIs into significant patches and creating distinct representations for different tissue types. For the "divide" step, I introduce an unsupervised method known as the Selection of Distinct Morphologies (SDM). This approach aims to identify and select all unique patches from the WSI; I refer to the resulting set of patches as a "montage".
The creation of this montage is a pivotal step that enables a variety of applications, including image search. The primary objective of this method is to construct a montage consisting of a reduced number of patches that are diverse yet remain meaningful within the context of the WSI. For the "conquer" step, I have developed a novel method for learning representations that discriminate between different morphological features, employing a ranking loss designed specifically for image retrieval tasks. This metric learning strategy pulls representations of similar morphological attributes closer together in the latent space while pushing dissimilar ones apart by a predefined margin. The research conducted during the Ph.D. program has culminated in a comprehensive and practical framework for learning meaningful representations of WSIs in DP, with a specific focus on image search applications.
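
The margin-based ranking idea mentioned in the abstract can be illustrated with a generic triplet-style hinge loss. This is only a minimal sketch of the general technique, not the loss developed in the thesis: the function names, the Euclidean distance, the margin value, and the toy 2-D embeddings are all assumptions chosen for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Generic triplet hinge loss (an assumed stand-in, not the thesis's loss).

    The loss is zero once the negative is at least `margin` farther from
    the anchor than the positive is; otherwise it grows linearly.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy 2-D patch embeddings (hypothetical values, for illustration only).
anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # similar morphology to the anchor
near_negative = [0.0, 1.5]   # dissimilar morphology, but inside the margin
far_negative = [3.0, 4.0]    # dissimilar morphology, already well separated

violating_loss = triplet_margin_loss(anchor, positive, near_negative)
satisfied_loss = triplet_margin_loss(anchor, positive, far_negative)
```

In this sketch, a negative that sits inside the margin yields a positive loss, which during training would push it away from the anchor, while a negative that is already far enough away contributes nothing.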