Show simple item record

dc.contributor.author: Hemati, Sobhan
dc.date.accessioned: 2022-08-24 20:22:19 (GMT)
dc.date.available: 2022-12-23 05:50:07 (GMT)
dc.date.issued: 2022-08-24
dc.date.submitted: 2022-08-17
dc.identifier.uri: http://hdl.handle.net/10012/18637
dc.description.abstract: Digital pathology has enabled us to capture, store, query, and analyze scanned biopsy samples as digital images. The widespread adoption of digital pathology has spurred the digitization of tissue biopsy samples into whole slide images (WSIs). Content-based WSI retrieval and computational pathology are expected to reduce physicians' workload, improve diagnostic performance, and facilitate teaching and research in pathology. Recent advances in deep learning have the potential to contribute to computational pathology and to more effective WSI search systems. Deep learning is a successful tool for image analysis, with various applications in the medical domain. However, given the extremely large size of these multi-resolution images and the lack of patch-level labelled data, deep networks are challenging to adapt for WSI analysis. More precisely, the gigantic size of WSIs poses three main challenges for applying deep learning to represent pathology image data for efficient and accurate processing. First, it is not easy to obtain deep WSI embeddings in an end-to-end manner. Second, storing WSI and patch embeddings in Euclidean space requires significant memory resources in large repositories. Third, WSI and patch search using Euclidean embeddings is infeasible in large image archives. To address these challenges, we first proposed Efficient Spectral Hashing (ESH), a method based on the spectral hashing formulation with lower space and time complexity, which yields binary representations with better search performance than many recent hashing methods. We also proposed a novel quantization scheme, called non-rigid quantization (NRQ), which for the first time employs non-rigid transformations to minimize quantization loss. Having studied standard hashing algorithms, the main remaining challenge was to modify these methods so that they can be applied to WSIs.
Due to the gigantic size of WSIs, the first step in processing them is to replace each WSI with a subset of its representative patches. Given this multi-instance (bag of patches) representation per WSI, it is not clear how to apply the two proposed methods to learn binary WSI representations. To mitigate this challenge, we proposed CNN-Deep Sets (CNN-DS) to learn one permutation-invariant vector representation per WSI in an end-to-end manner. Although CNN-DS allowed us to obtain WSI embeddings, two issues remained. First, the method incurs high GPU memory usage during training because multiple bags of patches must be kept in memory. Second, the obtained embeddings lie in Euclidean space, so for very large archives search becomes very slow and the embeddings occupy significantly more storage. Further, applying ESH/NRQ to the extracted embeddings requires an additional learning step. To unify the ideas of ESH/NRQ with CNN-DS, that is, to learn compact (binary and sparse) permutation-invariant WSI representations for efficient search, and to bypass the training-time memory bottleneck, we proposed a novel framework based on deep generative modelling and Fisher Vector theory. We introduced new loss functions for learning sparse and binary permutation-invariant WSI representations; these losses employ instance-based training and thereby achieve better memory efficiency. The learned WSI representations were validated on The Cancer Genome Atlas (TCGA) and Liver-Kidney-Stomach (LKS) datasets. The proposed method outperforms Yottixel (a recent search engine for histopathology images) in terms of both retrieval accuracy and speed. Further, we achieve performance competitive with the state of the art on the public benchmark LKS dataset for WSI classification.
Finally, we showed that learning sparse permutation-invariant WSI representations, which in our framework is associated with encouraging sparsity in the gradients, reduces the sharpness of the loss landscape and, as a result, improves the generalization of deep neural networks. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.subject: Computational Pathology [en]
dc.subject: Whole Slide Image Search [en]
dc.subject: Learning to Hash [en]
dc.subject: Permutation-invariant Representations [en]
dc.title: Learning Compact Representations for Efficient Whole Slide Image Search in Computational Pathology [en]
dc.type: Doctoral Thesis [en]
dc.pending: false
uws-etd.degree.department: Systems Design Engineering [en]
uws-etd.degree.discipline: System Design Engineering [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.embargo.terms: 4 months [en]
uws.contributor.advisor: Tizhoosh, Hamid
uws.contributor.advisor: Rahnamayan, Shahryar
uws.contributor.affiliation1: Faculty of Engineering [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
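
The abstract's central idea, replacing Euclidean embeddings with compact binary codes that are compared by Hamming distance, can be illustrated with a toy sketch. This is not the thesis's ESH, NRQ, or CNN-DS method: mean pooling stands in for a learned permutation-invariant aggregation, sign thresholding stands in for learned hashing, and every function name and data value below is a hypothetical chosen for illustration.

```python
from typing import List, Sequence


def pool_patches(patch_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average the patch embeddings of one slide.

    The mean does not depend on patch order, so the result is
    permutation invariant (an assumed stand-in for a learned set encoder).
    """
    n = len(patch_embeddings)
    dim = len(patch_embeddings[0])
    return [sum(p[d] for p in patch_embeddings) / n for d in range(dim)]


def binarize(embedding: Sequence[float]) -> int:
    """Sign-threshold each dimension and pack the bits into one integer."""
    code = 0
    for value in embedding:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes."""
    return bin(a ^ b).count("1")


def search(query: int, archive: Sequence[int], k: int = 1) -> List[int]:
    """Return indices of the k archive codes nearest to the query."""
    order = sorted(range(len(archive)), key=lambda i: hamming(query, archive[i]))
    return order[:k]


# Toy "slides": each is a bag of two patches with 4-dimensional embeddings.
slide_a = [[0.9, -0.2, 0.4, -0.7], [0.7, -0.4, 0.2, -0.5]]
slide_b = [[-0.8, 0.3, -0.1, 0.6], [-0.6, 0.5, -0.3, 0.4]]
query = [[0.6, -0.5, 0.3, -0.4], [0.8, -0.1, 0.5, -0.6]]

archive = [binarize(pool_patches(s)) for s in (slide_a, slide_b)]
query_code = binarize(pool_patches(query))
nearest = search(query_code, archive, k=1)
```

In this sketch each slide is stored as a single small integer rather than a vector of floats, and comparison reduces to an XOR followed by a bit count, which is the kind of storage and speed advantage the abstract attributes to binary representations.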


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
