Learning Compact Representations for Efficient Whole Slide Image Search in Computational Pathology

Hemati, Sobhan

Files

Hemati_Sobhan.pdf (8.32 MB)

Date

2022-08-24

Authors

Hemati, Sobhan

Advisor

Tizhoosh, Hamid
Rahnamayan, Shahryar

Publisher

University of Waterloo

Abstract

Digital pathology has enabled us to capture, store, query and analyze scanned biopsy samples as digital images. The widespread adoption of digital pathology has spurred the digitization of tissue biopsy samples into whole slide images (WSIs). Content-based WSI retrieval and computational pathology are expected to reduce physicians' workload, improve diagnostic performance, and facilitate teaching and research in pathology. Recent advances in deep learning have the potential to contribute to computational pathology and to more effective WSI search systems. Deep learning is a successful tool for image analysis, including various applications in the medical domain. However, given the extremely large size of these multi-resolution images and the lack of patch-level labelled data, deep networks are challenging to adapt for WSI analysis. More precisely, the gigantic size of WSIs poses three main challenges for using deep learning to represent pathology image data for efficient and accurate processing. First, it is not easy to obtain deep WSI embeddings in an end-to-end manner. Second, storing WSI and patch embeddings in Euclidean space requires significant memory resources when operating on large repositories. Third, WSI and patch search using Euclidean embeddings in large image archives is infeasible. To address these challenges, we first proposed Efficient Spectral Hashing (ESH), a method based on the spectral hashing formulation with lower space and time complexities, which leads to binary representations with enhanced search performance compared to many recent hashing methods. We also proposed a novel quantization scheme, called non-rigid quantization (NRQ), in which, for the first time, non-rigid transformations are employed to minimize quantization loss. With these standard hashing algorithms studied, the main remaining challenge was adapting them so that they can be applied to WSIs.
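To make the motivation for binary representations concrete, the following is a minimal sketch of searching binary codes by Hamming distance. It is not ESH or NRQ: the sign-thresholding in `binarize` is a deliberately naive stand-in for a learned hashing method, and all function names here are hypothetical.

```python
def binarize(vec):
    """Pack the signs of a real-valued embedding into one integer.

    Assumption: plain sign thresholding, used only as a placeholder
    for a learned hashing/quantization method such as ESH or NRQ.
    Bit i of the result is 1 when vec[i] > 0.
    """
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code


def hamming(a, b):
    """Number of differing bits between two packed binary codes."""
    return bin(a ^ b).count("1")


def search(query_code, archive_codes, k=3):
    """Return the indices of the k archive codes closest to the query."""
    order = sorted(
        range(len(archive_codes)),
        key=lambda i: hamming(query_code, archive_codes[i]),
    )
    return order[:k]


# Toy archive of three 4-dimensional embeddings.
archive = [
    [1.0, -1.0, 1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
    [1.0, -1.0, -1.0, -1.0],
]
codes = [binarize(v) for v in archive]
nearest = search(binarize([0.9, -0.2, 0.4, -0.7]), codes, k=2)
```

Each comparison reduces to an XOR followed by a bit count, and a 64-bit code occupies 8 bytes, whereas a 64-dimensional 32-bit floating-point embedding occupies 256 bytes.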
Due to the gigantic size of WSIs, the first step in processing them is to replace each WSI with a subset of its representative patches. Given this multi-instance (bag of patches) representation per WSI, it is not clear how to apply the two proposed methods to learn binary WSI representations. To mitigate this challenge, we proposed CNN-Deep Sets (CNN-DS) to learn one permutation-invariant vector representation per WSI in an end-to-end manner. Although CNN-DS yielded WSI embeddings, two issues remained with this approach. First, the method incurs high GPU memory usage during training because multiple bags of patches must be kept in memory. Second, the resulting embeddings lie in Euclidean space, so for very large archives search becomes very slow and the embeddings occupy significantly more storage. Further, applying ESH/NRQ to the extracted embeddings requires an additional learning step. To unify the ideas from ESH/NRQ with CNN-DS, that is, to learn compact (binary and sparse) permutation-invariant WSI representations for efficient search, and also to bypass the training-time memory bottleneck, we proposed a novel framework based on deep generative modelling and Fisher vector theory. We introduced new loss functions for learning sparse and binary permutation-invariant WSI representations; these employ instance-based training and thereby achieve better memory efficiency. The learned WSI representations were validated on The Cancer Genome Atlas (TCGA) and Liver-Kidney-Stomach (LKS) datasets. The proposed method outperforms Yottixel (a recent search engine for histopathology images) in terms of both retrieval accuracy and speed. Further, we achieve competitive performance against state-of-the-art methods on the public benchmark LKS dataset for WSI classification.
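The idea of a permutation-invariant representation for a bag of patches can be illustrated with a small sketch in the spirit of Deep Sets: encode each patch independently, then sum-pool. This is not the CNN-DS architecture itself. The single linear-plus-ReLU layer in `phi` is an assumed stand-in for a CNN patch encoder, the post-pooling network is omitted, and every name and value below is hypothetical.

```python
def phi(patch_feature, weights):
    """Per-patch encoder: one linear layer followed by ReLU.

    Assumption: this stands in for a CNN applied to a single patch.
    """
    return [
        max(0.0, sum(w * x for w, x in zip(row, patch_feature)))
        for row in weights
    ]


def deep_set_embedding(bag, weights):
    """Sum-pool the encoded patches into one vector per bag.

    Summation does not depend on the order of its terms (up to
    floating-point rounding), so the result does not depend on the
    order in which the patches of a slide are listed.
    """
    pooled = [0.0] * len(weights)
    for patch in bag:
        for j, value in enumerate(phi(patch, weights)):
            pooled[j] += value
    return pooled


# Toy bag of three 2-dimensional patch features and a 2x2 weight matrix.
weights = [[1.0, -1.0], [0.5, 0.5]]
bag = [[1.0, 2.0], [3.0, 1.0], [0.0, 4.0]]

embedding = deep_set_embedding(bag, weights)
shuffled = deep_set_embedding([bag[2], bag[0], bag[1]], weights)
```

In this toy case `embedding` and `shuffled` are identical, which is the property that lets a bag of patches be summarized by a single vector regardless of patch order.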
Finally, we showed that learning sparse permutation-invariant WSI representations, which in our framework is associated with encouraging sparsity in the gradients, reduces the sharpness of the loss landscape and, as a result, improves the generalization of deep neural networks.