Learning Discriminative Representations for Gigapixel Images
Digital images of tumor tissue are important diagnostic and prognostic tools for pathologists. Recent advances in digital pathology have led to an abundance of digitized histopathology slides, called whole-slide images. Computational analysis of whole-slide images is challenging because they are generally gigapixel files, often one or more gigabytes in size. However, computational methods offer a unique opportunity to improve the objectivity and accuracy of diagnostic interpretations in histopathology. Recently, deep learning has been successful at characterizing images for vision-based applications in multiple domains, but its applications remain relatively underexplored in the histopathology domain, mostly due to two challenges. First, it is difficult to scale deep learning methods to process large gigapixel histopathology images. Second, diversified and labeled datasets are scarce due to privacy constraints as well as workflow and technical challenges in the healthcare sector. The main goal of this dissertation is to explore and develop deep models that learn discriminative representations of whole-slide images while overcoming these challenges. A three-stage approach was taken in this research. In the first stage, a framework called Yottixel is proposed. It represents a whole-slide image as a set of representative patches, called a mosaic. The mosaic enables convenient processing and compact representation of an entire high-resolution whole-slide image. Yottixel allows faster retrieval of similar whole-slide images within large archives of digital histopathology images; such retrieval technology enables pathologists to tap into past diagnostic data on demand. Yottixel is validated on the largest public archive of whole-slide images (The Cancer Genome Atlas), achieving promising results.
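The mosaic idea can be illustrated with a minimal sketch: cluster per-patch feature vectors and keep the patch nearest each cluster center as the slide's compact representation. This is a simplification for illustration only — Yottixel's actual pipeline (stain-based grouping, spatial clustering, barcoding of features) is more involved, and the function name and k-means procedure here are assumptions, not the dissertation's implementation.

```python
import numpy as np

def build_mosaic(patch_features: np.ndarray, n_clusters: int = 3,
                 n_iters: int = 20, seed: int = 0) -> list:
    """Pick one representative patch per cluster (a simplified 'mosaic').

    patch_features: (n_patches, dim) array of per-patch descriptors.
    Returns sorted indices of the selected representative patches.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen patches.
    centroids = patch_features[rng.choice(len(patch_features), n_clusters,
                                          replace=False)].astype(float)
    for _ in range(n_iters):
        # Lloyd's step: assign each patch to its nearest centroid.
        dists = np.linalg.norm(patch_features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned patches.
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = patch_features[labels == k].mean(axis=0)
    # The mosaic: for each centroid, the single closest patch.
    dists = np.linalg.norm(patch_features[:, None, :] - centroids[None, :, :], axis=2)
    return sorted(set(dists.argmin(axis=0).tolist()))
```

In a retrieval setting, only these few representative patches (or their encoded features) would be indexed and compared across slides, rather than the full gigapixel image.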
Yottixel is an unsupervised method, which limits its performance on specific tasks, especially when labeled (or partially labeled) data is available. In the second stage, multi-instance learning (MIL) is used to enhance cancer subtype prediction through weakly supervised training. Three MIL methods are proposed, each improving upon the previous one: the first is based on memory-based models, the second uses attention-based models, and the third uses graph neural networks. All three methods are incorporated into Yottixel to classify entire whole-slide images with no pixel-level annotations. Access to large-scale and diversified datasets is a primary driver of the advancement and adoption of machine learning technologies. However, healthcare has many restrictive rules around data sharing, which limit research and model development. In the final stage, a federated learning scheme called ProxyFL is developed that enables collaborative training of Yottixel among multiple healthcare organizations without centralizing sensitive medical data. The combined research across all three stages of this Ph.D. has resulted in a holistic and practical framework for learning discriminative and compact representations of whole-slide images in digital pathology.
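Attention-based MIL treats a slide as a bag of patch instances and learns to weight each patch before pooling into one slide-level embedding, so only a slide-level label is needed. The sketch below shows the generic attention-pooling computation with plain NumPy; the function name, weight shapes, and single-layer scoring are illustrative assumptions, not the dissertation's exact architecture.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_mil_pool(patch_feats: np.ndarray, W: np.ndarray, w: np.ndarray):
    """Attention-based MIL pooling: score each patch (instance), then
    aggregate the bag into a single slide-level embedding.

    patch_feats: (n_patches, dim); W: (hidden, dim); w: (hidden,).
    """
    # Per-patch attention scores: w^T tanh(W h_i)
    scores = np.tanh(patch_feats @ W.T) @ w
    attn = softmax(scores)              # weights sum to 1 over patches
    bag = attn @ patch_feats            # weighted average = slide embedding
    return bag, attn
```

A slide-level classifier trained on `bag` then receives gradient signal that shapes `attn`, so diagnostically relevant patches end up with higher weights even though no patch was ever individually labeled.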
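The privacy-preserving collaboration can be sketched abstractly: each site keeps a private model and a shareable proxy model, and only proxy parameters cross organizational boundaries. The sketch below stands in for one communication round; the simple parameter average and interpolation step are placeholders for ProxyFL's actual de-centralized exchange and mutual-distillation protocol, and all names here are hypothetical.

```python
import numpy as np

def proxyfl_round(private: list, proxies: list, alpha: float = 0.5):
    """One simplified communication round in the spirit of ProxyFL.

    Each site holds a private model (never shared) and a proxy model
    (the only thing exchanged). Here the exchange is modeled as a plain
    parameter average, and each private model then pulls toward the
    aggregated proxy as a stand-in for knowledge distillation.
    """
    aggregated = np.mean(np.stack(proxies), axis=0)       # proxy exchange step
    new_private = [(1 - alpha) * p + alpha * aggregated   # knowledge transfer
                   for p in private]
    new_proxies = [aggregated.copy() for _ in proxies]
    return new_private, new_proxies
```

The key property preserved by this structure is that raw medical data and private model weights stay inside each organization; only the proxy, which can be regularized for privacy, is communicated.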
Cite this version of the work
Shivam Kalra (2022). Learning Discriminative Representations for Gigapixel Images. UWSpace. http://hdl.handle.net/10012/18259