Approaches and Techniques to Enhance Efficiency and Performance of Non-Contrastive Self-Supervised Learning Methods
Abstract
Self-supervised learning (SSL) methods have gained considerable attention in recent years
due to their ability to learn useful representations of data without relying on labels during
training. These methods have revolutionized research across various domains, including
Natural Language Processing, Computer Vision, and Graph Deep Learning. SSL methods
can be classified into three categories: Generative, Predictive, and Dual-Encoder. Among
these, Dual-Encoder methods have gained significant popularity in many applications: in
contrast to generative methods, they do not require training a powerful decoder, and in
contrast to predictive SSL methods, they do not need a carefully designed pre-training
task. However, Dual-Encoder techniques suffer from collapse in representation space, since
mapping all samples to a single point in the embedding space is a trivial solution to
their optimization problem. Based on their mechanisms for preventing this issue, Dual-Encoder
methods can be further divided into two classes: Contrastive and Non-Contrastive. Contrastive
techniques prevent representation collapse using negative sampling, while Non-Contrastive
techniques employ asymmetries such as stop-gradient, or maximize information through
whitening of Covariance/Cross-Covariance matrices.
This thesis aims to enhance two classes of Non-Contrastive Dual-Encoder SSL methods:
Asymmetry-Based and Covariance-Based. The proposed improvements are as follows:
• Covariance-Based methods: This thesis proposes techniques to enhance the efficiency
and performance of Covariance-Based methods. Specifically, the efficiency of the
invariance loss term is improved through various data sampling techniques, and the
efficiency of the covariance whitening term is enhanced through random dimension
selection, LSH bucketing, Nyström approximation, and random projection. These
techniques are thoroughly tested and validated.
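To illustrate the random-dimension-selection idea, the sketch below computes a VICReg-style off-diagonal covariance penalty on a random subset of k embedding dimensions rather than all d, reducing the cost of the whitening term from O(nd²) to O(nk²) per batch. The function and parameter names are illustrative assumptions, not the thesis's exact formulation.

```python
import numpy as np

def covariance_loss_subset(z, k, rng):
    """Off-diagonal covariance penalty on a random subset of k of the
    d embedding dimensions (illustrative sketch of random dimension
    selection; not the thesis's exact loss)."""
    n, d = z.shape
    idx = rng.choice(d, size=k, replace=False)  # random dimension selection
    zs = z[:, idx] - z[:, idx].mean(axis=0)     # center the sampled dims
    cov = (zs.T @ zs) / (n - 1)                 # k x k covariance estimate
    off_diag = cov - np.diag(np.diag(cov))      # zero out the diagonal
    return float((off_diag ** 2).sum() / k)     # penalize cross-correlations

rng = np.random.default_rng(0)
z = rng.normal(size=(256, 512))  # batch of 256 embeddings, d = 512
loss = covariance_loss_subset(z, k=64, rng=rng)
```

Resampling the dimension subset at every step keeps the estimator unbiased in expectation while touching only a k × k covariance block.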
• Asymmetry-Based methods: This thesis enhances the performance of Asymmetry-
Based methods by preventing rank degradation and early termination of the update
procedure for the target network.
Overall, this thesis proposes novel techniques to enhance the efficiency and performance
of Non-Contrastive Dual-Encoder SSL methods. The proposed ideas are tested and validated
on benchmark static graph datasets; however, they are applicable to other applications
and modalities as well.
Cite this version of the work:
Ali Saheb Pasand (2023). Approaches and Techniques to Enhance Efficiency and Performance of Non-Contrastive Self-Supervised Learning Methods. UWSpace. http://hdl.handle.net/10012/19244