Advancing Antibody Design: Integrating Protein Language Models for Enhanced Computational Strategies

Contributors: Ghodsi, Ali; Kohandel, Mohammad; Jamialahmadi, Benyamin
Dates: 2024-01-25; 2024-05-25; 2024-01-25; 2024-01-17
URI: http://hdl.handle.net/10012/20289

Abstract: Antibodies, or immunoglobulins, are integral to the immune response, playing a crucial role in recognizing and neutralizing external threats such as pathogens. The design of these molecules, however, is complex due to the limited availability of paired structural antibody-antigen data and the intricacies of structurally non-deterministic regions. In this thesis, we explore innovative approaches for computationally designing antibodies, addressing key challenges in traditional methods. Our focus is on overcoming the limitations of existing computational techniques in antibody design, which include limited structural data availability, CDR flexibility, and dependence on contextual information. We propose two novel solutions leveraging Protein Language Models (pLMs). The first employs a sequence-to-sequence model, analogous to language translation, utilizing data augmentation for semi-supervised training. The second approach integrates both sequential and structural antigen information into a pLM using specially designed adapter modules. These methods aim to efficiently utilize extensive sequence data, circumventing the challenges of limited structural data. Our models demonstrate promising results in the Rosetta Antibody Design benchmark, outperforming existing models and showcasing the potential of integrating pLMs in computational antibody design. This research contributes to enhancing the precision and applicability of antibody design, marking a significant advancement in therapeutic and diagnostic applications.

Language: en
Keywords: antibody design; protein language models; protein structural encoding; neural machine translation; back translation
Type: Master Thesis