Harnessing Generalist LLMs for Diverse Objective and Subjective NLP Tasks
Date
2024-12-17
Authors
Advisor
Vechtomova, Olga
Publisher
University of Waterloo
Abstract
Recent advances in natural language processing (NLP), particularly in large language modeling, have led to a major paradigm shift. Large language models (LLMs), such as the GPT and LLaMA families of models, are trained on massive Internet corpora spanning a wide range of domains. In addition, the billions of parameters in these models give rise to emergent capabilities, leading to strong improvements across diverse NLP tasks without much task-specific tuning. However, effectively harnessing the knowledge of these generalist models for real-world data remains a major challenge, as LLMs can produce inconsistent, biased, and unsatisfactory outputs. In this thesis, we propose task-specific strategies for effectively leveraging LLMs for a number of challenging NLP tasks: (low-resource) text classification, text summarization, modeling the artistic preferences of creative individuals, and automated data analysis. Our results suggest that LLMs can serve as excellent data generators and data labelers for well-defined single-step tasks like classification and summarization, particularly in data-scarce settings, where models trained on LLM-generated data achieved performance competitive with oracle models trained on much larger labeled datasets. On the other hand, for more subjective tasks like modeling artistic preferences among creative individuals, we demonstrate that while LLMs may not be able to distinguish between the likes and dislikes of artists, they can effectively extract key linguistic and poetic properties from text that can later be used to infer the artistic preferences of different individuals. Lastly, we evaluate the effectiveness of LLMs on multi-step tasks that require an LLM to perform multiple tasks in tandem without compromising performance on any individual task. Overall, our work offers critical insights into the strengths and shortcomings of LLMs across a wide range of subjective and objective NLP tasks, along with meaningful suggestions to help the research community harness LLMs for those tasks effectively.
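The data-generation idea summarized above can be illustrated with a minimal sketch: prompting an instruction-tuned LLM to synthesize labeled utterances for a low-resource intent classifier, whose output would then train a small supervised model such as DistilBERT. This is not the thesis's exact PromptMix procedure; the intent labels, prompt wording, model name ("gpt-4o-mini"), and the generate_examples helper below are all illustrative assumptions.

# Minimal sketch: an LLM as a data generator for few-shot intent
# classification. Labels, prompt, and model name are assumptions,
# not the exact setup used in the thesis.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_examples(label: str, seed_utterances: list[str], n: int = 10) -> list[dict]:
    """Ask the LLM to synthesize n new utterances for one intent label."""
    prompt = (
        "You are generating training data for an intent classifier.\n"
        f"Intent: {label}\n"
        "Seed examples:\n"
        + "\n".join(f"- {u}" for u in seed_utterances)
        + f"\nWrite {n} new, diverse utterances with the same intent, one per line."
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any instruction-tuned chat model would do
        messages=[{"role": "user", "content": prompt}],
    )
    # Parse one utterance per line, dropping blank lines and list markers.
    lines = [l.strip("- ").strip() for l in resp.choices[0].message.content.splitlines() if l.strip()]
    return [{"text": t, "label": label} for t in lines[:n]]

if __name__ == "__main__":
    seeds = {
        "book_flight": ["I need a ticket to Toronto on Friday"],      # hypothetical intents
        "cancel_flight": ["Please cancel my reservation"],
        "baggage_policy": ["How many bags can I check?"],
    }
    dataset = [ex for lab, s in seeds.items() for ex in generate_examples(lab, s)]
    print(json.dumps(dataset[:3], indent=2))  # synthetic labeled data for the downstream classifier

In a data-scarce setting, the appeal of this pattern is that a handful of seed utterances per class can be expanded into a training set large enough to fine-tune a compact classifier, which is the role the LLM-generated data plays in the experiments described above.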
Keywords
large language models (LLMs), natural language processing (NLP), text classification, intent classification, few-shot text classification, text summarization, extractive text summarization, semi-supervised text summarization, data augmentation, zero-shot text classification, artistic preference modeling, LLM-based exploratory data analysis, abstractive text summarization, GPT, LLaMA-3, LLaMA-2, BERT, DistilBERT, DistilBART, PreSumm, PromptMix, MixSumm