Explore the In-context Learning Capability of Large Language Models

dc.contributor.author: Li, Tianle
dc.date.accessioned: 2024-05-10T17:24:01Z
dc.date.available: 2024-05-10T17:24:01Z
dc.date.issued: 2024-05-10
dc.date.submitted: 2024-05-08
dc.description.abstract: The rapid evolution of Large Language Models (LLMs) has marked the beginning of a new age in AI capabilities, particularly in natural language understanding and processing. At the forefront of these advancements is in-context learning, a paradigm that enables models to adapt to new tasks without explicit retraining. This thesis presents a comprehensive investigation of the in-context learning capabilities of LLMs, guided by two studies: KB-BINDER's application to Question Answering over Knowledge Bases (KBQA) and the evaluation of LLMs on LongICLBench, a benchmark curated in this work for long-context understanding. The first part of this investigation, KB-BINDER, addresses the challenge of generalizing LLMs to diverse KBQA tasks without task-specific training. KB-BINDER introduces a few-shot in-context learning approach that uses Codex to generate draft logical forms and BM25 to bind the drafts, demonstrating strong effectiveness across heterogeneous KBQA datasets. We believe KB-BINDER can serve as an important baseline for future research on using the few-shot capability of LLMs to address KBQA. Complementing this, the second study introduces LongICLBench, a benchmark designed to test how well long-context LLMs process long, context-rich sequences on extreme-label classification tasks via in-context learning. Evaluation on tasks of increasing difficulty reveals a clear performance threshold, highlighting the current limitations of LLMs in handling extensive context windows, as well as a bias toward labels positioned near the end of the input when instances sharing the same label are grouped together in the demonstration. This underscores a crucial gap in current long-context LLMs' ability to reason over long sequences and motivates further work on long-context comprehension. Together, these studies form the cornerstone of this thesis, capturing the dynamic landscape of in-context learning in LLMs. Through a detailed examination of KB-BINDER and LongICLBench, this work not only charts the current capabilities and boundaries of LLMs but also lays the groundwork for making LLMs more adaptable and proficient across a wide array of complex tasks.
dc.identifier.uri: http://hdl.handle.net/10012/20554
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: natural language processing
dc.title: Explore the In-context Learning Capability of Large Language Models
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Chen, Wenhu
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
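
The abstract above describes KB-BINDER as pairing an LLM that drafts logical forms with BM25-based binding of those drafts. The snippet below is a minimal, illustrative sketch of the few-shot retrieval-and-prompting step only, not the thesis's actual implementation: the exemplar pool, the prompt template, and the downstream draft-binding step are assumptions made for illustration. It uses the rank_bm25 package to pick similar questions as in-context demonstrations.

```python
# Minimal sketch (not the thesis implementation): BM25 selects similar
# questions as few-shot exemplars; the resulting prompt would be sent to a
# code LLM (Codex in the thesis) to draft a logical form, which is then
# bound to real KB entities and relations in a separate step.
from rank_bm25 import BM25Okapi

# Hypothetical exemplar pool of (question, gold logical form) pairs.
exemplars = [
    ("who wrote the novel dune", "(JOIN book.written_by (ENTITY Dune))"),
    ("what is the capital of france", "(JOIN location.capital (ENTITY France))"),
]

tokenized_pool = [question.split() for question, _ in exemplars]
bm25 = BM25Okapi(tokenized_pool)

def build_prompt(question: str, k: int = 2) -> str:
    """Retrieve the k most similar exemplars and format a few-shot prompt."""
    top = bm25.get_top_n(question.split(), exemplars, n=k)
    shots = "\n".join(f"Q: {q}\nLF: {lf}" for q, lf in top)
    return f"{shots}\nQ: {question}\nLF:"

# The LLM would complete the final "LF:" line with a draft logical form.
print(build_prompt("who wrote the book neuromancer"))
```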

Files

Original bundle

Name: Li_Tianle.pdf
Size: 2.94 MB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission