UWSpace is currently experiencing technical difficulties resulting from its recent migration to a new version of its software. These technical issues are not affecting the submission and browse features of the site. UWaterloo community members may continue submitting items to UWSpace. We apologize for the inconvenience, and are actively working to resolve these technical issues.
 

Towards Measuring Coherence in Poem Generation

Loading...
Thumbnail Image

Date

2023-01-11

Authors

Mohseni Kiasari, Peyman

Journal Title

Journal ISSN

Volume Title

Publisher

University of Waterloo

Abstract

Large language models (LLM) based on transformer architecture and trained on massive corpora have gained prominence as text-generative models in the past few years. Even though large language models are very adept at memorizing and generating long sequences of text, their ability to generate truly novel and creative texts including poetry lines is limited. On the other hand, past research has shown that variational autoencoders (VAE) can generate original poetic lines adhering to the stylistic characteristics of the training corpus. Originality and stylistic adherence of lines generated by VAEs can be partially attributed to the fact that, firstly, VAEs can be successfully trained on small highly curated corpora in a given style, and secondly, VAEs with a recurrent neural network architecture has a relatively low memorization capacity compared to transformer networks, which leads to the generation of more creative texts. VAEs, however, are limited to producing short sentence-level texts due to fewer trainable parameters, compared to LLMs. As a result, VAEs can only generate independent poetic lines, rather than complete and coherent poems. In this thesis, we propose a new model of coherence scoring that allows the system to rank independent lines generated by a VAE and construct a coherent poem. The scoring model is based on BERT, fine-tuned as a coherence evaluator. We propose a novel training schedule for fine-tuning BERT, during which we show the system different types of lines as negative examples: lines sampled from the same vs. different poems. The results of the human evaluation show that participants perceive poems constructed by this method to be more coherent than randomly sampled lines.

Description

Keywords

Coherence, Poem Generation, Text Generation, natural language processing

LC Keywords

Citation