Controlled Generation of Stylized Text Using Semantic and Phonetic Representations

dc.contributor.advisor: Vechtomova, Olga
dc.contributor.author: Gudmundsson, Egill Ian
dc.date.accessioned: 2022-01-21T16:56:04Z
dc.date.available: 2022-01-21T16:56:04Z
dc.date.issued: 2022-01-21
dc.date.submitted: 2022-01-15
dc.description.abstract: Neural networks are a popular choice of model for text generation. Variational autoencoders have been shown to be good at both reconstructing text and generating novel text. However, controlling specific aspects of the generated text (e.g., length, semantics, cadence) has proven more difficult. Disentanglement and controlled text generation have thus become areas of interest, with approaches varying according to the aspects one wishes to control. In this work we study controllable generation of lyric text based on semantic and phonetic criteria. The phonetic information takes the form of generalized phonetic patterns. A Bag-of-Words Variational Autoencoder (VAE) extracts and models the semantic information, while a phonetic pattern VAE handles the phonetic information. Each uses several regularization techniques for its latent space, and the information from both is fed to a lyrics decoder that generates novel lyric lines satisfying both the Bag-of-Words and phonetic constraints. The experiments show that our model can learn to reconstruct phonetic patterns extracted from text and use them, together with the Bag-of-Words representations, to reconstruct the original lyric lines. Combined, the learned representations of phonetic patterns and Bag-of-Words constraints can be used to generate new lyrics. [en]
dc.identifier.uri: http://hdl.handle.net/10012/17941
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: artificial intelligence [en]
dc.subject: machine learning [en]
dc.subject: natural language processing [en]
dc.subject: phonetics [en]
dc.subject: stylized text [en]
dc.subject: text generation [en]
dc.subject: lyrics [en]
dc.title: Controlled Generation of Stylized Text Using Semantic and Phonetic Representations [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Vechtomova, Olga
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Gudmundsson_Egill.pdf
Size: 1.09 MB
Format: Adobe Portable Document Format
Description: Master's thesis

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission