A Comparison of Unsupervised Topic Modelling Techniques for Qualitative Data Analysis of Online Communities

dc.contributor.advisor: Wallace, James
dc.contributor.author: Kaur, Amandeep
dc.date.accessioned: 2024-07-25T19:54:04Z
dc.date.available: 2024-07-25T19:54:04Z
dc.date.issued: 2024-07-25
dc.date.submitted: 2024-07-18
dc.description.abstract: Social media is a rich and influential source of information for qualitative researchers. However, its volume and diversity present significant challenges, which computational techniques such as topic modelling can help address. Qualitative researchers nonetheless often struggle to use these techniques because they lack programming expertise and are concerned about preserving the nuanced aspects of their research, such as contextual understanding, subjective interpretation, and the ethical considerations surrounding their data. To address this issue, this thesis explores the integration of BERTopic, an advanced Large Language Model (LLM)-based method, into the Computational Thematic Analysis (CTA) Toolkit to support qualitative data analysis of social media. We conducted interviews and hands-on evaluations in which qualitative researchers compared topics from three topic modelling techniques: LDA, NMF, and BERTopic. Participants prioritized topic relevance, logical organization, and the capacity to reveal unexpected relationships within the data, valuing detailed, coherent clusters for deeper understanding and actionable insights. BERTopic was favored by 8 of the 12 participants for its ability to uncover hidden connections. These findings underscore the transformative potential of LLM-based tools to provide deeper, more nuanced insights for qualitative analysis of social media data. [en]
dc.identifier.uri: http://hdl.handle.net/10012/20741
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.relation.uri: https://git.uwaterloo.ca/jrwallace/computational-thematic-analysis-toolkit [en]
dc.subject: human computer interaction [en]
dc.subject: qualitative analysis [en]
dc.subject: computational linguistics [en]
dc.subject: large language models [en]
dc.subject: social media analysis [en]
dc.title: A Comparison of Unsupervised Topic Modelling Techniques for Qualitative Data Analysis of Online Communities [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Wallace, James
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Kaur_Amandeep.pdf
Size: 43.96 MB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission