Please use this identifier to cite or link to this item: http://hdl.handle.net/10012/4081

Title: The Use Of Kullback-Leibler Divergence In Opinion Retrieval
Authors: Cen, Kun
Keywords: Opinion Retrieval; Kullback-Leibler Divergence
Approved Date: 1-Oct-2008
Date Submitted: 24-Sep-2008
Abstract: With the large volume of subjective content in online documents, there is a clear need for an information retrieval system that retrieves documents containing opinions about the topic expressed in a user’s query. In recent years, blogs, a new publishing medium, have attracted large numbers of people who express personal opinions on all kinds of topics in response to real-world events. The opinionated nature of blogs makes them an interesting new research area for opinion retrieval, and the identification and extraction of subjective content from blogs has become the subject of several research projects. In this thesis, four novel methods are proposed to retrieve blog posts that express opinions about given topics. The first method uses the Kullback-Leibler divergence (KLD) to weight a lexicon of subjective adjectives that occur around query terms. The second method takes into account the distances between query terms and subjective adjectives, using distance-based KLD scores of the adjectives for document re-ranking. The third method calculates KLD scores of subjective adjectives for predefined query categories. The fourth method uses collocates, words that co-occur with query terms in the corpus, to construct the subjective lexicon automatically; the KLD scores of the collocates are then calculated and used for document ranking. Four groups of experiments are conducted to evaluate the proposed methods on the TREC test collections. The results are compared with baseline systems to determine the effectiveness of using KLD in opinion retrieval. Further studies are recommended to explore more sophisticated approaches to identifying subjectivity and promising techniques for extracting opinions.
Program: Management Sciences
Department: Management Sciences
Degree: Master of Applied Science
URI: http://hdl.handle.net/10012/4081
Appears in Collections: Faculty of Engineering Theses and Dissertations; Electronic Theses and Dissertations (UW)
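
Illustrative note on the abstract: the first method weights subjective adjectives near query terms by a KLD score, but this record does not give the formula. The sketch below is therefore only an assumption about what such a weighting might look like. It uses the standard per-term contribution to the Kullback-Leibler divergence, p(w|window) * log2(p(w|window) / p(w|collection)). The function name `kld_term_scores`, its parameters, and the smoothing constant `eps` are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter


def kld_term_scores(window_tokens, collection_tokens, lexicon, eps=1e-9):
    """Score lexicon terms by their per-term KL divergence contribution.

    window_tokens: tokens found near query terms (assumed input).
    collection_tokens: tokens from the whole collection (assumed input).
    lexicon: candidate subjective terms to score.
    eps: floor for the collection probability, to avoid division by zero.
    """
    window_counts = Counter(window_tokens)
    collection_counts = Counter(collection_tokens)
    n_window = sum(window_counts.values())
    n_collection = sum(collection_counts.values())

    scores = {}
    for term in lexicon:
        # Probability of the term in the text around query terms.
        p = window_counts[term] / n_window if n_window else 0.0
        if p == 0.0:
            # A term never seen near query terms contributes nothing.
            scores[term] = 0.0
            continue
        # Probability of the term in the collection, floored at eps.
        q = collection_counts[term] / n_collection if n_collection else 0.0
        q = max(q, eps)
        scores[term] = p * math.log2(p / q)
    return scores
```

Under this assumption, a term that is proportionally more frequent near query terms than in the collection as a whole receives a positive score, and such scores could then serve as lexicon weights when re-ranking documents.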

Files in This Item:

File: Thesis_Kun_Cen.pdf
Size: 905.37 kB
Format: Adobe PDF


This item is protected by original copyright

All items in UWSpace are protected by copyright, with all rights reserved.

 
