An Investigation of Word Sense Disambiguation for Improving Lexical Chaining

Enss, Matthew

dc.contributor.author	Enss, Matthew	en
dc.date.accessioned	2007-05-08T14:01:33Z
dc.date.available	2007-05-08T14:01:33Z
dc.date.issued	2006	en
dc.date.submitted	2006	en
dc.description.abstract	This thesis investigates how word sense disambiguation affects lexical chains and proposes an improved model for lexical chaining in which word sense disambiguation is performed prior to lexical chaining. A lexical chain is a set of words from a document that are related in meaning. Lexical chains can be used to identify the dominant topics in a document, as well as where changes in topic occur. This makes them useful for applications such as topic segmentation and document summarization. However, polysemous words are an inherent problem for algorithms that find lexical chains, because the intended meaning of a polysemous word must be determined before its semantic relations to other words can be determined. For example, the word "bank" should only be placed in a chain with "money" if, in the context of the document, "bank" refers to a place that deals with money rather than a river bank. The process by which the intended senses of polysemous words are determined is word sense disambiguation. To date, lexical chaining algorithms have performed word sense disambiguation as part of the overall process of building lexical chains. Because the intended senses of polysemous words must be determined before words can be properly chained, we propose that word sense disambiguation should be performed before lexical chaining occurs. Furthermore, if word sense disambiguation is performed prior to lexical chaining, then it can be done with any available disambiguation method, without regard to how lexical chains will be built afterwards. Therefore, the most accurate available method for word sense disambiguation should be applied prior to the creation of lexical chains. We perform an experiment to demonstrate the validity of the proposed model. We compare the lexical chains produced in two cases: (1) lexical chaining is performed as normal on a corpus of documents that has not been disambiguated; (2) lexical chaining is performed on the same corpus, but all the words have been correctly disambiguated beforehand. We show that the lexical chains created in the second case are more correct than the chains created in the first. This result demonstrates that accurate word sense disambiguation performed prior to the creation of lexical chains does lead to better lexical chains being produced, confirming that our model for lexical chaining is an improvement upon previous approaches.	en
dc.format	application/pdf	en
dc.format.extent	303578 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10012/2938
dc.language.iso	en	en
dc.pending	false	en
dc.publisher	University of Waterloo	en
dc.rights	Copyright: 2006, Enss, Matthew. All rights reserved.	en
dc.subject	Computer Science	en
dc.subject	lexical chains	en
dc.subject	word sense disambiguation	en
dc.subject	computational linguistics	en
dc.subject	natural language processing	en
dc.title	An Investigation of Word Sense Disambiguation for Improving Lexical Chaining	en
dc.type	Master Thesis	en
uws-etd.degree	Master of Mathematics	en
uws-etd.degree.department	School of Computer Science	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en
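
The disambiguate-first model described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the sense labels, the `SENSE_TOPIC` table, and the `build_chains` function are all hypothetical, and "related in meaning" is reduced to sharing a hand-assigned topic. The point is only that once each word carries its intended sense, "bank" joins the "money" chain in one document and the "river" chain in another.

```python
from typing import Dict, List

# Hypothetical toy sense inventory: each sense label maps to a topic.
# A real system would consult a lexical resource instead of this table.
SENSE_TOPIC: Dict[str, str] = {
    "bank#finance": "finance",
    "bank#river": "geography",
    "money#1": "finance",
    "loan#1": "finance",
    "river#1": "geography",
    "water#1": "geography",
}


def build_chains(senses: List[str]) -> List[List[str]]:
    """Group already-disambiguated senses into chains by shared topic.

    Disambiguation is assumed to have happened beforehand, so the chainer
    never has to guess which meaning of a polysemous word is intended.
    """
    chains: Dict[str, List[str]] = {}
    for sense in senses:
        topic = SENSE_TOPIC[sense]
        chains.setdefault(topic, []).append(sense)
    # Dicts preserve insertion order, so chains appear in order of first use.
    return list(chains.values())


# Two hypothetical documents whose words were disambiguated in advance.
financial_doc = ["bank#finance", "money#1", "loan#1", "river#1"]
river_doc = ["bank#river", "river#1", "water#1", "money#1"]

financial_chains = build_chains(financial_doc)
river_chains = build_chains(river_doc)
```

Because the chainer only sees sense labels, any disambiguation method could be substituted upstream without changing `build_chains`, which mirrors the separation the abstract argues for.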

Files

Original bundle

Name: mjrenss2006.pdf
Size: 296.46 KB
Format: Adobe Portable Document Format

Collections

Theses
Computer Science