An Investigation of Word Sense Disambiguation for Improving Lexical Chaining

dc.contributor.authorEnss, Matthewen
dc.date.accessioned2007-05-08T14:01:33Z
dc.date.available2007-05-08T14:01:33Z
dc.date.issued2006en
dc.date.submitted2006en
dc.description.abstractThis thesis investigates how word sense disambiguation affects lexical chains, as well as proposing an improved model for lexical chaining in which word sense disambiguation is performed prior to lexical chaining. A lexical chain is a set of words from a document that are related in meaning. Lexical chains can be used to identify the dominant topics in a document, as well as where changes in topic occur. This makes them useful for applications such as topic segmentation and document summarization. <br /><br /> However, polysemous words are an inherent problem for algorithms that find lexical chains as the intended meaning of a polysemous word must be determined before its semantic relations to other words can be determined. For example, the word "bank" should only be placed in a chain with "money" if in the context of the document "bank" refers to a place that deals with money, rather than a river bank. The process by which the intended senses of polysemous words are determined is word sense disambiguation. To date, lexical chaining algorithms have performed word sense disambiguation as part of the overall process building lexical chains. Because the intended senses of polysemous words must be determined before words can be properly chained, we propose that word sense disambiguation should be performed before lexical chaining occurs. Furthermore, if word sense disambiguation is performed prior to lexical chaining, then it can be done with any available disambiguation method, without regard to how lexical chains will be built afterwards. Therefore, the most accurate available method for word sense disambiguation should be applied prior to the creation of lexical chains. <br /><br /> We perform an experiment to demonstrate the validity of the proposed model. We compare the lexical chains produced in two cases: <ol> <li>Lexical chaining is performed as normal on a corpus of documents that has not been disambiguated. </li> <li>Lexical chaining is performed on the same corpus, but all the words have been correctly disambiguated beforehand. </li></ol> We show that the lexical chains created in the second case are more correct than the chains created in the first. This result demonstrates that accurate word sense disambiguation performed prior to the creation of lexical chains does lead to better lexical chains being produced, confirming that our model for lexical chaining is an improvement upon previous approaches.en
dc.formatapplication/pdfen
dc.format.extent303578 bytes
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/10012/2938
dc.language.isoenen
dc.pendingfalseen
dc.publisherUniversity of Waterlooen
dc.rightsCopyright: 2006, Enss, Matthew. All rights reserved.en
dc.subjectComputer Scienceen
dc.subjectlexical chainsen
dc.subjectword sense disambiguationen
dc.subjectcomputational linguisticsen
dc.subjectnatural language processingen
dc.titleAn Investigation of Word Sense Disambiguation for Improving Lexical Chainingen
dc.typeMaster Thesisen
uws-etd.degreeMaster of Mathematicsen
uws-etd.degree.departmentSchool of Computer Scienceen
uws.peerReviewStatusUnrevieweden
uws.scholarLevelGraduateen
uws.typeOfResourceTexten

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
mjrenss2006.pdf
Size:
296.46 KB
Format:
Adobe Portable Document Format