Exploitation of Redundant Inverse Term Frequency for Answer Extraction

dc.contributor.author: Lynam, Thomas Richard [en]
dc.date.accessioned: 2006-08-22T14:27:02Z
dc.date.available: 2006-08-22T14:27:02Z
dc.date.issued: 2002 [en]
dc.date.submitted: 2002 [en]
dc.description.abstract: An automatic question answering system must find, within a corpus, short factual answers to questions posed in natural language. The process involves analyzing the question, retrieving information related to the question, and extracting answers from the retrieved information. This thesis presents a novel approach to answer extraction in an automated question answering (QA) system. The answer extraction approach is an extension of the MultiText QA system. This system employs a question analysis component to examine the question and to produce query terms for the retrieval component, which extracts several document fragments from the corpus. The answer extraction component selects a few short answers from these fragments. This thesis describes the design and evaluation of the Redundant Inverse Term Frequency (RITF) answer extraction component. The RITF algorithm locates and evaluates words from the passages that are likely to be associated with the answer. Answers are selected by finding short fragments of text that contain the most likely words based on: the frequency of the words in the corpus, the number of fragments in which the word occurs, the rank of the passages as determined by the IR component, the distance of the word from the centre of the fragment, and category information found through question analysis. RITF makes a substantial contribution to overall results, nearly doubling the Mean Reciprocal Rank (MRR), a standard measure for evaluating QA systems. [en]
dc.format: application/pdf [en]
dc.format.extent: 409863 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10012/1190
dc.language.iso: en [en]
dc.pending: false [en]
dc.publisher: University of Waterloo [en]
dc.rights: Copyright: 2002, Lynam, Thomas. All rights reserved. [en]
dc.subject: Computer Science [en]
dc.subject: Question Answering [en]
dc.subject: Information Retrieval [en]
dc.subject: Natural Language Processing [en]
dc.subject: Retrieval Models and Ranking [en]
dc.subject: Query Intent [en]
dc.title: Exploitation of Redundant Inverse Term Frequency for Answer Extraction [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: School of Computer Science [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]
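
The abstract reports its headline result as a change in Mean Reciprocal Rank (MRR). For readers unfamiliar with the measure, the following is a minimal sketch of how MRR is conventionally computed; it is not taken from the thesis. The function name `mean_reciprocal_rank` and the input convention (one entry per question, giving the 1-based rank of the first correct answer, or `None` when no returned answer is correct) are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all questions.

    Each entry is the 1-based rank of the first correct answer returned
    for one question, or None if no returned answer was correct
    (such a question contributes 0 to the sum).
    This input convention is an assumption of this sketch.
    """
    if not first_correct_ranks:
        return 0.0  # no questions: define the score as 0 rather than divide by zero

    total = 0.0
    for rank in first_correct_ranks:
        if rank is None:
            continue  # no correct answer: reciprocal rank is 0
        if rank < 1:
            raise ValueError("ranks are 1-based and must be >= 1")
        total += 1.0 / rank
    return total / len(first_correct_ranks)


# Four questions: correct at rank 1, rank 2, never, and rank 4.
score = mean_reciprocal_rank([1, 2, None, 4])
print(score)  # (1 + 0.5 + 0 + 0.25) / 4 = 0.4375
```

Because a question with no correct answer contributes zero, "nearly doubling" MRR can come from answering more questions at all, from moving correct answers closer to rank 1, or from both; the measure alone does not distinguish these.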

Files

Original bundle
Name: trlynam2002.pdf
Size: 400.26 KB
Format: Adobe Portable Document Format