Exploitation of Redundant Inverse Term Frequency for Answer Extraction

Lynam, Thomas Richard

dc.contributor.author	Lynam, Thomas Richard	en
dc.date.accessioned	2006-08-22T14:27:02Z
dc.date.available	2006-08-22T14:27:02Z
dc.date.issued	2002	en
dc.date.submitted	2002	en
dc.description.abstract	An automatic question answering (QA) system must find, within a corpus, short factual answers to questions posed in natural language. The process involves analyzing the question, retrieving information related to the question, and extracting answers from the retrieved information. This thesis presents a novel approach to answer extraction, developed as an extension of the MultiText QA system. MultiText employs a question analysis component that examines the question and produces query terms for the retrieval component, which extracts several document fragments from the corpus. The answer extraction component then selects a few short answers from these fragments. This thesis describes the design and evaluation of the Redundant Inverse Term Frequency (RITF) answer extraction component. The RITF algorithm locates and evaluates words in the retrieved fragments that are likely to be associated with the answer. Answers are selected by finding short fragments of text that contain the most likely words, based on: the frequency of the word in the corpus, the number of fragments in which the word occurs, the rank of the fragments as determined by the information retrieval (IR) component, the distance of the word from the centre of the fragment, and category information found through question analysis. RITF contributes substantially to overall results, nearly doubling the Mean Reciprocal Rank (MRR), a standard measure for evaluating QA systems.	en
dc.format	application/pdf	en
dc.format.extent	409863 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/10012/1190
dc.language.iso	en	en
dc.pending	false	en
dc.publisher	University of Waterloo	en
dc.rights	Copyright: 2002, Lynam, Thomas. All rights reserved.	en
dc.subject	Computer Science	en
dc.subject	Question Answering	en
dc.subject	Information Retrieval	en
dc.subject	Natural Language Processing	en
dc.subject	Retrieval Models and Ranking	en
dc.subject	Query Intent	en
dc.title	Exploitation of Redundant Inverse Term Frequency for Answer Extraction	en
dc.type	Master Thesis	en
uws-etd.degree	Master of Mathematics	en
uws-etd.degree.department	School of Computer Science	en
uws.peerReviewStatus	Unreviewed	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en
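
The abstract reports results in terms of Mean Reciprocal Rank (MRR) without defining it. Under the usual definition, each question scores the reciprocal of the rank of its first correct answer (0 if no correct answer is returned), and MRR is the mean of those scores over all questions. The following is a minimal sketch of that definition only, not code from the thesis; the function name and the input format (one rank per question, None for no correct answer) are assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(first_correct_ranks: Sequence[Optional[int]]) -> float:
    """Return the MRR over a set of questions.

    Each entry is the 1-based rank of the first correct answer returned
    for one question, or None if no returned answer was correct.
    (Hypothetical input format, chosen for this illustration.)
    """
    if not first_correct_ranks:
        raise ValueError("at least one question is required")

    total = 0.0
    for rank in first_correct_ranks:
        if rank is None:
            continue  # a question with no correct answer contributes 0
        if rank < 1:
            raise ValueError("ranks are 1-based")
        total += 1.0 / rank

    # Divide by the number of questions, including unanswered ones.
    return total / len(first_correct_ranks)


# Four questions: correct at rank 1, rank 2, never, and rank 4.
example_mrr = mean_reciprocal_rank([1, 2, None, 4])
```

Here the four questions contribute 1, 1/2, 0 and 1/4, so the example value is their mean. Because unanswered questions stay in the denominator, a system that returns a correct answer for more questions raises its MRR even when those answers are ranked low.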

Files

Original bundle

Name: trlynam2002.pdf
Size: 400.26 KB
Format: Adobe Portable Document Format

Collections

Theses
Computer Science