Weighting Document Genre in Enterprise Search

Date

2007-08-14T17:40:27Z

Authors

Yeung, Peter Chun Kai

Publisher

University of Waterloo

Abstract

Building an Enterprise Search system involves many challenges that are not present in Web search. Searching a corporate collection is influenced both by the structure of the data in the collection and by the policies of the corporation, and these structures and policies may differ from corporation to corporation and from collection to collection. In particular, an Enterprise Search system must take a document's genre into account. Examples of document genres within a corporate collection include FAQs, white papers, technical reports, memos, emails, and chat messages. Depending on an individual's current work task, it may be appropriate to give one genre greater weight than another when processing a search request, and this weighting may change as the individual's work task changes.

The work presented in this thesis adapts the Okapi BM25 scoring function to weight term frequency according to the relevance of a document's genre to a work task. The method uses two user-provided resources, relevance judgments and clickthrough data, to estimate a realistic weight for each task-genre relationship. In this way, the method matches the purpose of each search request with the purpose of each document, so that the documents returned are those suited to the user's current need.

The method has been incorporated into a prototype search engine, X-Site, currently deployed on a corporate intranet. X-Site is a contextual search engine that uses the relationships between work tasks and document genres to improve search precision for software engineers. The system provides a customized, user-controlled means of refining search results to suit the user's task context. Through X-Site, an employee can issue a single search request and access documents from the Internet, a corporate intranet, and Lotus Notes databases.
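The abstract does not give the adapted scoring formula, so the following is only a minimal sketch of one way a task-genre weight could enter Okapi BM25: the raw term frequency is multiplied by the weight before the usual saturation step. The function name `bm25_genre_score`, the `TASK_GENRE_WEIGHT` table, and all weight values are hypothetical; in the thesis the weights are estimated from relevance judgments and clickthrough data, which this sketch does not attempt.

```python
import math
from collections import Counter

# Hypothetical task-genre weights. The values are invented for illustration;
# the thesis estimates them from relevance judgments and clickthrough data.
TASK_GENRE_WEIGHT = {
    ("debugging", "faq"): 2.0,
    ("debugging", "memo"): 0.5,
}


def bm25_genre_score(query_terms, doc_terms, genre, task,
                     doc_freq, num_docs, avg_len, k1=1.2, b=0.75):
    """Score one document with BM25, scaling term frequency by a
    task-genre weight (an assumed form, not the thesis's exact formula)."""
    # Unknown task-genre pairs fall back to a neutral weight of 1.0.
    weight = TASK_GENRE_WEIGHT.get((task, genre), 1.0)
    tf_counts = Counter(doc_terms)
    # Standard BM25 document length normalization.
    length_norm = 1.0 - b + b * len(doc_terms) / avg_len

    score = 0.0
    for term in query_terms:
        tf = weight * tf_counts[term]  # genre-weighted term frequency
        if tf == 0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative IDF variant, so rare terms always contribute more.
        idf = math.log(1.0 + (num_docs - df + 0.5) / (df + 0.5))
        score += idf * tf * (k1 + 1.0) / (tf + k1 * length_norm)
    return score


if __name__ == "__main__":
    doc = ["null", "pointer", "crash", "fix"]
    df = {"crash": 2, "fix": 3}
    for genre in ("faq", "report", "memo"):
        s = bm25_genre_score(["crash", "fix"], doc, genre, "debugging",
                             df, num_docs=10, avg_len=4.0)
        print(genre, round(s, 3))
```

Because the BM25 saturation term grows with term frequency, the same text scores higher when its genre carries a larger weight for the current task, and switching the task changes the ranking without re-indexing the documents.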

Keywords

Document Genre, Enterprise Search