Please use this identifier to cite or link to this item: http://hdl.handle.net/10012/3160

Title: Weighting Document Genre in Enterprise Search
Authors: Yeung, Peter Chun Kai
Keywords: Document Genre; Enterprise Search
Approved Date: 14-Aug-2007
Date Submitted: 16-Jul-2007
Abstract: Building an Enterprise Search system involves many challenges that are not present in Web search. Searching a corporate collection is influenced both by the structure of the data in the collection and by the policies of the corporation. These structures and policies may differ from corporation to corporation, and from collection to collection. In particular, an Enterprise Search system must take a document's genre into account. Examples of document genres within a corporate collection include FAQs, white papers, technical reports, memos, emails and chat messages. Depending on an individual's current work task, it might be appropriate to give one genre a greater weight than another when processing a search request. Moreover, this weighting may change as the individual's work task changes. The work presented in this thesis adapts the Okapi BM25 scoring function to weight term frequency based on the relevance of a document genre to a work task. The method uses two user-provided resources, relevance judgments and clickthrough data, to estimate a realistic weight for each task-genre relationship. In this way, the method matches the purpose of each search request with the purpose of each document, so that the documents returned are those suited to the user's current need. The method has been incorporated into a prototype search engine, X-Site, currently deployed on a corporate intranet. X-Site is a contextual search engine that uses the relationships between work tasks and document genres to improve search precision for software engineers. The system provides a customized, user-controlled means of refining search results to suit the user's task context. Through X-Site, each employee can issue a single search request that covers documents from the Internet, a corporate intranet, and Lotus Notes databases.
Program: Computer Science
Department: School of Computer Science
Degree: Master of Mathematics
URI: http://hdl.handle.net/10012/3160
Appears in Collections: Electronic Theses and Dissertations (UW)
Faculty of Mathematics Theses and Dissertations
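
The abstract describes weighting term frequency in Okapi BM25 by the relevance of a document's genre to the user's work task. The record does not give the thesis's actual formula, so the following is only a minimal sketch under stated assumptions: the function name `bm25_genre_score`, the `genre_weights` lookup table, the task label "debugging", the choice to multiply the raw term frequency by the weight, and the non-negative IDF variant are all illustrative and not taken from the thesis.

```python
import math
from collections import Counter


def bm25_genre_score(query_terms, doc_terms, doc_genre, task, genre_weights,
                     doc_freq, num_docs, avg_doc_len, k1=1.2, b=0.75):
    """Score one document with BM25, scaling term frequency by a task-genre weight.

    Assumption: the weight multiplies the raw term frequency before BM25
    saturation is applied. The thesis may combine the two differently.
    """
    # Hypothetical lookup; task-genre pairs with no estimate stay neutral (1.0).
    weight = genre_weights.get((task, doc_genre), 1.0)
    term_counts = Counter(doc_terms)
    # Standard BM25 document-length normalisation term.
    length_norm = k1 * (1.0 - b + b * len(doc_terms) / avg_doc_len)

    score = 0.0
    for term in query_terms:
        weighted_tf = weight * term_counts[term]
        if weighted_tf == 0:
            continue
        n = doc_freq.get(term, 0)
        # Non-negative IDF variant, chosen here only for simplicity.
        idf = math.log(1.0 + (num_docs - n + 0.5) / (n + 0.5))
        score += idf * weighted_tf * (k1 + 1.0) / (weighted_tf + length_norm)
    return score


# Illustrative weights: for a hypothetical "debugging" task, FAQs count more
# than memos. Real weights would be estimated from relevance judgments and
# clickthrough data, as the abstract describes.
weights = {("debugging", "faq"): 2.0, ("debugging", "memo"): 0.5}
doc = ["build", "error", "fix"]
faq_score = bm25_genre_score(["error"], doc, "faq", "debugging", weights,
                             {"error": 2}, 10, 3.0)
memo_score = bm25_genre_score(["error"], doc, "memo", "debugging", weights,
                              {"error": 2}, 10, 3.0)
```

With identical text, the FAQ scores higher than the memo for this task because BM25 increases monotonically with the (weighted) term frequency. Changing the task changes the weight that is looked up, which is how the ranking could follow the user's current work task.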

Files in This Item:

File: peter_yeung.pdf
Size: 2.09 MB
Format: Adobe PDF