
Weighting Document Genre in Enterprise Search

dc.contributor.author: Yeung, Peter Chun Kai
dc.date.accessioned: 2007-08-14T17:40:27Z
dc.date.available: 2007-08-14T17:40:27Z
dc.date.issued: 2007-08-14T17:40:27Z
dc.date.submitted: 2007-07-16
dc.description.abstract: The creation of an Enterprise Search system involves many challenges that are not present in Web search. Searching a corporate collection is influenced both by the structure of the data present in the collection and by the policies of the corporation. These structures and policies may differ from corporation to corporation, and from collection to collection. In particular, an Enterprise Search system must take a document's genre into account. Examples of document genre within a corporate collection might include FAQs, white papers, technical reports, memos, emails and chat messages. Depending on an individual's current work task, it might be appropriate to give one genre a greater weight than another during the processing of a search request. Moreover, this weighting may change as the individual's work task changes. The work presented in this thesis adapts the Okapi BM25 scoring function to weight term frequency based on the relevance of a document genre to a work task. The method utilizes two user-provided resources, relevance judgments and clickthrough data, to estimate a realistic weight for each task-genre relationship. Using this approach, the method matches the purpose of each user search request with the purpose of each document. Therefore, the proper documents are returned to the user and her/his need can be fulfilled. The method has been incorporated into a prototype search engine, X-Site, currently deployed on a corporate intranet. X-Site is a contextual search engine that uses the relationships between work tasks and document genres to improve search precision for software engineers. The system provides a customized and user-controlled means of refining search results to suit the task context of a user. Through X-Site, each employee can make a single search request and has access to documents from the Internet, a corporate intranet, and Lotus Notes databases.
dc.format.extent: 2135222 bytes
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10012/3160
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: Document Genre
dc.subject: Enterprise Search
dc.subject.program: Computer Science
dc.title: Weighting Document Genre in Enterprise Search
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: School of Computer Science
uws.peerReviewStatus: Unreviewed
uws.scholarLevel: Graduate
uws.typeOfResource: Text
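
The abstract describes adapting Okapi BM25 so that term frequency is weighted by how relevant a document's genre is to the user's work task. The record does not give the thesis's formula, so the following is only a minimal sketch under stated assumptions: the task-genre weight simply multiplies the raw term frequency before BM25 saturation, and the function name, the weight table, and the toy statistics are all hypothetical.

```python
import math
from collections import Counter


def bm25_genre_score(query, doc_terms, genre, task, genre_weights,
                     doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """BM25 score with term frequency scaled by a task-genre weight.

    Assumption (not taken from the thesis): the weight multiplies the raw
    term frequency. A missing (task, genre) pair falls back to 1.0, which
    gives plain BM25.
    """
    tf = Counter(doc_terms)
    weight = genre_weights.get((task, genre), 1.0)
    # Standard BM25 document-length normalisation term.
    norm = k1 * (1 - b + b * len(doc_terms) / avg_len)
    score = 0.0
    for term in query:
        f = weight * tf[term]
        if f == 0:
            continue
        df = doc_freq.get(term, 0)
        # Non-negative IDF variant.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        score += idf * f * (k1 + 1) / (f + norm)
    return score


# Hypothetical weights; the abstract says the real ones are estimated from
# relevance judgments and clickthrough data.
weights = {("debugging", "faq"): 2.0, ("debugging", "memo"): 0.5}
doc = ["install", "driver", "install"]
stats = dict(doc_freq={"install": 2}, n_docs=10, avg_len=3.0)

faq_score = bm25_genre_score(["install"], doc, "faq", "debugging", weights, **stats)
memo_score = bm25_genre_score(["install"], doc, "memo", "debugging", weights, **stats)
plain_score = bm25_genre_score(["install"], doc, "report", "debugging", weights, **stats)
```

In this sketch, two documents with identical text but different genres receive different scores for the same task, and the genre with the larger weight ranks higher.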

Files

Original bundle
Name: peter_yeung.pdf
Size: 2.04 MB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 257 B
Format: Item-specific license agreed upon to submission