dc.contributor.author: ALHARBI, AIMAN
dc.date.accessioned: 2016-08-03 19:50:50 (GMT)
dc.date.available: 2016-08-03 19:50:50 (GMT)
dc.date.issued: 2016-08-03
dc.date.submitted: 2016-08-02
dc.identifier.uri: http://hdl.handle.net/10012/10608
dc.description.abstract: Secondary assessors, individuals who do not originate search topics and are employed solely to judge the relevance of documents, have been found to differ in their relevance judgments. Their relevance judgments are used in constructing test collections, which play a significant role in evaluating search systems. These judgments are also used in e-discovery to assist with locating relevant material. To a large extent, our existing understanding of secondary assessors' judging behavior is limited to quantitative measurements. The goal of this thesis is to better understand the relevance judging behavior of secondary assessors. To this end, we conducted two user studies. The first study, which forms the main part of this thesis, was a think-aloud study, and may be the first such qualitative study of secondary assessors' judging behavior. The second study aimed to capture the uncertainty in secondary assessors' relevance judgments. We also examined the behavior of secondary assessors when judging multiple types of documents, based on the data from the think-aloud study. Data obtained through the think-aloud method gave us more in-depth insight into secondary assessors' relevance judging behavior. We were able to directly listen to and note their thoughts during the assigned search tasks. Based on this data, we found that relevance judgments are made with differing levels of certainty, ranging from low to high. We also found that the search topic, the document, and the assessor can each contribute to differing judgments. The think-aloud study also reveals preliminary evidence regarding how the amount of detail stated in a search topic's description influences the relevance judging behavior of secondary assessors.
To capture the uncertainty in secondary assessors' relevance judgments, we designed four user interfaces in our second user study. The objective was to study the uncertainty in secondary assessors' relevance judgments when the level of uncertainty is self-reported. We found that assessors tend to make high-certainty relevance judgments regardless of the consensus level of a document. In judging high-consensus documents, assessors' accuracy was lower when making low-certainty relevance judgments, and the judgments were more accurate and tended to agree with NIST assessors when making high-certainty relevance judgments. For low-consensus documents, we found assessors' accuracy to be low regardless of their certainty level. Finally, we found that assessors tend to spend less time when making high-certainty relevance judgments, regardless of the consensus level of the document. Further study of the behavior of secondary assessors when judging multiple types of documents found that relevance judgments are occasionally based on incorrect perceptions. We show how factors such as lack of familiarity, lack of understanding of the search topic, absence of keywords, and other reasons could be a source of not only incorrect relevance judgments, but also of correct ones. We also illustrate how the length of search topics and documents, and their level of difficulty, may further contribute to variation in the judgments. Overall, our research contributes to a more extensive, meaningful understanding of the behavior of secondary assessors. It establishes a foundation for future work on the impact of uncertainty in secondary assessors' relevance judgments. Our findings also show that assessor training and background, search topics, and document length should all be considered and given additional attention in order to obtain more reliable results. (en)
dc.language.iso: en (en)
dc.publisher: University of Waterloo (en)
dc.subject: Information Retrieval (en)
dc.subject: Relevance Assessment (en)
dc.subject: Secondary Assessors (en)
dc.title: Studying Relevance Judging Behavior of Secondary Assessors (en)
dc.type: Doctoral Thesis (en)
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science (en)
uws-etd.degree.discipline: Computer Science (en)
uws-etd.degree.grantor: University of Waterloo (en)
uws-etd.degree: Doctor of Philosophy (en)
uws.contributor.advisor: Smucker, Mark
uws.contributor.advisor: Clarke, Charles
uws.contributor.affiliation1: Faculty of Mathematics (en)
uws.published.city: Waterloo (en)
uws.published.country: Canada (en)
uws.published.province: Ontario (en)
uws.typeOfResource: Text (en)
uws.peerReviewStatus: Unreviewed (en)
uws.scholarLevel: Graduate (en)


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
