A Preference Judgment Interface for Authoritative Assessment

dc.contributor.author: Seifikar, Mahsa
dc.date.accessioned: 2023-02-06T16:41:09Z
dc.date.available: 2023-02-06T16:41:09Z
dc.date.issued: 2023-02-06
dc.date.submitted: 2023-01-30
dc.description.abstract: For offline evaluation of information retrieval systems, preference judgments have been demonstrated to be a superior alternative to graded or binary relevance judgments. In contrast to graded judgments, where each document is assigned to a pre-defined grade level, with preference judgments, assessors judge a pair of items presented side by side, indicating which is better. Unfortunately, preference judgments may require a larger number of judgments, even under an assumption of transitivity. Until recently, they also lacked well-established evaluation measures. Previous studies have explored various evaluation measures and proposed different approaches to address the perceived shortcomings of preference judgments. These studies focused on crowdsourced preference judgments, where assessors may lack the training and time to make careful judgments. They did not consider the case where assessors have been trained and provided with the time to carefully consider differences between items. We review the literature on algorithms and strategies for eliciting preference judgments, evaluation measures, interface design, and the use of crowdsourcing. In this thesis, we design and build a new framework for preference judgment called JUDGO, with components designed for expert reviewers and researchers. We also propose a new heap-like preference judgment algorithm that assumes transitivity and tolerates ties. With the help of our framework, NIST assessors found the top 10 items for each of 38 topics in the TREC 2022 Health Misinformation Track, with more than 2,200 judgments collected. Our analysis shows that assessors frequently use the search box feature, which enables them to highlight their own keywords in documents, but they are less interested in highlighting document text with the mouse. In response to additional feedback, we made some modifications to the initially proposed algorithm and highlighting features.
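
The heap-like, tie-tolerant top-k procedure described in the abstract can be illustrated with a short sketch. The Python code below is a minimal illustration under stated assumptions, not the thesis's actual JUDGO algorithm: it selects the top k of a pool of items using a binary max-heap whose comparisons are pairwise preference judgments, caches each judgment (which the transitivity assumption makes safe to reuse), and treats ties as leaving both items in place. The judge function, item names, and toy scores are hypothetical.

    # A minimal sketch (not the thesis's exact JUDGO algorithm) of heap-based
    # top-k selection driven by pairwise preference judgments. The judge
    # function and the item names below are illustrative assumptions.

    def find_top_k(items, judge, k=10):
        """Return the top-k items using a binary max-heap whose
        comparisons are pairwise preference judgments.

        judge(a, b) -> 1 if a is preferred, -1 if b is preferred, 0 for a tie.
        Transitivity is assumed, so each judgment can be cached and reused.
        """
        cache = {}

        def prefer(a, b):
            # Cache judgments in both directions so no pair is judged twice.
            if (a, b) not in cache:
                cache[(a, b)] = judge(a, b)
                cache[(b, a)] = -cache[(a, b)]
            return cache[(a, b)]

        heap = list(items)

        def sift_down(i, size):
            # Standard binary-heap sift-down, except that ties
            # (prefer == 0) trigger no swap, so tied items stay put.
            while True:
                left, right, best = 2 * i + 1, 2 * i + 2, i
                if left < size and prefer(heap[left], heap[best]) > 0:
                    best = left
                if right < size and prefer(heap[right], heap[best]) > 0:
                    best = right
                if best == i:
                    return
                heap[i], heap[best] = heap[best], heap[i]
                i = best

        # Build the max-heap bottom-up.
        size = len(heap)
        for i in range(size // 2 - 1, -1, -1):
            sift_down(i, size)

        # Repeatedly extract the most-preferred remaining item.
        top = []
        for _ in range(min(k, size)):
            top.append(heap[0])
            size -= 1
            heap[0] = heap[size]
            sift_down(0, size)
        return top


    # Example usage with a toy judge that compares integer "relevance" scores.
    if __name__ == "__main__":
        scores = {"d1": 3, "d2": 5, "d3": 5, "d4": 1, "d5": 4}
        toy_judge = lambda a, b: (scores[a] > scores[b]) - (scores[a] < scores[b])
        print(find_top_k(list(scores), toy_judge, k=3))  # ['d2', 'd3', 'd5']

Because a binary heap compares each item only against its parent and children, this style of top-k selection needs far fewer pairwise judgments than a full ranking of the pool, which is the practical motivation for a heap-like design.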
dc.identifier.uri: http://hdl.handle.net/10012/19151
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: preference judgment
dc.subject: offline evaluation
dc.subject: information retrieval
dc.title: A Preference Judgment Interface for Authoritative Assessment
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Clarke, Charles
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterlooo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Seifikar_Mahsa.pdf
Size: 5.66 MB
Format: Adobe Portable Document Format
Description: Master Thesis

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed to upon submission