
dc.contributor.author: Singh, Anmol
dc.date.accessioned: 2022-01-17 17:20:10 (GMT)
dc.date.available: 2022-01-17 17:20:10 (GMT)
dc.date.issued: 2022-01-17
dc.date.submitted: 2022-01-06
dc.identifier.uri: http://hdl.handle.net/10012/17889
dc.description.abstract: Dynamic Sampling is a non-uniform statistical sampling strategy based on S-CAL, a high-recall retrieval algorithm. It is used to construct statistical test collections for evaluating information retrieval systems. Dynamic Sampling has been shown to yield test collections comparable to or better than those built with pooling methods, at a fraction of the assessment effort. In this work, we adapt a high-recall retrieval system to run a Dynamic Sampling protocol for web-scale datasets. We use this to create relevance assessments for 30 topics from the TREC 2019 Medical Misinformation Track. We compare our relevance assessments to qrels created using two pooling-based approaches. We also compare the official NIST qrels, which were based on ClueWeb12B (7% of the full dataset), to qrels based on the full ClueWeb12 dataset. Our results suggest Dynamic Sampling yields a reasonably good test collection, with comparable or lower variance for most evaluation measures. For fixed-depth measures like Precision@K, the NIST qrels based on ClueWeb12B appear to have higher bias with respect to the other qrels, suggesting that it might be better to use qrels based on the full collection when possible. [en]
dc.language.iso: en [en]
dc.publisher: University of Waterloo [en]
dc.relation.uri: https://github.com/kshanmol/2019-med-misinfo-qrels [en]
dc.subject: information retrieval evaluation [en]
dc.title: Test collections for web-scale datasets using Dynamic Sampling [en]
dc.type: Master Thesis [en]
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.embargo.terms: 0 [en]
uws.contributor.advisor: Cormack, Gordon
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.typeOfResource: Text [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]





All items in UWSpace are protected by copyright, with all rights reserved.
