Test collections for web-scale datasets using Dynamic Sampling

Date

2022-01-17

Authors

Singh, Anmol

Advisor

Cormack, Gordon

Publisher

University of Waterloo

Abstract

Dynamic Sampling is a non-uniform statistical sampling strategy based on S-CAL, a high-recall retrieval algorithm. It is used to construct statistical test collections for evaluating information retrieval systems. Dynamic Sampling has been shown to yield test collections comparable to or better than those built with pooling methods, at a fraction of the assessment effort. In this work, we adapt a high-recall retrieval system to run a Dynamic Sampling protocol on web-scale datasets. We use it to create relevance assessments for 30 topics from the TREC 2019 Medical Misinformation Track. We compare our relevance assessments to qrels created using two pooling-based approaches. We also compare the official NIST qrels, which were based on ClueWeb12B (7% of the full dataset), to qrels based on the full ClueWeb12 dataset. Our results suggest that Dynamic Sampling yields a reasonably good test collection, with comparable or lower variance for most evaluation measures. For fixed-depth measures such as Precision@K, the NIST qrels based on ClueWeb12B appear to have higher bias relative to the other qrels, which suggests that qrels based on the full collection may be preferable when possible.

Keywords

information retrieval evaluation
