A Test Collection for Offline Evaluation of Recommender Systems

Chamani, Houmaan

dc.contributor.advisor	Smucker, Mark
dc.contributor.author	Chamani, Houmaan
dc.date.accessioned	2024-11-07T21:42:16Z
dc.date.available	2024-11-07T21:42:16Z
dc.date.issued	2024-11-07
dc.date.submitted	2024-10-28
dc.description.abstract	Recommendation systems have long been evaluated by collecting a large number of individuals' ratings for items and then dividing these ratings into training and test sets to see how well recommendation algorithms can predict individuals' preferences. A complaint about this approach is that the evaluation measures can use only a small number of known preferences and have no information about the majority of recommended items. Prior research has shown that offline evaluation of recommendation systems using a train/test split methodology may not agree with actual user preferences when all recommended items are judged by the user. To address this issue, we apply traditional information retrieval test collection construction techniques to movie recommendation. An information retrieval test collection is composed of documents, search topics, and relevance judgments that tell us which documents are relevant for each topic. For our test collection, each search topic is an individual who is looking for movies to watch. In other words, while the search topic is always "Please recommend me movies that I will be interested in watching," the context of the search topic changes to be the individual who is requesting the recommendations. When document collections are too large to be completely judged by assessors, the traditional approach is to use pooling, and we followed this approach in constructing our test collection. For each individual, we used their existing profile of rated movies as input to a wide range of recommendation algorithms to produce recommendations for movies not found in their profile. We then pooled these recommendations separately for each person and asked that person to rate the pooled movies. In addition to rating the pooled movies, each individual also re-rated a random sample of movies selected from their ratings profile so that we could measure their rating consistency.
The resulting new test collection consists of 51 individual ratings profiles totaling 123,104 ratings and 31,236 relevance judgments. In this thesis, we detail the creation of the test collection, analyze the individuals who make up its search topics, and analyze the collection's relevance judgments as well as other aspects.
dc.identifier.uri	https://hdl.handle.net/10012/21175
dc.language.iso	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.relation.uri	https://files.grouplens.org/datasets/movielens/ml-32m
dc.title	A Test Collection for Offline Evaluation of Recommender Systems
dc.type	Master Thesis
uws-etd.degree	Master of Applied Science
uws-etd.degree.department	Management Sciences
uws-etd.degree.discipline	Management Sciences
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.embargo.terms	0
uws.comment.hidden	Thanks for reviewing my thesis!
uws.contributor.advisor	Smucker, Mark
uws.contributor.affiliation1	Faculty of Engineering
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en
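
The per-person pooling step described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `build_pools`, the fixed pool depth, the explicit filtering of already-rated movies, and the toy movie and person identifiers are all assumptions made for illustration only.

```python
from collections import defaultdict


def build_pools(runs, profiles, depth=10):
    """Depth-k pooling, done separately for each person (hypothetical helper).

    runs:     {algorithm_name: {person_id: [movie_id, ...] ranked best-first}}
    profiles: {person_id: set of movie_ids the person has already rated}
    Returns   {person_id: set of movie_ids to send to that person for judging}.
    """
    pools = defaultdict(set)
    for ranked_by_person in runs.values():
        for person, ranking in ranked_by_person.items():
            seen = profiles.get(person, set())
            # Assumption: drop movies already in the profile, then keep
            # the top `depth` remaining recommendations from this algorithm.
            unseen = [movie for movie in ranking if movie not in seen]
            pools[person].update(unseen[:depth])
    return dict(pools)


# Toy data: two made-up algorithms, two made-up people.
runs = {
    "popularity": {"u1": ["m1", "m2", "m3"], "u2": ["m1", "m4", "m5"]},
    "item_knn": {"u1": ["m3", "m6", "m2"], "u2": ["m5", "m7", "m1"]},
}
profiles = {"u1": {"m1"}, "u2": {"m4"}}
pools = build_pools(runs, profiles, depth=2)
```

Each person's pool is the union of the top unseen recommendations across all algorithms, so a movie recommended by several algorithms is judged only once by that person.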

Files

Original bundle

Name: Houmaan_Chamani.pdf
Size: 2.66 MB
Format: Adobe Portable Document Format

Collections

Theses