Learning to Rank in the Age of Muppets

dc.contributor.author: Hu, Chengcheng
dc.date.accessioned: 2022-04-26T22:30:48Z
dc.date.available: 2022-04-26T22:30:48Z
dc.date.issued: 2022-04-26
dc.date.submitted: 2022-04-19
dc.description.abstract: The emergence of BERT in 2018 brought large gains in retrieval effectiveness across many tasks and domains, and steered the recent research landscape of IR toward transformer-based technologies. While researchers are fascinated by the power of BERT and related transformer models, the substantial computational cost of transformers has become an unavoidable problem. Meanwhile, in the shadow of BERT, "out-of-date" but still fairly effective techniques have been largely forgotten; learning to rank, for example, was among the most popular technologies a decade ago. In this work, we aim to answer two research questions. RQ1 asks whether using learning to rank as a filtering stage in a multi-stage reranking pipeline can improve the efficiency of transformer-based reranking without sacrificing effectiveness. RQ2 asks whether adding transformer-based features to the traditional learning-to-rank framework can increase effectiveness. To answer RQ1, we implement a multi-stage reranking pipeline that places learning to rank as a filter in the middle stage: the cheap learning-to-rank module forwards only the most promising candidates to the expensive neural rerankers, so the overall latency of transformer-based reranking is reduced without a degradation in effectiveness. Applied to the MS MARCO passage and document ranking tasks, the pipeline achieves up to an 18-fold improvement in efficiency while maintaining the same level of effectiveness. Moreover, our method is orthogonal to techniques that accelerate inference in the neural models themselves, so it can be combined with such work to further reduce computational cost and latency. For RQ2, since transformers score each query-document pair independently, transformer-based scores can be used as learning-to-rank features, allowing learning to rank to take advantage of transformers to increase retrieval effectiveness. On the MS MARCO passage and document ranking tasks, adding the BERT-based feature yields up to a 52% increase in effectiveness over "traditional" learning to rank; combining transformer-based features with traditional features in learning to rank also yields slightly higher effectiveness than the standard retrieve-and-rerank design with transformers. This work explores potential roles of learning to rank in the age of Muppets. In a broader sense, it illustrates that we should stand on the shoulders of giants, building on what we have already learned and discovered, to explore the next unknowns.
dc.identifier.uri: http://hdl.handle.net/10012/18183
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: learning to rank
dc.subject: information retrieval
dc.title: Learning to Rank in the Age of Muppets
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Lin, Jimmy
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
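
The abstract above describes a multi-stage pipeline in which a cheap learning-to-rank model filters first-stage candidates before an expensive transformer reranker, and notes that transformer scores can themselves serve as learning-to-rank features. The following is a minimal illustrative sketch of that idea, not the thesis code: the feature extractor, the scikit-learn-style ltr_model, and the bert_scorer callable are assumed placeholders.

from typing import Callable, List, Sequence, Tuple

def ltr_filter(candidates: List[str],
               features: Sequence[Sequence[float]],
               ltr_model,
               k: int) -> List[str]:
    # Cheap middle stage: score every candidate using hand-crafted features only.
    # Any model exposing a scikit-learn-style .predict() would work here.
    scores = ltr_model.predict(features)
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:k]]

def rerank_pipeline(query: str,
                    first_stage_candidates: List[str],
                    feature_fn: Callable[[str, str], Sequence[float]],
                    ltr_model,
                    bert_scorer: Callable[[str, str], float],
                    k: int = 50) -> List[Tuple[str, float]]:
    # RQ1 idea: first-stage (e.g., BM25) candidates -> learning-to-rank filter ->
    # expensive transformer reranker applied only to the k survivors, which is
    # where the latency savings described in the abstract come from.
    features = [feature_fn(query, doc) for doc in first_stage_candidates]
    survivors = ltr_filter(first_stage_candidates, features, ltr_model, k)
    rescored = [(doc, bert_scorer(query, doc)) for doc in survivors]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# RQ2 idea (sketch): because the transformer scores each query-document pair
# independently, its score can be appended to the traditional feature vector
# before training the learning-to-rank model, e.g.
#   features = list(feature_fn(query, doc)) + [bert_scorer(query, doc)]
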

Files

Original bundle

Name: Hu_Chengcheng.pdf
Size: 714.4 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission