Learning to Rank in the Age of Muppets

dc.contributor.author: Hu, Chengcheng
dc.date.accessioned: 2022-04-26T22:30:48Z
dc.date.available: 2022-04-26T22:30:48Z
dc.date.issued: 2022-04-26
dc.date.submitted: 2022-04-19
dc.description.abstract: The emergence of BERT in 2018 brought large gains in retrieval effectiveness across many tasks and domains, and steered the recent research landscape of IR toward transformer-based technologies. While researchers are fascinated by the power of BERT and related transformer models, the substantial computational cost of transformers has become an unavoidable problem. Meanwhile, in the shadow of BERT, "out-of-date" but still fairly effective techniques have been largely forgotten; learning to rank, for example, was among the most popular technologies a decade ago. In this work, we aim to answer two research questions. RQ1 asks whether using learning to rank as a filtering stage in a multi-stage reranking pipeline can improve the efficiency of transformer-based reranking without sacrificing effectiveness. RQ2 asks whether adding transformer-based features to the traditional learning-to-rank framework can increase effectiveness. To answer RQ1, we implement a multi-stage reranking pipeline that places learning to rank as a filter in the middle stage: the cheap learning-to-rank module forwards only the most promising candidates to the expensive neural rerankers, so the overall latency of transformer-based reranking is reduced without a degradation in effectiveness. Applied to the MS MARCO passage and document ranking tasks, the pipeline achieves up to an 18-fold improvement in efficiency while maintaining the same level of effectiveness. Moreover, our method is orthogonal to techniques that accelerate inference in the neural models themselves, so it can be combined with such work to further reduce computational cost and latency. For RQ2, since transformers score each query-document pair independently, transformer-based scores can be used as learning-to-rank features, allowing learning to rank to take advantage of transformers to increase retrieval effectiveness. On the MS MARCO passage and document ranking tasks, adding the BERT-based feature yields up to a 52% increase in effectiveness over "traditional" learning to rank; combining transformer-based features with traditional features in learning to rank also yields slightly higher effectiveness than the standard retrieve-and-rerank design with transformers. This work explores potential roles of learning to rank in the age of Muppets. In a broader sense, it illustrates that we should stand on the shoulders of giants, building on what we have already learned and discovered, to explore the next unknowns.
dc.identifier.uri: http://hdl.handle.net/10012/18183
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: learning to rank
dc.subject: information retrieval
dc.title: Learning to Rank in the Age of Muppets
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 0
uws.contributor.advisor: Lin, Jimmy
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text
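
The abstract above describes a multi-stage pipeline in which a cheap learning-to-rank model filters first-stage candidates before an expensive transformer reranker, and notes that transformer scores can themselves serve as learning-to-rank features. The following is a minimal illustrative sketch of that idea, not the thesis code: the feature extractor, the scikit-learn-style ltr_model, and the bert_scorer callable are assumed placeholders.

from typing import Callable, List, Sequence, Tuple

def ltr_filter(candidates: List[str],
               features: Sequence[Sequence[float]],
               ltr_model,
               k: int) -> List[str]:
    # Cheap middle stage: score every candidate using hand-crafted features only.
    # Any model exposing a scikit-learn-style .predict() would work here.
    scores = ltr_model.predict(features)
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked[:k]]

def rerank_pipeline(query: str,
                    first_stage_candidates: List[str],
                    feature_fn: Callable[[str, str], Sequence[float]],
                    ltr_model,
                    bert_scorer: Callable[[str, str], float],
                    k: int = 50) -> List[Tuple[str, float]]:
    # RQ1 idea: first-stage (e.g., BM25) candidates -> learning-to-rank filter ->
    # expensive transformer reranker applied only to the k survivors, which is
    # where the latency savings described in the abstract come from.
    features = [feature_fn(query, doc) for doc in first_stage_candidates]
    survivors = ltr_filter(first_stage_candidates, features, ltr_model, k)
    rescored = [(doc, bert_scorer(query, doc)) for doc in survivors]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# RQ2 idea (sketch): because the transformer scores each query-document pair
# independently, its score can be appended to the traditional feature vector
# before training the learning-to-rank model, e.g.
#   features = list(feature_fn(query, doc)) + [bert_scorer(query, doc)]
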

Files

Original bundle

Name: Hu_Chengcheng.pdf
Size: 714.4 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission