End-to-end Neural Information Retrieval

dc.contributor.author: Yang, Wei
dc.date.accessioned: 2019-04-30T19:02:22Z
dc.date.available: 2019-04-30T19:02:22Z
dc.date.issued: 2019-04-30
dc.date.submitted: 2019-04-05
dc.description.abstract: In recent years we have witnessed many successes of neural networks in the information retrieval community when large amounts of labeled data are available. Yet it remains unknown whether the same techniques can be easily adapted to searching social media posts, where the text is much shorter. In addition, we find that most neural information retrieval models are compared against weak baselines. In this thesis, we build an end-to-end neural information retrieval system using two toolkits: Anserini and MatchZoo. We also propose a novel neural model, named MP-HCNN, to capture the relevance of short and varied tweet text. With the information retrieval toolkit Anserini, we build a reranking architecture based on various traditional information retrieval models (QL, QL+RM3, BM25, BM25+RM3), including a strong pseudo-relevance feedback baseline: RM3. With the neural network toolkit MatchZoo, we offer an empirical study of a number of popular neural network ranking models (DSSM, CDSSM, KNRM, DUET, DRMM). Experiments on datasets from the TREC Microblog Tracks and the TREC Robust Retrieval Track show that most existing neural network models cannot beat a simple language model baseline. However, DRMM provides a significant improvement over the pseudo-relevance feedback baseline (BM25+RM3) on the Robust04 dataset, and DUET, DRMM, and MP-HCNN provide significant improvements over the baseline (QL+RM3) on the microblog datasets. Further detailed analyses suggest that searching social media and searching news articles exhibit several different characteristics that require customized model design, shedding light on future directions. [en]
dc.identifier.uri: http://hdl.handle.net/10012/14597
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: information retrieval, neural network, text matching [en]
dc.title: End-to-end Neural Information Retrieval [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Lin, Jimmy
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]

Files

Original bundle

Name: Yang_Wei.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.08 KB
Format: Item-specific license agreed upon to submission