Sentiment Lexicon Induction and Interpretable Multiple-instance Learning in Financial Markets

Fu, Chengyao

dc.contributor.advisor	Huang, Alan
dc.contributor.advisor	Li, Yuying
dc.contributor.author	Fu, Chengyao
dc.date.accessioned	2020-09-28T13:58:20Z
dc.date.available	2020-09-28T13:58:20Z
dc.date.issued	2020-09-28
dc.date.submitted	2020-09-24
dc.description.abstract	Sentiment analysis is widely used in finance. The two most common textual sentiment analysis methods in finance are the dictionary-based approach and the machine learning approach. The dictionary-based approach is the most convenient and efficient way to extract sentiment from text, but a dictionary contains a limited set of words and cannot capture the full scope of a particular domain. Additionally, manually creating and maintaining a domain-specific dictionary from expert opinions is expensive and unsustainable. Deep learning models have become mainstream in sentiment analysis because they perform better, drawing on additional information from larger corpora and on more complex model structures. However, deep learning models often suffer from poor interpretability. This thesis attempts to address the issues of both methods. It proposes a machine learning method for corpus-based sentiment lexicon induction, which extends a sentiment dictionary customized for analyzing corporate conference calls. The extended dictionary is shown to perform better than the original dictionary in terms of the three-day returns of companies in the MSCI universe. The thesis also proposes a highly interpretable attention-based multiple-instance learning model for sentiment classification, and shows that this model achieves accuracy comparable to state-of-the-art sequential models while offering better interpretability. The model also generates a keyword ranking as a by-product. Finally, a new sentiment dictionary is generated by the deep learning method, and it performs even better than both the extended dictionary and the original dictionary.	en
dc.identifier.uri	http://hdl.handle.net/10012/16382
dc.language.iso	en	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.subject	sentiment analysis	en
dc.subject	natural language processing	en
dc.subject	finance	en
dc.subject	sentiment dictionary	en
dc.subject	sentiment lexicon induction	en
dc.subject	multiple-instance learning	en
dc.subject	stocks	en
dc.title	Sentiment Lexicon Induction and Interpretable Multiple-instance Learning in Financial Markets	en
dc.type	Master Thesis	en
uws-etd.degree	Master of Mathematics	en
uws-etd.degree.department	David R. Cheriton School of Computer Science	en
uws-etd.degree.discipline	Computer Science	en
uws-etd.degree.grantor	University of Waterloo	en
uws.contributor.advisor	Huang, Alan
uws.contributor.advisor	Li, Yuying
uws.contributor.affiliation1	Faculty of Mathematics	en
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en

Files

Original bundle

Name: Fu_Chengyao.pdf
Size: 7.48 MB
Format: Adobe Portable Document Format

Collections

Theses
Computer Science