An Unsupervised Approach to Relatedness Analysis of Legal Language
Learning distributed representations of sentences and analyzing semantic similarity between sentences is an essential task in Natural Language Processing. In the legal domain, the future of Artificial Intelligence-driven legal-tech applications is very promising. This thesis presents a detailed investigation of distributed representations of words and sentences and the related machine learning and deep learning techniques. We then propose a novel approach, Word2Sent, for measuring the degree of similarity between sentences. The proposed model is fully unsupervised and can therefore be applied to unlabeled data. The thesis also enhances an existing unsupervised sentence embedding model, the SIF model. Experiments demonstrate that the proposed model works effectively with long legal sentences on several textual similarity tasks.
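As background for the SIF baseline the thesis builds on: SIF computes a sentence vector as a weighted average of word vectors, with each word weighted by a/(a + p(w)) for a smoothing constant a and unigram probability p(w), and then removes the projection onto the first singular vector of the stacked sentence matrix. A minimal sketch with toy word vectors and made-up frequencies (all names and values below are illustrative, not the thesis's actual data or the Word2Sent method):

```python
import numpy as np

# Toy 3-dimensional word vectors and unigram probabilities
# (illustrative values only; a real system would train these on a corpus).
word_vecs = {
    "court":   np.array([0.9, 0.1, 0.3]),
    "rules":   np.array([0.4, 0.8, 0.2]),
    "the":     np.array([0.1, 0.1, 0.1]),
    "judge":   np.array([0.8, 0.2, 0.4]),
    "decides": np.array([0.5, 0.7, 0.1]),
}
word_prob = {"court": 0.01, "rules": 0.02, "the": 0.20,
             "judge": 0.01, "decides": 0.02}

A = 1e-3  # SIF smoothing parameter "a"

def sif_embed(sentences):
    """Weighted average of word vectors with weight a/(a + p(w)),
    then remove each sentence's projection onto the first
    singular vector of the stacked sentence matrix."""
    rows = []
    for sent in sentences:
        words = [w for w in sent.lower().split() if w in word_vecs]
        weights = np.array([A / (A + word_prob[w]) for w in words])
        vecs = np.array([word_vecs[w] for w in words])
        rows.append((weights[:, None] * vecs).mean(axis=0))
    X = np.array(rows)
    u = np.linalg.svd(X, full_matrices=False)[2][0]  # first singular vector
    return X - X @ np.outer(u, u)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

emb = sif_embed(["the court rules", "the judge decides", "the court decides"])
sim = cosine(emb[0], emb[1])
```

The common-component removal step downweights directions shared by all sentences (often dominated by frequent function words), which is why frequent words like "the" contribute little to the final similarity score.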
Cite this version of the work
Ying Wang (2018). An Unsupervised Approach to Relatedness Analysis of Legal Language. UWSpace. http://hdl.handle.net/10012/13847