An Unsupervised Approach to Relatedness Analysis of Legal Language

Xie, Liang-Liang
Wang, Ying

2018-09-20
2018-09-07

http://hdl.handle.net/10012/13847

Abstract: Learning distributed representations of sentences and analyzing the semantic similarity between sentences are essential tasks in the field of Natural Language Processing. In the domain of legal language, the future of Artificial Intelligence-related legal-tech applications is very promising. This thesis presents a detailed investigation of distributional representations of words and sentences and of the related machine learning and deep learning techniques. We then propose an approach, Word2Sent, for measuring the degree of similarity between sentences. The proposed model is fully unsupervised, so it can be applied to unlabeled data. This thesis also enhances another unsupervised sentence embedding model, the SIF model. Multiple experiments show that the proposed model works effectively with long legal sentences on several textual similarity tasks.

Language: en

Keywords: Natural Language Processing; Machine Learning; Deep Learning; Legal

Type: Master Thesis