An Unsupervised Approach to Relatedness Analysis of Legal Language
Learning distributed representations of sentences and analyzing semantic similarity between sentences is an essential task in Natural Language Processing. In the legal domain, the future of Artificial Intelligence-driven legal-tech applications is very promising. This thesis presents a detailed investigation of distributed representations of words and sentences and the related machine learning and deep learning techniques. We then propose a novel approach, Word2Sent, for measuring the degree of similarity between sentences. The proposed model is fully unsupervised and can therefore be applied to unlabeled data. The thesis also enhances an existing unsupervised sentence embedding model, the SIF model. Experiments demonstrate that the proposed model works effectively with long legal sentences on several textual similarity tasks.
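As background for the SIF baseline the thesis builds on: SIF computes a sentence vector as a weighted average of word vectors, with each word weighted by a/(a + p(w)) for a smoothing constant a and unigram probability p(w), and then removes the projection onto the first singular vector of the stacked sentence matrix. A minimal sketch with toy word vectors and made-up frequencies (all names and values below are illustrative, not the thesis's actual data or the Word2Sent method):

```python
import numpy as np

# Toy 3-dimensional word vectors and unigram probabilities
# (illustrative values only; a real system would train these on a corpus).
word_vecs = {
    "court":   np.array([0.9, 0.1, 0.3]),
    "rules":   np.array([0.4, 0.8, 0.2]),
    "the":     np.array([0.1, 0.1, 0.1]),
    "judge":   np.array([0.8, 0.2, 0.4]),
    "decides": np.array([0.5, 0.7, 0.1]),
}
word_prob = {"court": 0.01, "rules": 0.02, "the": 0.20,
             "judge": 0.01, "decides": 0.02}

A = 1e-3  # SIF smoothing parameter "a"

def sif_embed(sentences):
    """Weighted average of word vectors with weight a/(a + p(w)),
    then remove each sentence's projection onto the first
    singular vector of the stacked sentence matrix."""
    rows = []
    for sent in sentences:
        words = [w for w in sent.lower().split() if w in word_vecs]
        weights = np.array([A / (A + word_prob[w]) for w in words])
        vecs = np.array([word_vecs[w] for w in words])
        rows.append((weights[:, None] * vecs).mean(axis=0))
    X = np.array(rows)
    u = np.linalg.svd(X, full_matrices=False)[2][0]  # first singular vector
    return X - X @ np.outer(u, u)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

emb = sif_embed(["the court rules", "the judge decides", "the court decides"])
sim = cosine(emb[0], emb[1])
```

The common-component removal step downweights directions shared by all sentences (often dominated by frequent function words), which is why frequent words like "the" contribute little to the final similarity score.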
Cite this version of the work
Ying Wang (2018). An Unsupervised Approach to Relatedness Analysis of Legal Language. UWSpace. http://hdl.handle.net/10012/13847