dc.contributor.author: Clancy, Ryan
dc.date.accessioned: 2020-04-06 17:56:36 (GMT)
dc.date.available: 2020-04-06 17:56:36 (GMT)
dc.date.issued: 2020-04-06
dc.date.submitted: 2020-03-31
dc.identifier.uri: http://hdl.handle.net/10012/15739
dc.description.abstract: In recent years, the amount of data being generated for consumption by enterprises has increased exponentially. Enterprises typically work with structured data, but oftentimes the data being generated is semi-structured or unstructured in nature. In particular, there exists a wealth of unstructured text data (customer reviews, social media posts, news articles, etc.) containing information that could provide value to an organization. As data from different sources often reside in silos, a number of questions arise: How do we integrate the structured and unstructured data? How can we curate and refine the data? Can we do this at scale? In this thesis, I present dstlr -- a platform for scalable knowledge graph construction from text collections. I show how assertions extracted from a collection of unstructured text documents can be used to form a knowledge graph, enabling integration of structured and unstructured data. Further, I show that linking to an existing knowledge graph enables rule-based data curation using the additional external information. I demonstrate this on a large collection of news articles, highlighting the horizontal scale-out of the system. (en)
dc.language.iso: en (en)
dc.publisher: University of Waterloo (en)
dc.subject: knowledge graph (en)
dc.title: dstlr: Scalable Knowledge Graph Construction from Text Collections (en)
dc.type: Master Thesis (en)
dc.pending: false
uws-etd.degree.department: David R. Cheriton School of Computer Science (en)
uws-etd.degree.discipline: Computer Science (en)
uws-etd.degree.grantor: University of Waterloo (en)
uws-etd.degree: Master of Mathematics (en)
uws.contributor.advisor: Lin, Jimmy
uws.contributor.affiliation1: Faculty of Mathematics (en)
uws.published.city: Waterloo (en)
uws.published.country: Canada (en)
uws.published.province: Ontario (en)
uws.typeOfResource: Text (en)
uws.peerReviewStatus: Unreviewed (en)
uws.scholarLevel: Graduate (en)


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.
