DistNeo4j: Scaling Graph Databases through Dynamic Distributed Partitioning

dc.contributor.author: Nicoara, Daniel
dc.date.accessioned: 2014-06-16T12:56:15Z
dc.date.available: 2014-10-15T05:30:07Z
dc.date.issued: 2014-06-16
dc.date.submitted: 2014
dc.description.abstract: Social networks are large graphs that require multiple servers to store and manage them. Providing performant, scalable systems that store these graphs by partitioning them into subgraphs is an important problem. In such systems, each partition is hosted by a server, and the partitioning must satisfy multiple objectives: balancing server loads, reducing remote traversals (the number of edges cut), and adapting to changes in the structure of the graph under changing workloads. To meet these objectives, a dynamic repartitioning algorithm is required that modifies an existing partitioning to maintain good-quality partitions. Such a repartitioner should not impose a significant overhead on the system. This thesis introduces a greedy repartitioner, which dynamically modifies a partitioning using a small amount of resources. In contrast to existing repartitioning algorithms, the greedy repartitioner is performant (in terms of time and memory), making it suitable for implementation and use in a real system. The greedy repartitioner is integrated into DistNeo4j, which is designed as an extension of the open source Neo4j graph database system, to support workloads over partitioned graph data distributed over multiple servers. Using real-world data sets, this thesis shows that DistNeo4j leverages the greedy repartitioner to maintain high-quality partitions and provides a 2 to 3 times performance improvement over the de facto hash-based partitioning. (en)
dc.description.embargoterms: 4 months (en)
dc.identifier.uri: http://hdl.handle.net/10012/8525
dc.language.iso: en (en)
dc.pending: false
dc.publisher: University of Waterloo (en)
dc.subject: Graph databases (en)
dc.subject: Distributed systems (en)
dc.subject: Re-partitioning (en)
dc.subject.program: Computer Science (en)
dc.title: DistNeo4j: Scaling Graph Databases through Dynamic Distributed Partitioning (en)
dc.type: Master Thesis (en)
uws-etd.degree: Master of Mathematics (en)
uws-etd.degree.department: School of Computer Science (en)
uws.peerReviewStatus: Unreviewed (en)
uws.scholarLevel: Graduate (en)
uws.typeOfResource: Text (en)
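
The objectives named in the abstract (balanced server loads and few cut edges) can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions, not the greedy repartitioner from the thesis: the function names (`hash_partition`, `edge_cut`, `loads`, `greedy_pass`), the `max_load` cap, and the toy graph are all hypothetical. It places vertices with a modulo hash, measures the edge cut, and then runs one simple greedy pass that moves a vertex to the partition holding most of its neighbours when the load cap allows it.

```python
from collections import Counter


def hash_partition(vertices, k):
    # Hash-style placement; for integer ids this is simply id modulo k.
    return {v: v % k for v in vertices}


def edge_cut(edges, part):
    # Number of edges whose endpoints live on different servers.
    return sum(1 for u, v in edges if part[u] != part[v])


def loads(part, k):
    # Number of vertices hosted by each of the k servers.
    counts = Counter(part.values())
    return [counts.get(p, 0) for p in range(k)]


def greedy_pass(vertices, edges, part, k, max_load):
    # One illustrative greedy pass (an assumption, not the thesis algorithm):
    # move a vertex to the partition with strictly more of its neighbours,
    # provided the target partition is still below max_load.
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    part = dict(part)  # work on a copy
    load = loads(part, k)
    for v in vertices:
        here = part[v]
        neigh = Counter(part[n] for n in adj[v])
        # Prefer the partition with most neighbours; ties favour staying put.
        best = max(range(k), key=lambda p: (neigh.get(p, 0), p == here))
        if (best != here
                and neigh.get(best, 0) > neigh.get(here, 0)
                and load[best] < max_load):
            part[v] = best
            load[here] -= 1
            load[best] += 1
    return part


# Toy graph: two triangles joined by a single bridge edge (2, 3).
vertices = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

hashed = hash_partition(vertices, 2)
improved = greedy_pass(vertices, edges, hashed, 2, max_load=4)

print(edge_cut(edges, hashed), loads(hashed, 2))
print(edge_cut(edges, improved), loads(improved, 2))
```

On this toy graph the modulo placement cuts 5 of the 7 edges, while the single greedy pass brings the cut down to the one bridge edge and keeps three vertices on each server. The `max_load` cap stands in for the load-balancing objective; a real system would also have to account for migration cost and workload changes, which this sketch ignores.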

Files

Original bundle

Name: Nicoara_Daniel.pdf
Size: 683.22 KB
Format: Adobe Portable Document Format
