DPQP: A D-representation-based Pipelined Factorized Query Processor for Graph Database Management Systems

Feng, Xiyang

DPQP: A D-representation-based Pipelined Factorized Query Processor for Graph Database Management Systems

Files

Feng_Xiyang.pdf (1.27 MB)

Date

2021-12-17

Authors

Feng, Xiyang

Advisor

Salihoglu, Semih

Publisher

University of Waterloo

Abstract

Factorized databases utilize factorized data representations during query processing to obtain more compact final query results and faster runtimes for queries with many-to-many joins. We revisit this technique in the context of graph database management systems (GDBMSs) whose common workloads are large joins with many-to-many relationships on graph-structured data. We first review the theory of factorized databases and classic flat intermediate tuple structure in traditional pipelined GDBMSs. We then present our tuple representation which mimics factorized representations and can be easily integrated into existing query processors. We further describe how to cache sub-query results with this factorized tuple structure through a static dependency analysis of the query. We have integrated our factorized query processor into GraphflowDB, an in-memory GDBMS. Compared to the original version of GraphflowDB, whose processor is not fully factorized, query plans in our processor can be orders of magnitude faster and produce orders of magnitude smaller result sizes.