Multiple Continuous Subgraph Query Optimization Using Delta Subgraph Queries

Date

2018-10-23

Authors

Kankanamge, Chathura

Advisor

Salihoglu, Semih

Publisher

University of Waterloo

Abstract

This thesis studies the problem of optimizing and evaluating multiple directed structural subgraph queries, i.e., those without highly selective predicates on the edges or vertices, continuously in a changing graph. Existing techniques focus on queries with highly selective predicates or are designed for evaluating a single query. As such, these techniques do not scale when evaluating multiple structural queries, either because their computations become prohibitively inefficient or because they use prohibitively large auxiliary data structures. We build upon the delta subgraph query (DSQ) framework that was introduced in prior work. This framework decomposes queries into multiple delta queries, which are then evaluated one query vertex at a time, without requiring any auxiliary data structures. We study the problem of picking good query vertex orderings for a set of DSQs cumulatively, in order to share computation across different DSQs and achieve efficient runtimes in practice. We describe a greedy cost-based optimizer that takes as input a set of DSQs and a subgraph extension catalogue, and generates a single low-cost combined plan that cumulatively evaluates all of the DSQs by sharing computation across them. Our combined plans consist of a Scan operator and multiple Extend/Intersect (E/I) operators. The E/I operator takes a set of partial matches and extends them by one query vertex. We adopt as our cost metric intersection cost (i-cost), which we show is a good estimate of the actual work performed during query evaluation. We further describe an optimization to the base optimizer that expands the DSQs algebraically into even more DSQs to allow more computation sharing between them. On small query sets, we demonstrate that our cost-based greedy optimizer is able to find close-to-optimal combined plans in terms of runtime. On larger query sets, we demonstrate that our optimizer and expanded DSQ optimization can yield significant performance improvements over several baselines.
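
To make the Extend/Intersect step more concrete, the following is a minimal Python sketch of how a set of partial matches might be extended by one query vertex through adjacency-list intersections. It is an illustration only and is not taken from the thesis: the function names (`build_adjacency`, `extend_intersect`), the `(position, direction)` descriptor format, and the way i-cost is tallied (here assumed to be the total size of the adjacency lists read for intersection) are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build forward and backward adjacency sets for a directed graph."""
    fwd, bwd = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        fwd[src].add(dst)
        bwd[dst].add(src)
    return fwd, bwd


def extend_intersect(partial_matches, descriptors, fwd, bwd):
    """Extend every partial match by one query vertex (hypothetical E/I sketch).

    descriptors is a list of (position, direction) pairs. "fwd" means the new
    vertex must be an out-neighbour of match[position]; "bwd" means it must be
    an in-neighbour. Returns the extended matches and an assumed i-cost,
    counted as the total size of the adjacency lists that were read.
    """
    results, i_cost = [], 0
    for match in partial_matches:
        lists = []
        for pos, direction in descriptors:
            adj = fwd if direction == "fwd" else bwd
            lists.append(adj.get(match[pos], set()))
        i_cost += sum(len(lst) for lst in lists)
        # The candidates for the new query vertex are the vertices common
        # to every adjacency list named by the descriptors.
        candidates = set.intersection(*lists) if lists else set()
        for vertex in sorted(candidates):
            results.append(match + (vertex,))
    return results, i_cost


# Directed triangle query a->b, b->c, c->a on a toy graph.
edges = [(1, 2), (2, 3), (3, 1), (2, 4)]
fwd, bwd = build_adjacency(edges)

# Scan: every edge is a partial match for (a, b).
scanned = [tuple(e) for e in sorted(edges)]

# Extend to c: c must be an out-neighbour of b and an in-neighbour of a.
triangles, i_cost = extend_intersect(scanned, [(1, "fwd"), (0, "bwd")], fwd, bwd)
```

In this toy run the Scan step produces one partial match per edge, and a single E/I step completes the triangle. A plan shared across several queries would, under the same assumptions, reuse the output of one such step as the input to the extensions of more than one query.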

Keywords

subgraph matching, query optimization, graph database, multi query optimization, incremental view maintenance
