
A mathematical foundation for the use of cliques in the exploration of data with navigation graphs

Date

2023-01-19

Authors

Shuldiner, Pavel

Publisher

University of Waterloo

Abstract

Navigation graphs were introduced by Hurley and Oldford (2011) as a graph-theoretic framework for exploring datasets, particularly those with many variables. They allow the user to visualize one small subset of the variables and then proceed, via a smooth transition, to another subset that shares a few of the original variables. These graphs serve both as a high-level overview of the dataset and as a tool for first-hand exploration of regions deemed interesting. This work examines the nature of cliques in navigation graphs, in terms of both type and magnitude, and speculates as to what their significance to the underlying dataset might be. The questions answered by this body of work were motivated by the belief that the presence of cliques in navigation graphs is a potential indicator of an interesting, possibly unanticipated, relationship among some of the variables. In this thesis we provide a detailed examination of cliques in navigation graphs in terms of type, size, and number. The study of clique types informs us of the potential significance of highly connected structures to the underlying data and guides our approach to examining the possible clique sizes and counts. The prevalence of large clique sizes and counts, in turn, is suggestive of an interesting, possibly unexpected, relationship between the variates in the data. To address the challenges surrounding the nature of cliques in navigation graphs, we develop a framework for deriving closed-form expressions for the moments of count random variables in terms of their underlying indecomposable summands. We use this framework, in conjunction with a connection between intersecting set families, to obtain edge counts within a clique cover and thus closed-form expressions for the moments of clique counts in random graphs.

Keywords

statistics, probability, combinatorics, random graphs, network theory, exploratory data analysis
