Optimal Path-Decomposition of Tries
In this thesis, we consider the path-decomposition representation of prefix trees (tries). We show that, given a query probability for every word in the trie, the heavy-path strategy produces the path-decomposed trie that is optimal with respect to the expected number of node accesses per query. We show how to implement the heavy-path strategy in O(N) time for a trie containing n words with total length N. To prove the optimality result, we give a complete characterization of the choices made by an optimal decomposition strategy. Using this characterization, we describe how to efficiently support dynamic operations on the path-decomposed trie while preserving optimality, in O(sigma * |w|) time for an alphabet of size sigma and a word of length |w|. We also give entropy-based bounds on the number of node accesses per query in terms of the query probabilities. Finally, we present theoretical and experimental results comparing the performance of heavy-path with that of max-score, another popular path-decomposition strategy.
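To make the heavy-path idea concrete, the following is a minimal Python sketch, not the thesis's O(N) implementation. It assumes a prefix-free word set, takes the weight of a trie node to be the total query probability of the words below it, and counts one node access for the root path plus one for every step at which a query leaves the current heavy path. The names `build_trie`, `heavy_child`, `node_accesses`, and `expected_accesses` are hypothetical and chosen only for illustration.

```python
def build_trie(word_probs):
    """Build a plain trie; each node stores the total probability of its subtree."""
    root = {"children": {}, "weight": 0.0}
    for word, p in word_probs.items():
        node = root
        node["weight"] += p
        for ch in word:
            node = node["children"].setdefault(ch, {"children": {}, "weight": 0.0})
            node["weight"] += p
    return root


def heavy_child(node):
    """Return the label of the child with the largest subtree weight (None at a leaf)."""
    children = node["children"]
    if not children:
        return None
    # Ties are broken by insertion order; this tie-breaking is an assumption.
    return max(children, key=lambda ch: children[ch]["weight"])


def node_accesses(root, word):
    """Count the paths touched when looking up `word` (assumed to be in the trie)."""
    accesses = 1  # the path that starts at the root
    node = root
    for ch in word:
        if ch != heavy_child(node):
            accesses += 1  # the query leaves the heavy path and enters a new one
        node = node["children"][ch]
    return accesses


def expected_accesses(word_probs):
    """Expected node accesses per query under the given probabilities."""
    root = build_trie(word_probs)
    return sum(p * node_accesses(root, w) for w, p in word_probs.items())


# Illustrative probabilities (made up for this sketch, not data from the thesis).
probs = {"three": 0.4, "trial": 0.3, "triangle": 0.2, "trie": 0.1}
trie = build_trie(probs)
for w in probs:
    print(w, node_accesses(trie, w))
print(expected_accesses(probs))
```

In this toy example the subtree below "tr" outweighs the one below "th", so the root's heavy path spells "trial", and every other word leaves that path exactly once. Note that `heavy_child` rescans the children on every step, which keeps the sketch short but does not match the linear-time construction described in the abstract.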