| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Grant, Oliver David Lester | |
| dc.date.accessioned | 2016-05-16 14:24:42 (GMT) | |
| dc.date.available | 2016-05-16 14:24:42 (GMT) | |
| dc.date.issued | 2016-05-16 | |
| dc.date.submitted | 2016-05 | |
| dc.identifier.uri | http://hdl.handle.net/10012/10479 | |
| dc.description.abstract | We examine optimal and near-optimal solutions to the classic binary search tree problem of Knuth. We are given a set of $n$ keys (originally known as words), $B_1, B_2, \ldots, B_n$, and $2n+1$ frequencies. $\{p_1, p_2, \ldots, p_n\}$ represent the probabilities of searching for each given key, and $\{q_0, q_1, \ldots, q_n\}$ represent the probabilities of searching in the gaps between and outside of these keys. We have that $\sum_{i=0}^{n} q_i + \sum_{i=1}^{n} p_i = 1$. We also assume without loss of generality that $q_{i-1} + p_i + q_i \neq 0$ for any $i \in \{1, \ldots, n\}$. The keys must make up the internal nodes of the tree while the gaps make up the leaves. Our goal is to construct a binary search tree such that the expected cost of a search is minimized. First, we re-examine an approximate solution of Guttler, Mehlhorn and Schneider which was shown to have a worst-case bound of $c \cdot H + 2$, where $c \geq 1/H(1/3, 2/3) \approx 1.08$ and $H = \sum_{i=1}^{n} p_i \log_2(1/p_i) + \sum_{j=0}^{n} q_j \log_2(1/q_j)$ is the entropy of the distribution. We give an improved worst-case bound on the heuristic of $H + 4$. Next, we examine the optimum binary search tree problem under a model of external memory. We use the Hierarchical Memory Model of Aggarwal et al. The model has an unlimited number of registers, $R_1, R_2, \ldots$, each with its own location in memory (a positive integer). We have a set of memory sizes $m_1, m_2, \ldots, m_l$ which are monotonically increasing. Each memory level has a finite size except $m_l$, which we assume has infinite size. Each memory level has an associated cost of access $c_1, c_2, \ldots, c_l$. We assume that $c_1 < c_2 < \cdots < c_l$. We propose two approximate solutions which run in $O(n)$ time, where $n$ is the number of words in our data set. Using these methods, we improve upon a bound given in Thite's 2001 thesis under the related HMM_2 model in the approximate setting. We also examine the related problem of binary trees on multisets of probabilities, where keys are unordered and we do not differentiate between which probabilities must be leaves and which must be internal nodes. We provide a simple $O(n \log_2(n))$ algorithm that is within an additive $(n+1)(2n)$ of optimal on a multiset of $n$ keys. | en |
| dc.language.iso | en | en |
| dc.publisher | University of Waterloo | en |
| dc.subject | binary search trees | en |
| dc.subject | entropy | en |
| dc.subject | optimum binary search trees | en |
| dc.subject | external memory | en |
| dc.subject | optimal binary search trees | en |
| dc.subject | approximate binary search trees | en |
| dc.title | Approximately Optimum Search Trees in External Memory Models | en |
| dc.type | Master Thesis | en |
| dc.pending | false | |
| uws-etd.degree.department | David R. Cheriton School of Computer Science | en |
| uws-etd.degree.discipline | Computer Science | en |
| uws-etd.degree.grantor | University of Waterloo | en |
| uws-etd.degree | Master of Mathematics | en |
| uws.contributor.advisor | Munro, J. Ian | |
| uws.contributor.affiliation1 | Faculty of Mathematics | en |
| uws.published.city | Waterloo | en |
| uws.published.country | Canada | en |
| uws.published.province | Ontario | en |
| uws.typeOfResource | Text | en |
| uws.peerReviewStatus | Unreviewed | en |
| uws.scholarLevel | Graduate | en |

### This item appears in the following Collection(s)

UWSpace