Efficient Evaluation of Set Expressions

dc.contributor.author: Mirzazadeh, Mehdi
dc.date.accessioned: 2014-04-17T15:49:27Z
dc.date.available: 2014-04-17T15:49:27Z
dc.date.issued: 2014-04-17
dc.date.submitted: 2014
dc.description.abstract: In this thesis, we study the problem of evaluating set expressions over sorted sets in the comparison model. The problem arises in the context of evaluating search queries in text database systems; most text search engines maintain an inverted list, which records, for each possible word, the set of documents that contain it. Thus, answering a query is reduced to computing the union, the intersection, or a more complex set expression over the sets of documents containing the words in the query. As a first step, given an expression over a number of sets together with the sizes of those sets, we investigate the worst-case complexity of evaluating the expression in terms of the set sizes. We prove lower bounds and provide algorithms whose running times match them up to a constant factor. We then refine the problem further and design an algorithm that computes such expressions according to the degree to which the input sets are interleaved, rather than considering only set sizes. We prove the optimality of this algorithm by presenting a matching lower bound that is sensitive to the interleaving measure. The algorithms we present differ in the set operators they allow in input expressions. We provide algorithms that are worst-case optimal for inputs with union, intersection, and symmetric difference operators. One of these algorithms also supports minus and complement operators and is conjectured to be optimal when an input is allowed to contain these operators as well. We also provide a worst-case optimal algorithm for the form of the problem in which the input may contain "threshold" operators, which generalize the union and intersection operators: for a number t, a t-threshold operator selects the elements that appear in at least t of the operand sets. Finally, the adaptive algorithm we provide supports union and intersection operators. [en]
dc.identifier.uri: http://hdl.handle.net/10012/8327
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.subject: Set Expression [en]
dc.subject: Algorithms [en]
dc.subject: Comparison Based Algorithms [en]
dc.subject: Lower Bound [en]
dc.subject: Data Structures [en]
dc.subject.program: Computer Science [en]
dc.title: Efficient Evaluation of Set Expressions [en]
dc.type: Doctoral Thesis [en]
uws-etd.degree: Doctor of Philosophy [en]
uws-etd.degree.department: School of Computer Science [en]
uws.peerReviewStatus: Unreviewed [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]
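
To make the t-threshold operator from the abstract concrete, here is a minimal Python sketch of its semantics over sorted lists. It is only a simple merge-based baseline, not the worst-case optimal or adaptive algorithm developed in the thesis. The function name `threshold` and the sample lists are hypothetical, and each input list is assumed to be sorted and free of duplicates.

```python
import heapq
from typing import List, Sequence


def threshold(sets: Sequence[Sequence[int]], t: int) -> List[int]:
    """Return the elements that occur in at least t of the sorted input lists.

    Assumption: every input list is sorted ascending and has no duplicates,
    so the number of times a value appears in the merged stream equals the
    number of operand sets that contain it.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    result: List[int] = []
    current = None  # value currently being counted
    count = 0       # how many operand sets contained `current` so far
    # heapq.merge lazily merges already-sorted inputs into one sorted stream.
    for x in heapq.merge(*sets):
        if x == current:
            count += 1
        else:
            current, count = x, 1
        # Emit the value exactly once, at the moment it reaches the threshold.
        if count == t:
            result.append(x)
    return result


# Hypothetical posting lists (document identifiers) for three query words.
a = [1, 3, 5, 7]
b = [3, 4, 5, 8]
c = [5, 7, 8, 9]

union = threshold([a, b, c], 1)          # t = 1 behaves like union
at_least_two = threshold([a, b, c], 2)   # documents containing two or more words
intersection = threshold([a, b, c], 3)   # t = number of sets behaves like intersection
```

With t = 1 the operator reduces to union, and with t equal to the number of operand sets it reduces to intersection, which is the sense in which threshold operators generalize both. This baseline inspects every element of every list; the thesis is concerned with bounds and algorithms that do better than that in the comparison model.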

Files

Original bundle

Name: Mirzazadeh_Mehdi.pdf
Size: 948.1 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.89 KB
Format: Item-specific license agreed upon to submission