Show simple item record

dc.contributor.authorFarzan, Arashen
dc.date.accessioned2006-08-22 14:25:20 (GMT)
dc.date.available2006-08-22 14:25:20 (GMT)
dc.date.issued2004en
dc.date.submitted2004en
dc.identifier.urihttp://hdl.handle.net/10012/1019
dc.description.abstractWe study three problems related to searching and sorting in multisets in the cache-oblivious model: Finding the most frequent element (the mode), duplicate elimination and finally multi-sorting. We are interested in minimizing the cache complexity (or number of cache misses) of algorithms for these problems in the context under which the cache size and block size are unknown. We start by showing the lower bounds in the comparison model. Then we present the lower bounds in the cache-aware model, which are also the lower bounds in the cache-oblivious model. We consider the input multiset of size <i>N</i> with multiplicities <i>N</i><sub>1</sub>,. . . , <i>N<sub>k</sub></i>. The lower bound for the cache complexity of determining the mode is &Omega;({<i>N</i> over <i>B</i>} log {<i>M</i> over <i>B</i>} {<i>N</i> over <i>fB</i>}) where &fnof; is the frequency of the mode and <i>M</i>, <i>B</i> are the cache size and block size respectively. Cache complexities of duplicate removal and multi-sorting have lower bounds of &Omega;({<i>N</i> over <i>B</i>} log {<i>M</i> over <i>B</i>} {<i>N</i> over <i>B</i>} - £{<i>k</i> over <i>i</i>}=1{<i>N<sub>i</sub></i> over <i>B</i>}log {<i>M</i> over <i>B</i>} {<i>N<sub>i</sub></i> over <i>B</i>}). We present two deterministic approaches to give algorithms: selection and distribution. The algorithms with these deterministic approaches differ from the lower bounds by at most an additive term of {<i>N</i> over <i>B</i>} loglog <i>M</i>. However, since loglog <i>M</i> is very small in real applications, the gap is tiny. Nevertheless, the ideas of our deterministic algorithms can be used to design cache-aware algorithms for these problems. The algorithms turn out to be simpler than the previously-known cache-aware algorithms for these problems. Another approach to design algorithms for these problems is the probabilistic approach. In contrast to the deterministic algorithms, our randomized cache-oblivious algorithms are all optimal and their cache complexities exactly match the lower bounds. All of our algorithms are within a constant factor of optimal in terms of the number of comparisons they perform.en
dc.formatapplication/pdfen
dc.format.extent327044 bytes
dc.format.mimetypeapplication/pdf
dc.language.isoenen
dc.publisherUniversity of Waterlooen
dc.rightsCopyright: 2004, Farzan, Arash. All rights reserved.en
dc.subjectComputer Scienceen
dc.subjectMemory Hierarchiesen
dc.subjectCache-Oblivious modelen
dc.subjectMultisetsen
dc.subjectDetermining the modeen
dc.subjectDuplicate Eliminationen
dc.subjectSortingen
dc.titleCache-Oblivious Searching and Sorting in Multisetsen
dc.typeMaster Thesisen
dc.pendingfalseen
uws-etd.degree.departmentSchool of Computer Scienceen
uws-etd.degreeMaster of Mathematicsen
uws.typeOfResourceTexten
uws.peerReviewStatusUnrevieweden
uws.scholarLevelGraduateen


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record


UWSpace

University of Waterloo Library
200 University Avenue West
Waterloo, Ontario, Canada N2L 3G1
519 888 4883

All items in UWSpace are protected by copyright, with all rights reserved.

DSpace software

Service outages