Approximation Algorithms for Clustering and Facility Location Problems
Facility location problems arise in a wide range of applications, such as plant or warehouse location, cache placement, and network design, and have been widely studied in the Computer Science and Operations Research literature. These problems typically involve an underlying set F of facilities that provide service and an underlying set D of clients that require service; the clients need to be assigned to facilities in a cost-effective fashion. This abstraction is quite versatile and also captures clustering problems, where one typically seeks to partition a set of data points into k clusters, for some given k, in a suitable way; clustering problems themselves find applications in data mining, machine learning, and bioinformatics. Basic variants of facility location problems are now relatively well understood, but we have much less understanding of more sophisticated models that better capture real-world concerns. In this thesis, we focus on three models inspired by real-world optimization scenarios.
In Chapter 2, we consider the mobile facility location (MFL) problem, wherein we seek to relocate a given set of facilities to destinations closer to the clients so as to minimize the sum of the facility-movement and client-assignment costs. This abstracts facility-location settings where one has the flexibility of moving facilities from their current locations to other destinations so as to serve clients more efficiently by reducing their assignment costs. We give the first local-search based approximation algorithm for this problem and achieve the best-known approximation guarantee. Our main result is a (3+epsilon)-approximation for this problem, for any constant epsilon > 0, using local search; this improves on the previous best guarantee, an 8-approximation based on LP-rounding. Our results extend to the weighted generalization wherein each facility i has a non-negative weight w_i and the movement cost for i is w_i times the distance traveled by i.
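As a minimal illustration of the MFL objective described above, the following Python sketch evaluates the cost of one candidate relocation. The function name `mfl_cost`, the restriction to points on a line with distance |x - y|, and the sample data in the usage note are illustrative assumptions that do not come from the thesis. The sketch only scores a given solution; it is not the local-search algorithm.

```python
from typing import Optional, Sequence


def mfl_cost(
    initial: Sequence[float],
    destinations: Sequence[float],
    clients: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return movement cost plus assignment cost of one candidate solution.

    Assumption (illustrative only): all points lie on a line, so the
    distance between x and y is |x - y|.
    """
    if len(initial) != len(destinations):
        raise ValueError("each facility needs exactly one destination")
    if weights is None:
        # Unweighted case: every facility has weight 1.
        weights = [1.0] * len(initial)
    if len(weights) != len(initial):
        raise ValueError("each facility needs exactly one weight")

    # Facility i pays w_i times the distance it travels.
    movement = sum(
        w * abs(start - dest)
        for w, start, dest in zip(weights, initial, destinations)
    )
    # Each client is served by its nearest facility destination.
    assignment = sum(
        min(abs(c - dest) for dest in destinations) for c in clients
    )
    return movement + assignment
```

For example, with facilities initially at 0 and 10 and clients at 1, 2, 3, 9, and 12, leaving the facilities in place costs 9, while moving them to 2 and 9 costs 3 for movement plus 5 for assignment, for a total of 8.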
In Chapter 3, we consider a facility-location problem that we call the minimum-load k-facility location (MLkFL) problem, which abstracts settings where the cost of serving the clients assigned to a facility is incurred by the facility. This problem was studied under the name of min-max star cover in [32,10], which (among other results) gave bicriteria approximation algorithms for MLkFL when F=D. MLkFL is rather poorly understood, and only an O(k)-approximation is currently known for MLkFL, even for line metrics. Our main result is the first polynomial-time approximation scheme (PTAS) for MLkFL on line metrics; no non-trivial true approximation of any kind was previously known for this metric. Complementing this, we prove that MLkFL is strongly NP-hard on line metrics.
In Chapter 4, we consider clustering problems with non-uniform lower bounds and outliers, and obtain the first approximation guarantees for these problems. We consider objective functions involving the radii of open facilities, where the radius of a facility i is the maximum distance between i and a client assigned to it. We consider two problems: minimizing the sum of the radii of the open facilities, which yields the lower-bounded min-sum-of-radii with outliers (LBkSRO) problem, and minimizing the maximum radius, which yields the lower-bounded k-supplier with outliers (LBkSupO) problem. We obtain an approximation factor of 12.365 for LBkSRO, which improves to 3.83 for the non-outlier version. These also constitute the first approximation bounds for the min-sum-of-radii objective when we consider lower bounds and outliers separately. We obtain approximation factors of 5 and 3, respectively, for LBkSupO and its non-outlier version. These are the first approximation results for k-supplier with non-uniform lower bounds.
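The objectives named above can be made concrete with a small Python sketch that scores a given assignment of clients to facilities. It rests on assumptions that the abstract does not spell out: the load of a facility is taken to be the total distance to its assigned clients (with MLkFL minimizing the maximum load), an outlier is a client left unassigned, at most `max_outliers` outliers are allowed, and facility i must serve at least `lower_bounds[i]` clients if it is open. The names `evaluate` and `is_feasible` are hypothetical, and nothing here reproduces the approximation algorithms of the thesis.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, Hashable, Optional

Assignment = Dict[Hashable, Optional[Hashable]]  # client -> facility, or None for an outlier


def evaluate(
    assignment: Assignment,
    dist: Callable[[Hashable, Hashable], float],
) -> Dict[str, float]:
    """Compute the max-load, sum-of-radii, and max-radius objectives."""
    load = defaultdict(float)    # facility -> total distance to its clients
    radius = defaultdict(float)  # facility -> distance to its farthest client
    for client, facility in assignment.items():
        if facility is None:
            continue  # outliers contribute to no objective
        d = dist(facility, client)
        load[facility] += d
        radius[facility] = max(radius[facility], d)
    return {
        "max_load": max(load.values(), default=0.0),      # assumed MLkFL objective
        "sum_radii": sum(radius.values()),                # LBkSRO objective
        "max_radius": max(radius.values(), default=0.0),  # LBkSupO objective
    }


def is_feasible(
    assignment: Assignment,
    lower_bounds: Dict[Hashable, int],
    k: int,
    max_outliers: int,
) -> bool:
    """Check the assumed constraints: at most k open facilities, at most
    max_outliers unassigned clients, and every facility that serves
    clients meets its lower bound."""
    counts = Counter(f for f in assignment.values() if f is not None)
    outliers = sum(1 for f in assignment.values() if f is None)
    return (
        len(counts) <= k
        and outliers <= max_outliers
        and all(counts[f] >= lower_bounds.get(f, 0) for f in counts)
    )
```

On a line with facilities at 0 and 10, assigning clients 1 and 2 to facility 0, clients 9 and 14 to facility 10, and leaving client 100 as an outlier gives a maximum load of 5, a sum of radii of 6, and a maximum radius of 4.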
Cite this version of the work
Sara Ahmadian (2017). Approximation Algorithms for Clustering and Facility Location Problems. UWSpace. http://hdl.handle.net/10012/11640