Sum-of-norms clustering: theoretical guarantee and post-processing

Jiang, Tao

dc.contributor.advisor	Vavasis, Stephen
dc.contributor.author	Jiang, Tao
dc.date.accessioned	2020-09-11T12:19:14Z
dc.date.available	2020-09-11T12:19:14Z
dc.date.issued	2020-09-11
dc.date.submitted	2020-08-24
dc.description.abstract	Sum-of-norms clustering is a method for assigning n points in d-dimensional real space to K clusters using convex optimization. Recently, Panahi et al. proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The first contribution of this thesis is to lift this restriction, i.e., to show that sum-of-norms clustering can recover a mixture of Gaussians even as the number of samples tends to infinity. Our proof relies on an interesting characterization of the clusters computed by sum-of-norms clustering, which was developed inside a proof of the agglomeration conjecture by Chiquet et al. Because we believe this theorem is of independent interest, we restate and reprove the Chiquet et al. result herein. Multiple algorithms have been proposed to solve the sum-of-norms clustering problem: subgradient descent by Hocking et al., ADMM and AMA by Chi and Lange, a stochastic incremental algorithm by Panahi et al., and a semismooth Newton-CG augmented Lagrangian method by Sun et al. All of these algorithms yield approximate solutions, even though an exact solution is required to determine the correct cluster assignment. The second contribution of this thesis is to close the gap between the output of existing algorithms and the exact solution to the optimization problem. We present a clustering test that identifies and certifies the correct clustering from an approximate solution yielded by any primal-dual algorithm. The test may fail if the approximate solution is not accurate enough. However, we show that the correct clustering is guaranteed to be found by a primal-dual path-following algorithm after sufficiently many iterations, provided that the model parameter λ avoids a finite number of bad values. Numerical experiments support our results.	en
dc.identifier.uri	http://hdl.handle.net/10012/16279
dc.language.iso	en	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.relation.uri	mixture of Gaussians	en
dc.relation.uri	half moons	en
dc.subject	convex optimization	en
dc.subject	second-order cone programming	en
dc.subject	sum-of-norms clustering	en
dc.subject	mixture of Gaussians	en
dc.subject	finite termination	en
dc.title	Sum-of-norms clustering: theoretical guarantee and post-processing	en
dc.type	Master Thesis	en
uws-etd.degree	Master of Mathematics	en
uws-etd.degree.department	Combinatorics and Optimization	en
uws-etd.degree.discipline	Combinatorics and Optimization	en
uws-etd.degree.grantor	University of Waterloo	en
uws.contributor.advisor	Vavasis, Stephen
uws.contributor.affiliation1	Faculty of Mathematics	en
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en
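
The abstract above does not spell out the optimization problem. As a rough illustration only, the sketch below evaluates the commonly used unweighted sum-of-norms objective, 0.5 * sum_i ||x_i - a_i||^2 + λ * sum_{i<j} ||x_i - x_j||, where a_i are the data points and x_i the centroid assigned to point i. This form is an assumption and is not taken from this record. The function names `son_objective` and `naive_clusters` and the tolerance `tol` are hypothetical, and `naive_clusters` is a simple thresholding heuristic, not the certified clustering test described in the thesis.

```python
import numpy as np


def son_objective(x, a, lam):
    """Evaluate the (assumed) unweighted sum-of-norms clustering objective.

    x, a : arrays of shape (n, d) holding centroids and data points.
    lam  : the model parameter lambda (nonnegative).
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    # Fidelity term: half the squared distance between centroids and data.
    fidelity = 0.5 * np.sum((x - a) ** 2)
    # Fusion term: Euclidean norm of every pairwise centroid difference,
    # counted once per unordered pair (strict upper triangle).
    diffs = x[:, None, :] - x[None, :, :]
    pair_norms = np.linalg.norm(diffs, axis=2)
    fusion = np.sum(np.triu(pair_norms, k=1))
    return fidelity + lam * fusion


def naive_clusters(x, tol=1e-6):
    """Heuristic only: label points whose centroids agree within tol.

    This is NOT the certified test from the thesis; it merely shows why an
    approximate solution makes the cluster assignment ambiguous.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    labels = -np.ones(n, dtype=int)
    next_label = 0
    for i in range(n):
        if labels[i] >= 0:
            continue
        labels[i] = next_label
        for j in range(i + 1, n):
            if labels[j] < 0 and np.linalg.norm(x[i] - x[j]) <= tol:
                labels[j] = next_label
        next_label += 1
    return labels
```

With an approximate solver, centroids that should coincide exactly differ by small numerical errors, so the outcome of such a threshold depends on the choice of `tol`; this is the gap that the thesis's certified test is meant to address.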

Files

Original bundle

Name: Jiang_Tao.pdf
Size: 477.09 KB
Format: Adobe Portable Document Format

Collections

Theses
Combinatorics and Optimization