Sum-of-norms clustering: theoretical guarantee and post-processing

dc.contributor.author: Jiang, Tao
dc.date.accessioned: 2020-09-11T12:19:14Z
dc.date.available: 2020-09-11T12:19:14Z
dc.date.issued: 2020-09-11
dc.date.submitted: 2020-08-24
dc.description.abstract: Sum-of-norms clustering is a method for assigning n points in d-dimensional real space to K clusters, using convex optimization. Recently, Panahi et al. proved that sum-of-norms clustering is guaranteed to recover a mixture of Gaussians under the restriction that the number of samples is not too large. The first contribution of this thesis is to lift this restriction, i.e., to show that sum-of-norms clustering can recover a mixture of Gaussians even as the number of samples tends to infinity. Our proof relies on an interesting characterization of the clusters computed by sum-of-norms clustering that was developed inside a proof of the agglomeration conjecture by Chiquet et al. Because we believe this theorem has independent interest, we restate and reprove the Chiquet et al. result herein. Multiple algorithms have been proposed to solve the sum-of-norms clustering problem: subgradient descent by Hocking et al., ADMM and AMA by Chi and Lange, a stochastic incremental algorithm by Panahi et al., and a semismooth Newton-CG augmented Lagrangian method by Sun et al. All of these algorithms yield approximate solutions, even though an exact solution is required to determine the correct cluster assignment. The second contribution of this thesis is to close the gap between the output of existing algorithms and the exact solution to the optimization problem. We present a clustering test that identifies and certifies the correct clustering from an approximate solution yielded by any primal-dual algorithm. The test may not succeed if the approximation is inaccurate. However, we show that the correct clustering is guaranteed to be found by a primal-dual path-following algorithm after sufficiently many iterations, provided that the model parameter λ avoids a finite number of bad values. Numerical experiments support our results. [en]
dc.identifier.uri: http://hdl.handle.net/10012/16279
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.relation.uri: mixture of Gaussians [en]
dc.relation.uri: half moons [en]
dc.subject: convex optimization [en]
dc.subject: second-order cone programming [en]
dc.subject: sum-of-norms clustering [en]
dc.subject: mixture of Gaussians [en]
dc.subject: finite termination [en]
dc.title: Sum-of-norms clustering: theoretical guarantee and post-processing [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: Combinatorics and Optimization [en]
uws-etd.degree.discipline: Combinatorics and Optimization [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Vavasis, Stephen
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]
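
Illustration (not part of the record): the abstract notes that solvers return only approximate solutions, while the cluster assignment depends on which optimal centroids coincide exactly. The sketch below assumes the commonly used unit-weight form of the sum-of-norms objective, 0.5 * sum_i ||x_i - a_i||^2 + lambda * sum_{i<j} ||x_i - x_j||, which the record itself does not state. The function names son_objective and naive_clusters are hypothetical. The tolerance-based grouping is a naive heuristic shown only to make the gap concrete; it is not the certified clustering test that the thesis proposes.

```python
import numpy as np


def son_objective(x, a, lam):
    """Unit-weight sum-of-norms objective (assumed form, see lead-in).

    x   : (n, d) array of candidate centroids, one per data point
    a   : (n, d) array of data points
    lam : nonnegative model parameter lambda
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    # Fidelity term: 0.5 * sum_i ||x_i - a_i||^2
    fidelity = 0.5 * np.sum((x - a) ** 2)
    # Fusion term: sum over unordered pairs i < j of ||x_i - x_j||
    i_idx, j_idx = np.triu_indices(x.shape[0], k=1)
    fusion = np.sum(np.linalg.norm(x[i_idx] - x[j_idx], axis=1))
    return fidelity + lam * fusion


def naive_clusters(x, tol):
    """Heuristic grouping: merge points whose centroids lie within tol.

    This is NOT the thesis's clustering test. With an approximate
    solution, the result can depend on the arbitrary choice of tol.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(x[i] - x[j]) <= tol:
                parent[find(i)] = find(j)

    # Relabel roots as consecutive integers in order of first appearance
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]
```

In exact arithmetic, points i and j share a cluster when their optimal centroids are equal. An iterative solver typically returns centroids that are only nearly equal, so any fixed tol can merge or split clusters incorrectly, which is the gap the abstract's second contribution addresses.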

Files

Original bundle

Name: Jiang_Tao.pdf
Size: 477.09 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission