Multiple Cooperative Swarms for Data Clustering
Data clustering, the task of exploring a set of unlabeled data to extract groups of similar patterns, is an appealing problem in machine learning. In other words, data clustering organizes the underlying data into groups using a notion of similarity between patterns. A new approach to the data clustering problem based on multiple cooperative swarms is introduced. The approach is inspired by the social swarming behavior of bird flocks searching for food located in several places. It consists of two main phases: initialization and exploitation. In the initialization phase, the search space is distributed among several swarms, so that each swarm is assigned a part of it. In the exploitation phase, each swarm searches for the center of its associated cluster while cooperating with the other swarms, and the search proceeds until it converges to a near-optimal solution. Compared with the single swarm clustering approach, the proposed multiple cooperative swarms provide better cluster centers in terms of the fitness function measure as the dimensionality of the data and the number of clusters increase. The multiple cooperative swarms clustering approach assumes that the number of clusters is known a priori. The notion of stability analysis is therefore proposed to extract the number of clusters in the underlying data using multiple cooperative swarms. Mathematical explanations of why the proposed approach leads to more stable and robust results than single swarm clustering are also provided. The proposed multiple cooperative swarms clustering is applied to one of the most challenging problems in speech recognition: phoneme recognition. The approach is used to decompose the recognition task into a number of subtasks, or modules, each involving a set of similar phonemes known as a phoneme family. The goal is to obtain the best solution for the phoneme families using the proposed clustering. Experiments on the standard TIMIT corpus indicate that the proposed clustering approach considerably boosts the accuracy of the modular approach to phoneme recognition.
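The two-phase idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify the fitness function, the initialization rule, or the particle update, so the sketch assumes a quantization-error fitness, seeds each swarm around a randomly chosen data point, and uses the standard particle swarm velocity update with commonly used coefficients. All function and parameter names are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantization_error(points, centers):
    """Assumed fitness: mean squared distance to the nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points) / len(points)


def cooperative_swarm_clustering(points, k, n_particles=10, n_iters=60, seed=0):
    """One swarm per cluster; each swarm searches for one center."""
    rng = random.Random(seed)
    dim = len(points[0])

    # Initialization phase (assumed rule): place each swarm around a
    # different data point so the swarms start in different regions.
    seeds = rng.sample(points, k)
    best_centers = [list(s) for s in seeds]
    swarms = []
    for s in seeds:
        swarm = []
        for _ in range(n_particles):
            pos = [x + rng.gauss(0, 0.5) for x in s]
            swarm.append({"pos": pos, "vel": [0.0] * dim,
                          "best_pos": list(pos), "best_fit": float("inf")})
        swarms.append(swarm)

    # Exploitation phase: standard PSO coefficients (an assumption).
    w, c1, c2 = 0.72, 1.49, 1.49
    for _ in range(n_iters):
        for j, swarm in enumerate(swarms):

            def fitness(candidate):
                # Cooperation: a candidate center for cluster j is scored
                # together with the current best centers of the other swarms.
                context = best_centers[:j] + [candidate] + best_centers[j + 1:]
                return quantization_error(points, context)

            global_fit = fitness(best_centers[j])
            for p in swarm:
                fit = fitness(p["pos"])
                if fit < p["best_fit"]:
                    p["best_fit"], p["best_pos"] = fit, list(p["pos"])
                if fit < global_fit:
                    global_fit, best_centers[j] = fit, list(p["pos"])

            for p in swarm:
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    p["vel"][d] = (w * p["vel"][d]
                                   + c1 * r1 * (p["best_pos"][d] - p["pos"][d])
                                   + c2 * r2 * (best_centers[j][d] - p["pos"][d]))
                    p["pos"][d] += p["vel"][d]
    return best_centers


if __name__ == "__main__":
    gen = random.Random(1)
    data = ([[gen.gauss(0, 0.3), gen.gauss(0, 0.3)] for _ in range(30)]
            + [[gen.gauss(5, 0.3), gen.gauss(5, 0.3)] for _ in range(30)])
    centers = cooperative_swarm_clustering(data, k=2)
    print(centers, quantization_error(data, centers))
```

In this sketch a swarm's best center is replaced only when the joint fitness strictly improves, so the error of the combined solution never increases over the run. A swarm can still settle in a poor region if the initial seeds are badly placed, which is one reason the initialization phase matters.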