Cooperative Based Software Clustering on Dependency Graphs
Ibrahim, Ahmed Fakhri
MetadataShow full item record
The organization of software systems into subsystems is usually based on the constructs of packages or modules and has a major impact on the maintainability of the software. However, during software evolution, the organization of the system is subject to continual modification, which can cause it to drift away from the original design, often with the effect of reducing its quality. A number of techniques for evaluating a system's maintainability and for controlling the effort required to conduct maintenance activities involve software clustering. Software clustering refers to the partitioning of software system components into clusters in order to obtain both exterior and interior connectivity between these components. It helps maintainers enhance the quality of software modularization and improve its maintainability. Research in this area has produced numerous algorithms with a variety of methodologies and parameters. This thesis presents a novel ensemble approach that synthesizes a new solution from the outcomes of multiple constituent clustering algorithms. The main principle behind this approach derived from machine learning, as applied to document clustering, but it has been modified, both conceptually and empirically, for use in software clustering. The conceptual modifications include working with a variable number of clusters produced by the input algorithms and employing graph structures rather than feature vectors. The empirical modifications include experiments directed at the selection of the optimal cluster merging criteria. Case studies based on open source software systems show that establishing cooperation between leading state-of-the-art algorithms produces better clustering results compared with those achieved using only one of any of the algorithms considered.