Measuring the Impact of Code Dependencies on Software Architecture Recovery Techniques

dc.contributor.author: Lutellier, Thibaud
dc.date.accessioned: 2015-08-21T14:39:49Z
dc.date.available: 2015-08-21T14:39:49Z
dc.date.issued: 2015-08-21
dc.date.submitted: 2015-08-20
dc.description.abstract: Many techniques have been proposed to automatically recover software architectures from software implementations. A thorough comparison among the recovery techniques is needed to understand their effectiveness and applicability. This study improves on previous studies in two ways. First, we study the impact of leveraging more accurate symbol dependencies on the accuracy of architecture recovery techniques. In addition, we evaluate other factors of the input dependencies, such as the level of granularity, the impact of virtual call resolution, global variable usage, and whether using direct dependencies provides better results than using transitive dependencies. Previous studies have not extensively studied how the quality of the input might affect the quality of the output for architecture recovery techniques. Second, we study a system (Chromium) that is substantially larger (10 million lines of code) than those included in previous studies. Obtaining the ground-truth architecture of Chromium involved two years of collaboration with its developers. As part of this work, we developed a new submodule-based technique to recover preliminary versions of ground-truth architectures. The other systems that we study have been examined previously. In some cases, we have updated the ground-truth architectures to newer versions, and in other cases we have corrected newly discovered inconsistencies. Our evaluation of nine variants of six state-of-the-art architecture recovery techniques on eight types of dependencies shows that symbol dependencies generally produce architectures with higher accuracy than include dependencies. We also observed that using a higher level of granularity (i.e., module level) and direct dependencies helps generate better architectures. Despite this improvement, the overall accuracy is low for all recovery techniques.
The results suggest that (1) in addition to architecture recovery techniques, the type of dependencies used as their inputs is another factor to consider for high recovery accuracy, and (2) more accurate recovery techniques are needed. Our results show that some of the studied architecture recovery techniques (ACDC, Bunch-SAHC, WCA and ARC) scale to the 10M lines-of-code range (the size of Chromium), whereas others do not. (en)
dc.identifier.uri: http://hdl.handle.net/10012/9554
dc.language.iso: en (en)
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: Software architecture (en)
dc.subject: Empirical software engineering (en)
dc.subject: Maintenance and evolution (en)
dc.subject.program: Electrical and Computer Engineering (en)
dc.title: Measuring the Impact of Code Dependencies on Software Architecture Recovery Techniques (en)
dc.type: Master Thesis (en)
uws-etd.degree: Master of Applied Science (en)
uws-etd.degree.department: Electrical and Computer Engineering (en)
uws.peerReviewStatus: Unreviewed (en)
uws.scholarLevel: Graduate (en)
uws.typeOfResource: Text (en)

Files

Original bundle
Name: Lutellier_Thibaud.pdf
Size: 441.52 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 6.17 KB
Format: Item-specific license agreed upon to submission