Previous MSc Theses
"How Clean is Open Source Code?" V. Blazevicius. M. Wood. Department of Computer and Information Sciences, University of Strathclyde. 2015.
Abstract:
Context: In the last decade, the rapid shift towards agile techniques has greatly impacted the software development industry. Developers have come under increasing pressure to produce higher-quality source code in less time. Many software experts have proposed principles and practices intended to make source code more understandable and easier to enhance. But are these practices and principles actually followed in real-world projects, and if so, to what extent?
Objective: The goal of this study was to identify a set of lower-level characteristics and principles that make source code more readable and understandable, and to determine to what extent they are followed in selected open source systems.
Method: A custom analysis tool was developed to measure certain code metrics, such as the average length of variable, class and function names, the number of polyadic functions within a system, and the average number of statements per function. A set of 20 open source systems was selected based on size and analyzed using a combination of this custom tool and a third-party tool, SourceMonitor.
Results: Commenting, and especially public API documentation, remains overused by the majority of the systems. Software size appears to affect several other metrics, such as the average number of function parameters and average complexity. Shorter functions with few parameters are preferred across all systems. Short variable names are nearly extinct.
Conclusions: With the exception of commenting, most guidelines and principles have been followed by the selected open source systems to a reasonable degree. Some systems perform better with regard to certain metrics, indicating that developers place different emphasis on particular principles and techniques. Areas identified for future work include: analyzing other aspects of code cleanliness; analyzing the semantics of comments to determine their usefulness; analyzing the semantics of function names to discover whether each function's implementation is appropriate for its name; determining how much each 'Clean Code' characteristic contributes to the overall cleanliness of the code; and studying the incremental growth of large software systems to find out whether, and to what extent, the guidelines and principles are followed throughout the development lifecycle.