Exploring Community Structure of Software Call Graph and its
Applications in Class Cohesion Measurement
Yu Qu, Xiaohong Guan, Qinghua Zheng, Ting Liu, Lidan Wang,
Yuqiao Hou, Zijiang Yang
Reference: JSS 9528
To appear in: The Journal of Systems & Software
Received date: 20 November 2014
Revised date: 16 April 2015
Accepted date: 7 June 2015
Please cite this article as: Yu Qu, Xiaohong Guan, Qinghua Zheng, Ting Liu, Lidan Wang, Yuqiao Hou,
Zijiang Yang, Exploring Community Structure of Software Call Graph and its Applications in Class
Cohesion Measurement, The Journal of Systems & Software (2015), doi: 10.1016/j.jss.2015.06.015
Highlights
• We show that software static Call Graphs exhibit significant community structures.
• We propose two new class cohesion metrics based on community structures.
• The new metrics provide new and useful measurement of software class cohesion.
• The new metrics perform better than existing metrics in software fault prediction.
Yu Qu^a, Xiaohong Guan^{a,∗}, Qinghua Zheng^a, Ting Liu^a, Lidan Wang^a, Yuqiao Hou^a, Zijiang Yang^b
^a Ministry of Education Key Lab for Intelligent Network and Network Security, Xi’an Jiaotong University, Xi’an, Shaanxi, China 710049
^b Department of Computer Science, Western Michigan University, Kalamazoo, MI, USA 48167
Many complex networked systems exhibit natural divisions of network nodes. Each division, or community, is a densely connected subgroup. Such community structure not only helps comprehension but also finds wide applications in complex systems. Software networks, e.g., Class Dependency Networks, are such networks with community structures. However, their characteristics at the granularity of function or method calls, which are useful for evaluating and improving software intra-class structure, have not been investigated.
Moreover, previously proposed applications of software community structure have not been directly compared or combined with existing software engineering practices. Comparison with baseline practices is needed to convince practitioners to adopt the proposed approaches. In this paper, we show that networks formed by software methods and their calls exhibit relatively significant community structures. Based on our findings, we propose two new class cohesion metrics to measure the cohesiveness of object-oriented programs. Our experiments on 10 large open-source Java programs validate the existence of community structures, and the derived metrics give additional and useful measurement of class cohesion. As an application, we show that the new metrics are able to predict software faults more effectively than existing metrics.
Keywords: Software cohesiveness measurement, Class cohesion metrics, Complex network analysis, Community structure
1. Introduction
Many natural and man-made complex networked systems, including metabolic networks, computer networks and social networks, exhibit divisions or clusters of network nodes (Flake et al., 2000; Girvan and Newman, 2002; Palla et al., 2005; Fortunato, 2010; Mucha et al., 2010). Each division, or community (Girvan and Newman, 2002), is a densely connected and highly correlated subgroup. Such community structure not only helps comprehension but also finds wide applications in complex systems. For example, researchers in Biology and Bioinformatics have applied community detection algorithms to identify functional groups of proteins in Protein-Protein Interaction networks (Dunn et al., 2005; Jonsson et al., 2006).
For online auction sites such as ebay.com, community structure is used to improve the effectiveness of the recommendation systems (Jin et al., 2007; Reichardt and Bornholdt, 2007). A survey on the applications of community detection algorithms can be found in Fortunato (2010).
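To make the notion of a "densely connected subgroup" concrete, the following minimal sketch computes the standard Newman-Girvan modularity Q of a given partition, the quantity that modularity-based community detection algorithms try to maximize. The toy graph, the node names, and the function name `modularity` are illustrative assumptions, and the graph is treated as undirected and unweighted for simplicity.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition of an undirected,
    unweighted graph given as a list of (u, v) edges.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    degree_sum = defaultdict(int)  # d_c: summed degree of nodes in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in "abcdef"}

q_two = modularity(edges, two_groups)  # each triangle is its own community
q_one = modularity(edges, one_group)   # everything lumped together
```

Splitting the graph along the bridge gives a clearly positive Q, while putting all nodes in one community gives Q = 0, which is the sense in which a good partition groups nodes that are more densely connected to each other than expected by chance.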
There are also research efforts to investigate community structures in software, itself a very complex system (Pan et al., 2011; Šubelj and Bajec, 2011, 2012; Concas et al., 2013; Šubelj et al., 2014). Most of them reported a significant community structure in a certain type of software network, such as Class Dependency Networks (Šubelj and Bajec, 2011). Some pioneering applications of software community structure have been proposed (for more details, please refer to Section 2). However, some problems remain unsolved.
Figure 1: Community structure of jEdit with 5,979 nodes and 34 communities, detected by the Louvain algorithm (Blondel et al., 2008).
∗Corresponding author.
Firstly, most of the measurements are performed on the network of classes. Few results are reported at the granularity of software method or function calls, i.e., method/function Call Graphs (Graham et al., 1982). Such investigation is necessary from both theoretical and practical perspectives. In addition, measurements of the network of classes cannot be applied to intra-class structure, which limits their applications in software quality evaluation and improvement.
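The loss of intra-class information at class granularity can be illustrated with a small sketch. The class and method names below are hypothetical and are not taken from any of the studied programs, and the `Class.method` string encoding is an assumption made only for this example.

```python
def owner_class(method):
    """Return the class part of a 'Class.method' identifier."""
    return method.split(".")[0]

def class_level_edges(method_calls):
    """Collapse method-level calls into class-level dependencies.
    Calls between two methods of the same class produce no edge."""
    deps = set()
    for caller, callee in method_calls:
        src, dst = owner_class(caller), owner_class(callee)
        if src != dst:
            deps.add((src, dst))
    return deps

def intra_class_calls(method_calls, cls):
    """Keep only the calls whose caller and callee both belong to cls."""
    return [(a, b) for a, b in method_calls
            if owner_class(a) == cls and owner_class(b) == cls]

# Hypothetical method-level call graph: (caller, callee) pairs.
calls = [
    ("Document.insert", "Document.checkBounds"),
    ("Document.remove", "Document.checkBounds"),
    ("Document.save", "Document.encode"),
    ("Window.paint", "Document.getText"),
]

class_deps = class_level_edges(calls)                 # class granularity
document_internal = intra_class_calls(calls, "Document")  # method granularity
```

At class granularity only the single dependency from `Window` to `Document` survives, whereas the method-level graph still shows that `insert` and `remove` share `checkBounds` while `save` and `encode` form a separate group. This internal structure is the information that cohesion measurement relies on.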
Preprint submitted to Journal of Systems and Software June 13, 2015