Modularity is calculated by the network clustering algorithms available in NodeXL.
NodeXL implements Givan-Newman, Clauset-Newman-Moore, and Wakita and Tsurumi.
Modularity is a percentage of edges that stay within their groups.
The bibliographic refernces:
Finding community structure in mega-scale social networks:
Ken Wakita Tokyo Institute of Technology, Tokyo, Japan
Toshiyuki Tsurumi Tokyo Institute of Technology, Tokyo, Japan
Community analysis algorithm proposed by Clauset, Newman, and Moore (CNM algorithm) finds community structure in social networks. Unfortunately, CNM algorithm does not scale well and its use is practically limited to networks whose sizes are up to 500,000 nodes.
We show that this inefficiency is caused from merging communities in unbalanced manner and that a simple heuristics that attempts to merge community structures in a balanced manner can dramatically improve community structure analysis. The proposed techniques
are tested using data sets obtained from existing social networking service that hosts 5.5 million users. We have tested three three variations of the heuristics. The fastest method processes a SNS friendship network with 1 million users in 5 minutes (70 times
faster than CNM) and another friendship network with 4 million users in 35 minutes, respectively. Another one processes a network with 500,000 nodes in 50 minutes (7 times faster than CNM), finds community structures that has improved modularity, and scales
to a network with 5.5 million.
Finding community structure in very large networks
Phys. Rev. E 70, 066111 – Published 6 December 2004
Aaron Clauset, M. E. J. Newman, and Cristopher Moore
The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present
a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(mdlogn) where d is the depth of the dendrogram describing the community
structure. Many real-world networks are sparse and hierarchical, with m∼n and d∼logn, in which case our algorithm runs in essentially linear time, O(nlog2n). As an example of the application of this algorithm we use it to analyze a network of items for sale
on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400 000 vertices and 2×106 edges. We show that our algorithm can extract meaningful communities from
this network, revealing large-scale patterns present in the purchasing habits of customers.
Finding and evaluating community structure in networks
Phys. Rev. E 69, 026113 – Published 26 February 2004
M. E. J. Newman and M. Girvan
We propose and study a set of algorithms for discovering community structure in networks—natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from
the network to split it into communities, the edges removed being identified using any one of a number of possible “betweenness” measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength
of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure
in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.