Aug 11, 2014 at 3:44 PM
What is the range of values that NodeXL considers when it calculates Modularity? Which bibliographic reference is used?
Sep 2, 2014 at 5:37 PM
Modularity is calculated by the network clustering algorithms available in NodeXL.

NodeXL implements Girvan-Newman, Clauset-Newman-Moore, and Wakita-Tsurumi.

Modularity is a percentage of edges that stay within their groups.
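
More precisely, the papers below define modularity as the fraction of edges that fall within groups minus the fraction expected by chance in a randomized network. Here is a minimal illustrative sketch of that Newman-Girvan formula, Q = sum_i (e_ii - a_i^2), in Python (an illustration of the published definition only, not NodeXL's actual implementation, which lives in a C++ library):

    # Illustrative only: Newman-Girvan modularity, Q = sum_i (e_ii - a_i^2),
    # where e_ii is the fraction of edges inside group i and a_i is the
    # fraction of edge endpoints that fall in group i.
    def modularity(edges, group):
        m = len(edges)                     # total number of edges
        e = {}                             # e[i]: fraction of edges inside group i
        a = {}                             # a[i]: fraction of edge ends in group i
        for u, v in edges:
            gu, gv = group[u], group[v]
            a[gu] = a.get(gu, 0.0) + 1.0 / (2 * m)
            a[gv] = a.get(gv, 0.0) + 1.0 / (2 * m)
            if gu == gv:
                e[gu] = e.get(gu, 0.0) + 1.0 / m
        return sum(e.get(i, 0.0) - a[i] ** 2 for i in a)

    # Toy example: two triangles joined by one edge, grouped naturally.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    group = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
    print(modularity(edges, group))        # about 0.357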

The bibliographic references:

Finding community structure in mega-scale social networks
Ken Wakita and Toshiyuki Tsurumi, Tokyo Institute of Technology, Tokyo, Japan

The community analysis algorithm proposed by Clauset, Newman, and Moore (the CNM algorithm) finds community structure in social networks. Unfortunately, the CNM algorithm does not scale well and its use is practically limited to networks whose sizes are up to 500,000 nodes. We show that this inefficiency is caused by merging communities in an unbalanced manner and that a simple heuristic that attempts to merge community structures in a balanced manner can dramatically improve community structure analysis. The proposed techniques are tested using data sets obtained from an existing social networking service that hosts 5.5 million users. We have tested three variations of the heuristic. The fastest method processes an SNS friendship network with 1 million users in 5 minutes (70 times faster than CNM) and another friendship network with 4 million users in 35 minutes. Another one processes a network with 500,000 nodes in 50 minutes (7 times faster than CNM), finds community structures that have improved modularity, and scales to a network with 5.5 million users.

Finding community structure in very large networks
Phys. Rev. E 70, 066111 – Published 6 December 2004
Aaron Clauset, M. E. J. Newman, and Cristopher Moore

The discovery and analysis of community structure in networks is a topic of considerable recent interest within the physics community, but most methods proposed so far are unsuitable for very large networks because of their computational cost. Here we present a hierarchical agglomeration algorithm for detecting community structure which is faster than many competing algorithms: its running time on a network with n vertices and m edges is O(md log n), where d is the depth of the dendrogram describing the community structure. Many real-world networks are sparse and hierarchical, with m ~ n and d ~ log n, in which case our algorithm runs in essentially linear time, O(n log² n). As an example of the application of this algorithm we use it to analyze a network of items for sale on the web site of a large on-line retailer, items in the network being linked if they are frequently purchased by the same buyer. The network has more than 400,000 vertices and 2×10⁶ edges. We show that our algorithm can extract meaningful communities from this network, revealing large-scale patterns present in the purchasing habits of customers.

Finding and evaluating community structure in networks
Phys. Rev. E 69, 026113 – Published 26 February 2004
M. E. J. Newman and M. Girvan

We propose and study a set of algorithms for discovering community structure in networks—natural divisions of network nodes into densely connected subgroups. Our algorithms all share two definitive features: first, they involve iterative removal of edges from the network to split it into communities, the edges removed being identified using any one of a number of possible “betweenness” measures, and second, these measures are, crucially, recalculated after each removal. We also propose a measure for the strength of the community structure found by our algorithms, which gives us an objective metric for choosing the number of communities into which a network should be divided. We demonstrate that our algorithms are highly effective at discovering community structure in both computer-generated and real-world network data, and show how they can be used to shed light on the sometimes dauntingly complex structure of networked systems.
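
To make the agglomerative approach of the CNM and Wakita-Tsurumi papers concrete, here is a toy greedy merge loop in the same spirit, reusing the modularity() sketch and toy edges above. It is illustrative only: the real algorithms maintain incremental delta-Q data structures rather than recomputing Q from scratch, and this is not NodeXL's code.

    # Illustrative greedy agglomeration in the spirit of CNM: start with one
    # community per node and repeatedly apply the merge that increases Q most.
    def greedy_communities(edges):
        group = {u: u for e in edges for u in e}   # each node in its own group
        best_q = modularity(edges, group)
        while True:
            best_trial = None
            for u, v in edges:                     # only adjacent groups can merge
                if group[u] == group[v]:
                    continue
                gu, gv = group[u], group[v]
                trial = {n: (gu if g == gv else g) for n, g in group.items()}
                q = modularity(edges, trial)
                if q > best_q:
                    best_q, best_trial = q, trial
            if best_trial is None:
                return group, best_q               # no merge improves Q: stop
            group = best_trial

    # On the two-triangle toy graph above, this recovers the two triangles.
    print(greedy_communities(edges))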
Jan 22, 2015 at 5:44 PM
Hello, MarcSmith.
Thank you for your answer, but I still have doubts.

In your response, you say that modularity is "calculated by the network clustering algorithms available in NodeXL", but you also say that "modularity is a percentage of edges that stay within their groups".
Modularity, according to the algorithms that NodeXL implements (and I have read the original articles), is an indicator of the quality of the division of the network and of its significance, and for each of the algorithms the reference values are different.

So I ask again: how does NodeXL calculate modularity?
The Overall Metrics sheet shows a value for modularity; what does it mean? Is it the percentage of edges that stay within their groups, or is it (according to the definition in the algorithms' papers) the quality of the division of the network into clusters?
Jan 28, 2015 at 9:25 PM

NodeXL calculates graph cluster Modularity by passing the graph to a calculation library called SNAP.

The SNAP codebase is available for inspection at http://snap.stanford.edu/.
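
If you want to reproduce a comparable modularity score outside of NodeXL, SNAP also ships a Python wrapper, Snap.py, whose community-detection functions return the modularity of the partition they find. A minimal sketch, assuming Snap.py is installed (NodeXL itself calls the C++ library directly):

    # Illustrative: CNM communities and their modularity via Snap.py,
    # the Python wrapper around the same SNAP C++ library.
    import snap

    G = snap.GenRndGnm(snap.PUNGraph, 100, 500)   # random undirected stand-in graph
    CmtyV = snap.TCnComV()                        # receives the detected communities
    Q = snap.CommunityCNM(G, CmtyV)               # Clauset-Newman-Moore; returns modularity
    print("communities:", CmtyV.Len(), "modularity:", Q)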
Marked as answer by MarcSmith on 1/29/2015 at 9:37 AM