
Hello,
I'm trying to analyse the netscience data set on nodexl. When I run the Wakita Tsurumi algorithm on the same data I'm getting different number of communities (282 and 410). Can anybody share some light on this?
Also does this algorithm support overlapping nodes in different communities?



I'm not sure I know what you mean. Is the following an accurate description of what you're doing?
 Open the workbook containing the netscience data.
 Calculate clusters using WakitaTsurumi and note the number of groups.
 Close the workbook without saving it.
 Open the workbook again.
 Calculate clusters again using WakitaTsurumi and note the number of groups.
...and you're saying that the number of groups changes?
Regarding your second question, each vertex ends up in one and only one group.
 Tony



I had analysed the netscience data a week ago calculating clusters using Wakita Tsurumi algorithm. It had given 282 clusters. I had saved it and when I calculate it again in the same workbook now it still gives me 282. But when I was showing the procedure
to my friend on a new workbook importing the netscience data afresh it is giving me 410 communities. I'm relatively new to NodeXL so am I missing something very basic here?
Thanks a lot for your response.



That's interesting to see that much of a change with the number of groups. When using a local community detection technique like WakitaTsurumi, or FastNewman, small fluctuations in results can happen, due to different starting points the network, however,
that's not a small fluctuation. When using a global technique, like GirvanNewman, you should get the same results every time. I'm not sure what is causing your discrepancy, and I've never experienced that, but that definitely isn't normal.



Ok thank you for your insight.
I had an another question. Is there a way to find out the number of boundary edges in each community thus detected?



from where i can read about wakita and tsurumi algorithm ..
i have no idea about algorithm ..


Sep 2, 2014 at 6:11 PM
Edited Sep 2, 2014 at 6:11 PM

Here is the paper:
http://dl.acm.org/citation.cfm?id=1242805
Finding community structure in megascale social networks:
Community analysis algorithm proposed by Clauset, Newman, and Moore (CNM algorithm) finds community structure in social networks. Unfortunately, CNM algorithm does not scale well and its use is practically limited to networks whose sizes are up to 500,000 nodes.
We show that this inefficiency is caused from merging communities in unbalanced manner and that a simple heuristics that attempts to merge community structures in a balanced manner can dramatically improve community structure analysis. The proposed techniques
are tested using data sets obtained from existing social networking service that hosts 5.5 million users. We have tested three three variations of the heuristics. The fastest method processes a SNS friendship network with 1 million users in 5 minutes (70 times
faster than CNM) and another friendship network with 4 million users in 35 minutes, respectively. Another one processes a network with 500,000 nodes in 50 minutes (7 times faster than CNM), finds community structures that has improved modularity, and scales
to a network with 5.5 million.
Authors:
Ken Wakita Tokyo Institute of Technology, Tokyo, Japan
Toshiyuki Tsurumi Tokyo Institute of Technology, Tokyo, Japan


Sep 2, 2014 at 6:12 PM
Edited Sep 2, 2014 at 6:12 PM

NetworkAnalysis asks the Q: Is there a way to find out the number of boundary edges in each community thus detected?
Yes: See the Group Vertices worksheet where there is a row describing the connections between every pair of connected groups.
Regards,
Marc

