Girvan-Newman algorithm never returns any clusters

May 1, 2010 at 7:03 PM
Edited May 2, 2010 at 5:36 AM

I am trying to cluster a graph with about 8K nodes and about 160K edges. Two of the other algorithms run very fast, but Girvan-Newman runs indefinitely and never returns any clusters.

Is this a problem with the algorithm itself, or with the implementation?

Cheers,

May 2, 2010 at 7:07 PM

It's just a slow algorithm: it has to recompute edge betweenness for the entire graph after every edge removal, so the running time grows roughly as O(m²n) for a graph with m edges and n vertices.  With 160K edges, it could take hours, if it finishes at all.  I recommend one of the other two algorithms for larger graphs.
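
If you want to see the cost first-hand, here is a minimal sketch, independent of NodeXL's C# code, that runs the algorithm on a tiny graph using the networkx library (it assumes a networkx version that ships girvan_newman in networkx.algorithms.community):

# Girvan-Newman recomputes edge betweenness for the whole graph after
# every edge removal, which is where all the time goes.
import networkx as nx
from networkx.algorithms.community import girvan_newman

G = nx.karate_club_graph()      # 34 nodes, 78 edges -- small on purpose
steps = girvan_newman(G)        # lazy generator of successive splits
two_communities = next(steps)   # first division into two communities
print([sorted(c) for c in two_communities])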

-- Tony

May 2, 2010 at 7:28 PM

For the next release, I've changed the menu text from "Use Girvan-Newman (Slower)" to "Use Girvan-Newman (Slower, for Small Graphs Only)."

-- Tony

May 3, 2010 at 2:19 AM

Thanks for your reply.

Any idea when the source code for the new version will come out?

Cheers,

May 3, 2010 at 7:34 AM

Hello Tony,

Actually, I am working on my thesis and want to compare the CNM and Wakita algorithms. For that, I need to look at the source code and compare the modularity each algorithm achieves for different community sizes.

Is there any way I can get the updated source code?

Cheers,

-Bassam

May 3, 2010 at 3:59 PM
Edited May 3, 2010 at 4:25 PM

Hi Tony,

Sorry to be a pain. I just went through the code of the Wakita algorithm. In their paper, the authors use consolidation ratios, together with modularity, to keep the merging of communities balanced.

In the NodeXL implementation of this algorithm, I cannot find any such thing. Am I missing something?

Thanks,

May 3, 2010 at 5:53 PM

Bassam:

The next version of NodeXL, to be released around May 12, will include source code.  However, it won't include the SNAP source code used for the CNM algorithm, only an executable built from the SNAP source code.  You can get the SNAP source code today from http://snap.stanford.edu/index.html.  The SNAP source code for the CNM algorithm is in the CommunityGirvanNewman() function in cmty.cpp.
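
Since your goal is to compare modularity values, note that modularity itself is easy to compute independently of either implementation. Here is a minimal sketch in Python (my own illustration, not code from NodeXL or SNAP) that scores any partition of an edge list:

# Modularity of a partition, from the standard definition:
#   Q = sum over communities c of (e_c / m - (d_c / (2m))^2)
# where e_c is the number of edges inside c, d_c is the total degree
# of c's vertices, and m is the number of edges in the graph.
def modularity(edges, community_of):
    m = len(edges)
    internal = {}  # e_c: edges with both endpoints in community c
    degree = {}    # d_c: summed degree of the vertices in c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / float(m) - (degree[c] / (2.0 * m)) ** 2
               for c in degree)

# Example: two triangles joined by a bridge edge score well when each
# triangle is its own community.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}))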

Wakita and Tsurumi, in "Finding Community Structure in Mega-scale Social Networks," introduced a data structure to dramatically improve computational efficiency, and a set of heuristics involving "consolidation ratios" for further improvement.  NodeXL implements the data structure but not the heuristics, so maybe NodeXL's version should be called "Wakita-Tsurumi Partial."  I've added a work item to investigate adding one of the heuristics.  However, NodeXL will likely use a hard-coded heuristic (HE, HE', or NE, to use the paper's terminology), and won't allow you to change it.  So if you are doing research to compare modularities, you might want to modify the NodeXL source code yourself and experiment with the different heuristics.  The paper explains that modularity performance varies with the choice of heuristic.
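
To make the heuristic concrete, here is my paraphrase of the paper's idea in Python; this is not NodeXL code, and the HE, HE', and NE variants differ in how they measure community "size," which I gloss over here:

# The consolidation ratio is close to 1 when two communities are
# similar in size and close to 0 when one dwarfs the other.  Sizes
# are assumed positive.
def consolidation_ratio(size_a, size_b):
    return min(size_a / float(size_b), size_b / float(size_a))

def pick_merge(candidate_pairs):
    # candidate_pairs: iterable of (delta_q, size_a, size_b, pair_id).
    # Plain CNM picks the pair with the largest delta_q; the heuristic
    # instead weights delta_q by the consolidation ratio, discouraging
    # one giant community from absorbing everything.
    return max(candidate_pairs,
               key=lambda p: p[0] * consolidation_ratio(p[1], p[2]))

pairs = [(0.04, 900, 10, "A-B"), (0.03, 50, 60, "C-D")]
print(pick_merge(pairs))  # C-D wins: balanced sizes outweigh raw delta_q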

-- Tony

Dec 11, 2013 at 11:52 PM

Hi, could you help me please? I am a PhD student working with the Girvan-Newman algorithm. It works for small graphs, but when I tried it on a big graph (around 80K nodes), Python returned this message:
Traceback (most recent call last):
File "C:\Python27\community-master\cmty.py", line 73, in <module>
A = nx.adj_matrix(G)
File "C:\Python27.64\lib\site-packages\networkx\linalg\graphmatrix.py", line 146, in adjacency_matrix
return nx.to_numpy_matrix(G,nodelist=nodelist,weight=weight)
File "C:\Python27.64\lib\site-packages\networkx\convert.py", line 520, in to_numpy_matrix
M = np.zeros((nlen,nlen), dtype=dtype, order=order)
MemoryError

Please help me!
Dec 12, 2013 at 12:08 AM
Edited Dec 12, 2013 at 12:33 AM

I'm afraid you're asking your question in the wrong place. This discussion is about NodeXL, a program that uses a C# implementation of the Girvan-Newman algorithm. You are using a Python package that implements Girvan-Newman. You would have to direct your question to wherever that Python package came from, because we don't know anything about it.
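
That said, the traceback itself points at a generic problem rather than anything specific to that package: nx.adj_matrix builds a dense n-by-n matrix, and for 80K nodes that means 80,000 x 80,000 x 8 bytes, roughly 51 GB, which is why np.zeros raises MemoryError. If the rest of that script can work from a sparse matrix, something like the following might avoid the allocation, but that's a guess on my part, since I don't know that code:

# A sparse matrix stores only the nonzero entries (one per edge)
# instead of all n*n cells.  to_scipy_sparse_matrix is the name in
# the networkx 1.x line shown in the traceback; newer releases
# renamed it to_scipy_sparse_array.
import networkx as nx

G = nx.gnm_random_graph(1000, 5000, seed=1)  # stand-in for the real graph
A = nx.to_scipy_sparse_matrix(G)             # cheap: nonzeros only
print(A.shape, A.nnz)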

Good luck,
Tony