Problems with betweenness centrality

Jan 7, 2013 at 9:36 PM

Hi,

I am having some problems with understanding the results of the betweenness centrality calculation for my network (a directed graph). I have several nodes with very high out-degree, but zero in-degree, or vice versa and these are receiving very high scores for betweeness centrality. As they can only be on the beginning or end of a path I would expect them to score zero in a directed network. Am I misunderstanding this?

Cheers, Alex

Coordinator
Jan 8, 2013 at 6:32 AM
Edited Jan 8, 2013 at 6:32 AM

Alex:

Yes, you are misunderstanding this.  If a vertex has very high out-degree (or in-degree, for that matter), then it is on the shortest paths between many vertex pairs, namely each pair of its many adjacent vertices.  If A is connected to B, C, and D, for example, then A is on the shortest paths between B and C, C and D, and B and D.

A vertex's betweenness centrality is determined by the number of shortest paths that include the vertex, so your high-out-degree vertex has a high betweenness centrality.

Please see the http://en.wikipedia.org/wiki/Centrality#Betweenness_centrality article for information on the betweenness centrality algorithm.

-- Tony

Coordinator
Jan 8, 2013 at 2:41 PM

I forgot to mention an important fact that might better answer your question: The direction of individual edges is not taken into account when the shortest paths are determined.

-- Tony

Jan 9, 2013 at 4:35 PM

Hi Tony,

 

yes, thanks! The second answer was the one that I needed. I had previously found an answer on the forum that said that directionality was taken onto account on calculating betweenness centrality (discussion:261244  Edge Weights and Betweenness). I think it would be useful to flag this up in the documentation.

 

Cheers,

Alex

Jan 9, 2013 at 4:38 PM

PS Do any of the other centrality measures take directionality into account? I need to gain a measure of the relative importance of my nodes as links in a metabolic/ecological network so directionality is key.

Thanks again

Coordinator
Jan 10, 2013 at 3:25 AM

Alex:

You're confusing two different things here.  The direction of individual edges is NOT taken into account when the shortest paths are determined.  However, the directionality of the graph IS taken into account in the final step of the betweenness centrality calculations, where NodeXL has to decide whether to divide the result by 2 based on the graph's directionality.  (I would copy over the relevant equations from http://en.wikipedia.org/wiki/Centrality#Betweenness_centrality to show you what I mean, but I'm sure the equation formatting wouldn't paste properly.)

So my statements here and at http://nodexl.codeplex.com/discussions/261244 are both correct.

-- Tony

Coordinator
Jan 10, 2013 at 3:40 AM
Edited Jan 10, 2013 at 3:41 AM

Alex:

The graph's directedness is ignored when calculating PageRank, closeness centrality and eigenvector centrality.  It is taken into account, obviously, when calculating degree, in-degree and out-degree, which are also considered a type of centrality (http://en.wikipedia.org/wiki/Centrality).

-- Tony

Oct 9, 2013 at 11:20 PM
Tony,

I need some clarification about computing other basic centrality measures except for in-degree and out-degree.
You are saying the graph's direction is ignored when measures are computed, even though a graph is directed, right?
I guess now I understand NodeXL outputs are different from other tools' outputs.
Is there any specific reasons why NodeXL ignores the direction of edges?
If you take edges' direction into account, the number of shortest paths between nodes would be slightly different from one without edge's directions.
I am just wondering the rationale behind this computation.

Jinie
Coordinator
Oct 10, 2013 at 12:26 AM
Edited Oct 10, 2013 at 2:13 AM
Jinie:

I said that the graph's directedness is ignored when calculating PageRank, closeness centrality and eigenvector centrality.

NodeXL uses a graph library called SNAP (http://snap.stanford.edu/) to calculate these. In the particular version of SNAP that NodeXL uses, the graph's directedness is not taken into account for these calculations. I can't say why; I don't even know if directedness should be used. The best way to get more information about this is to contact the SNAP team directly; they should be able to answer you more knowledgeably than I can.

-- Tony