Average Geodesic Distance

Apr 29, 2013 at 5:07 AM
I'm doing network analysis of cosponsorship in Congress for a college project. I am learning about network analysis and NodeXL as I go, so I apologize for this probably stupid question, but I could not find the answer elsewhere online.

I grouped my vertices and then had NodeXL calculate the group metrics. The average geodesic distance of one of my groups is listed as about 0.961. I thought that geodesic distance was at least 1, as two people who are connected to each other have a distance of 1. Could someone tell me how I am wrong?

Thank you for your help. Relatedly, I would also really appreciate good resources for how to interpret the statistics NodeXL churns out.
Apr 29, 2013 at 4:21 PM

Average geodesic distance is the sum of the shortest paths between vertex pairs divided by the number of possible vertex pairs, so the average can be less than one.

The graph metrics that NodeXL can calculate are described in the bottom half of the Graph Metrics dialog box, accessible from NodeXL, Analysis, Graph Metrics. Select a metric in the list at the top of the dialog box, and a description will appear in the bottom. Note that you can resize the dialog box to make the descriptions easier to read.

-- Tony
May 1, 2013 at 7:11 AM
Thank you for the information about the statistics; it is really helpful.

I guess my question for average geodesic distance then is: what are the components of the sum in the numerator? Do we sum the shortest path between every possible vertex pair, or is it some other subset of geodesic distances that we are summing?

(It would seem to me that if there were x possible vertex pairs [so the denominator of average geodesic distance is x], the sum [the numerator] would have to be at least x as each of these x distances is at least 1. Then the result would of course be 1 or greater. So I suppose somewhere in here is where I am getting confused).

Does it make a difference whether the graph is directed or undirected?

Thank you again.
May 1, 2013 at 8:32 PM

NodeXL uses a library called SNAP (http://snap.stanford.edu/) to calculate geodesic distances, so I had to dig into that library to see what it is doing.

Take the simple three-vertex graph A-B-C, where A is connected to B and B is connected to C. The shortest paths between the vertex pairs are as follows:

A-B: 1
A-C: 2
B-A: 1
B-C: 1
C-A: 2
C-B: 1

The sum of the shortest paths is 8.

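The number of possible vertex pairs is 9:

A-A
A-B
A-C
B-A
B-B
B-C
C-A
C-B
C-C

Note that SNAP takes the possibility of self-loops into account, so there are 9 possible vertex pairs, not just 6. The three self-pairs (A-A, B-B, C-C) count toward the number of pairs but add nothing to the sum of 8.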

It then calculates the average geodesic distance as 8/9, or 0.889.

The graph's directedness does not affect the calculations.
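
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation as described in this post. It is not SNAP's or NodeXL's actual code. The function names and the adjacency-dictionary format are made up for illustration, and it assumes an unweighted graph in which every vertex can reach every other vertex.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Breadth-first search: hop counts from source to each reachable vertex."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency[vertex]:
            if neighbor not in distances:
                distances[neighbor] = distances[vertex] + 1
                queue.append(neighbor)
    return distances


def average_geodesic_distance(adjacency):
    """Sum of shortest paths divided by n * n pairs (self-pairs included).

    Assumption: the n * n denominator follows the description in this
    thread; it has not been checked against the SNAP source.
    """
    n = len(adjacency)
    total = sum(
        sum(shortest_path_lengths(adjacency, vertex).values())
        for vertex in adjacency
    )
    return total / (n * n)


# The A-B-C example: A is connected to B, and B is connected to C.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(round(average_geodesic_distance(graph), 3))  # 0.889
```

Each vertex's distance to itself is 0, so the self-pairs raise the pair count to 9 without adding to the sum of 8.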

-- Tony
May 8, 2013 at 7:40 PM
Thanks so much--that info is really helpful for interpreting the statistic and also gives me an idea of how to recalculate the statistic so that the denominator does not include self-loops (as I am not using those in my dataset).
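
In case it helps anyone else, this is roughly how I plan to do that recalculation. It is only a sketch: the function name is my own, and it assumes the reported value uses n * n pairs as described above and that every vertex in the group can reach every other vertex. Under those assumptions the sum of shortest paths is unchanged, so switching the denominator from n * n to n * (n - 1) just multiplies the reported average by n / (n - 1).

```python
def exclude_self_pairs(reported_average, vertex_count):
    """Rescale an average computed over n * n pairs to n * (n - 1) pairs.

    Hypothetical helper: assumes the reported figure was computed as
    (sum of shortest paths) / (n * n), per the explanation in this thread.
    """
    if vertex_count < 2:
        raise ValueError("need at least two vertices to form a pair")
    return reported_average * vertex_count / (vertex_count - 1)


# The three-vertex example: 8/9 over 9 pairs becomes 8/6 over 6 pairs.
print(round(exclude_self_pairs(8 / 9, 3), 3))  # 1.333
```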

Does NodeXL calculate statistical significance, for example to compare a statistic between two groups? I have not been able to discover how to find statistical significance in social network analysis, so any help would be fantastic.

Thanks again for all this help. My Political Science professor plans to implement NodeXL in classes and this will all be a great help.
May 9, 2013 at 1:25 AM
Edited May 9, 2013 at 2:28 AM

I'm afraid it doesn't, and I'm not familiar with that particular metric. We may have to add it to our feature request list.

Good luck in your poly sci class. (That's what we called it when I was in school, anyway.)

-- Tony