May 5, 2011

Is there anyone who is able to explain the value of edge weights? 

I crawled Twitter data using NodeXL and converted it to a workbook. During the process, NodeXL automatically merged the data, generated an edge weight score, and reported it in the workbook.

I want to know what the number means exactly. The tutorial says the number is calculated because of duplicated data. In that case, how should I interpret the number? Does it have a specific meaning for explaining the relationship between dyads, or between followers and followees?

The Merge Duplicate Edges function takes every instance of an edge (some pair "AB"), counts the number of those instances, and replaces them all with a single edge ("AB") with a "weight" equal to the count of edges.
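
To make the counting concrete, here is a minimal Python sketch of the same idea. It is not NodeXL's code; the function name and the sample edge list are made up for illustration, and it assumes a directed graph, so "AB" and "BA" are counted separately.

```python
from collections import Counter

def merge_duplicate_edges(edges):
    """Collapse repeated (source, target) pairs into one edge plus a count.

    Hypothetical helper for illustration; assumes directed edges.
    """
    counts = Counter(edges)
    return [(src, dst, weight) for (src, dst), weight in counts.items()]

# Three instances of A->B and one of B->C.
edges = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "B")]
merged = merge_duplicate_edges(edges)
print(merged)
```

Each merged edge carries a weight equal to the number of times that pair appeared in the original list.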

In the Twitter data import you can set the importer to collect multiple instances and types of edges.  

NodeXL currently can optionally create edges for each "Follows", "Replies", and "Mentions" event found in the data.  

These events can happen with various frequencies.  "A replies to B", for example, can happen many times, while "A follows B" can only happen once.

I usually do not Merge Duplicate Edges when I look at Twitter data because I like to filter the network based on the different types of edges present.  

Our Merge Duplicate Edges feature is currently very crude.  It cannot merge edges flexibly.  

For example, you might want to merge all follows, replies, and mentions into three different merged edges.  We do not yet do this.

You might want to sum, average, or take the max, min, or median of some attribute on each edge: we do not do this yet.

We plan to address better support for Merge Duplicate Edges in the coming months.  I suggest that some existing Excel features for merging and counting data can be applied now to get the results you need.
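
If you are comfortable working outside Excel, the more flexible merge described above could be sketched in Python roughly as follows. This is only an illustration under assumptions: it supposes the edges have been exported as (source, target, relationship, value) rows, and the function name, the sample rows, and the numeric "value" attribute are all hypothetical.

```python
from collections import defaultdict
from statistics import median

def merge_by_type(edges, aggregate=sum):
    """Merge edges separately per relationship type.

    edges: iterable of (source, target, relationship, value) rows.
    Returns {(source, target, relationship): (count, aggregated value)}.
    Hypothetical helper for illustration, not a NodeXL feature.
    """
    groups = defaultdict(list)
    for src, dst, rel, value in edges:
        groups[(src, dst, rel)].append(value)
    return {key: (len(vals), aggregate(vals)) for key, vals in groups.items()}

# Made-up sample rows; "value" stands in for any numeric edge attribute.
edges = [
    ("bill", "john", "Follows", 1),
    ("bill", "john", "Replies", 3),
    ("bill", "john", "Replies", 5),
    ("bill", "john", "Mentions", 2),
]
by_max = merge_by_type(edges, aggregate=max)
by_median = merge_by_type(edges, aggregate=median)
```

Passing a different `aggregate` function (sum, max, min, median, and so on) changes how the attribute is combined, while the count still plays the role of the edge weight for each relationship type.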



In general, edge weights are just numbers that are assigned to the edges in a network graph.  Their meaning depends on what is being studied.  If a graph's vertices represent cities, for example, then the edge weights might be the distances between the city pairs, or the cost of travelling from one city to another.

In NodeXL the graph's vertices sometimes represent people, in which case the edge weights can be used to indicate the strength of the relationship between pairs of people.  For example, when you import a Twitter network using an item on the NodeXL, Data, Import menu, you have the option to create an edge for each of several types of relationships.  If Bill follows John and also replies to John, for example, then your graph will end up with two edges between Bill and John.

In some cases you might want to merge those two edges into a single edge, which you can do with NodeXL, Prepare Data, Merge Duplicate Edges.  NodeXL will replace the two Bill-John edges with a single edge, add an Edge Weight column to the Edges worksheet, and set its value to 2 to indicate that there were actually two edges between Bill and John.  You will lose the two edges' details -- the fact that one was a Follows edge and one was a Replies To edge -- but if you only need to know the strength of the relationship and not the details, then that might be okay.
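
The loss of detail can be seen in a small Python sketch. This is an illustration only, not how NodeXL is implemented; the function name and the sample rows are made up. Two different sets of Bill-John edges end up with the same edge weight once the relationship type is ignored.

```python
from collections import Counter

def edge_weights(rows):
    """Count rows per (source, target) pair, ignoring the relationship type."""
    return Counter((src, dst) for src, dst, _relationship in rows)

# Bill follows John and replies to John once.
follows_and_reply = [("Bill", "John", "Follows"), ("Bill", "John", "Replies to")]
# Bill replies to John twice but does not follow him.
two_replies = [("Bill", "John", "Replies to"), ("Bill", "John", "Replies to")]

weights_a = edge_weights(follows_and_reply)
weights_b = edge_weights(two_replies)
print(weights_a[("Bill", "John")], weights_b[("Bill", "John")])
```

Both cases give a weight of 2, so after merging you can no longer tell them apart.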

NodeXL only merges duplicate edges automatically if you use NodeXL, Graph, Automate and check "Merge duplicate edges."  You do not have to merge duplicate edges if it is not appropriate for your particular data.

-- Tony