Create edge variables based on node attributes

Mar 17, 2009 at 8:50 PM

I would like a quick and easy way to make variables in the edge sheet based on attributes of both sender and recipient attributes that are strored in the node sheet.    

I realize that Vlookup has some capability for this type of thing in general, but I have not had any success in using it for this purpose.   

For instance, I have a dataset that measures when people in a community adopted a new technology, and records of the structure of their interaction.   I want to visualize the flow of influence in the network, and thus I will color nodes based on whether they have not yet adopted, are adopting, or adopted in the past.   I also want to see when 'infected people' talk to those who are just then adopting (possibly a source of influence on their adoption decision).   I want to color edges that come from nodes that areinfected differently than edges that come from un-infected nodes.  I also want to color edges differently when they are coming from infected nodes and leading to newly infected nodes.  
Thus I need to make variables in the edgesheet that depend on data from the nodesheet for both sender and recipient.  This could be done in two steps.  It is totally fine to return variables from sender in one cell and recipient in the next and simply define a third variable as their combination.  

For instance, in this case my node sheet, in addition to other variables has variables that indicate whether a node has adopted an innovation in that time period (coded as a 2) adopted in the past (coded as 1) or has not yet adopted the innovation (coded as a 0). 


Node Name                  T1Adopt         T2Adopt         T3 Adopt

Bob                                1                     1                    1
Fred                                0                    2                     1
Mary                               0                    0                     2
Susan                               2                    1                    1

Sender  Recipient        T1Edge   T2Edge  T3 Edge   Tetc. 
Bob        Fred                    0          1           1                        
Bob        Susan                   1          0          0

On my edgesheet I want to make variables that indicate when, for instance, a newly infected node has a tie with a node that was infected in the past.   

Sender  Recipient        T1Edge   T2Edge  T3 Edge   Tetc.   T1Infect    T2 Infect       T3Infect
Bob        Fred                    0          1           1                        1              2                 1
Bob        Susan                   1          0          0                         2              0                 0


this is more complex than Vlookup expects, partly because I need excel to check the sender or the recipient column in the edgesheet and then check for that name in column 1 of the nodesheet.     My understanding is that vlookup wants to find the id variable in column 1.

any ideas?


Mar 18, 2009 at 8:31 PM
Edited Mar 18, 2009 at 8:34 PM
Hello, Ted:

If what you need to do is look up Vertex 1 or Vertex 2 attributes in the Vertices table and insert them into the Edges table, you can indeed do that with Excel's VLOOKUP() function.  You might be misunderstanding VLOOKUP's column-1 requirement, which applies to the function's Table_array parameter and not its Lookup_value parameter.  The value you're looking up has to be in the first column of the range you tell Excel to search (Table_array).  The only restriction this introduces is that you can't retrieve a cell to the left of that range, which doesn't apply here, and there are ways around that, too.

If your Vertices table has a column called "Vertex Attribute" and you want to add a column called "Vertex 1 Attribute" to your Edges table, the formula for the Vertex 1 Attribute column would be this:

=VLOOKUP(Edges[[#This Row],[Vertex 1]], Vertices, 20, FALSE)

(This assumes that "Vertex Attribute" is the 20th column in the Vertices table, where column 1 is the Vertex column.)

If you also want to add a column called "Vertex 2 Attribute" to your Edges table, the formula would be this:

=VLOOKUP(Edges[[#This Row],[Vertex 2]], Vertices, 20, FALSE)

-- Tony
Mar 22, 2009 at 6:53 PM
Very useful, thanks Tony

BTW, how does one retrieve a cell to the left of the Table_array range?

--- Pierre
Mar 23, 2009 at 2:39 AM

To retrieve a cell to the left of the Table_array range, you ditch the VLOOKUP() function in favor of the MATCH() and INDEX() functions.  MATCH() searches for a specified lookup value within a specified range and gives you the index of the lookup value within the range.  INDEX() returns the value at a specified index in a specified range.  Combine the two, and you can retrieve the value in column X of the row that contains the lookup value in column Y, where X can be to the left or right of Y.

For example, let's say you've inserted an "Attribute in Column A" column to the left of the Vertex column in the Vertices table, and you want to add a "Vertex 1 Attribute in Column A" column to the Edges table.  Here is the formula to do that:

= INDEX(Vertices[[#All],[Attribute in Column A]], MATCH(Edges[[#This Row],[Vertex 1]], Vertices[[#All],[Vertex]], 0))

It looks complicated, but that's because the MATCH() function is embedded as an argument to the INDEX() function.  Pick it apart and it should make sense.

-- Tony
Apr 1, 2009 at 8:14 PM
Hey Tony- 

Thanks a bunch for your help with this!   That works perfectly and it is a very helpful operation to be able to perform.   Sorry for the delay getting back to you but I was out of town for a bit and away from this project.

thanks again,