Displaying selected edges and vertices

Jun 1, 2010 at 6:08 PM
I'm using NodeXL for citation analysis. We have created an edges worksheet that lists all of our core documents in the Vertex 1 column and all the citations of those core documents in the Vertex 2 column. Additionally, we have codes for each core document that separately define its content (e.g. "policy", "instruction" and "policy and instruction") and type (e.g. "research" and "commentary"), and we have color labels in the vertices worksheet for the source of every document (both core documents and those cited by the core documents): documents from academic journals are labeled green, documents from organizations are labeled red, documents from government sources are labeled blue, etc. We have about 2400 edges altogether and 2088 vertices.

I've been spending a fair amount of time trying to figure out how I can display all vertices representing core documents with a certain type of content (e.g. display only "policy" core documents) together with all the vertices representing the citations of those core documents (e.g. all the documents cited by the core documents that are "policy" documents). I want to be able to see whether our "policy" core documents tend to cite a certain type of document (e.g. more organization publications versus more journal publications).

Examples of what I can do:

I can filter out all edges but those representing "policy and instruction" core document citations. But then I still have all the other vertices, so it is difficult to see which vertices are connected only to edges representing citations of "policy and instruction" core documents.

Or I can filter out all but the vertices representing core documents that we have labeled "policy and instruction". Then I can view those core documents. However, I cannot then view the adjacent vertices for those documents unless I select each specific vertex (which is sometimes a little difficult to do with so many vertices). And even in that case, when I indicate that I want to view adjacent vertices, the "selected" vertices are all a certain color, which negates all the source color coding that I care about seeing for any citations of "policy and instruction" documents.

Any suggestions for what I can do? I'd appreciate any advice.
Jun 1, 2010 at 10:04 PM
Edited Jun 1, 2010 at 11:23 PM

I believe you are describing several problems here.  Let's start by addressing the first one.

When you say "I can filter out all edges but those representing 'policy and instruction' core document citations," you didn't say how you were doing the filtering.  I'll assume that you were using Excel's table-filtering feature on the Vertices worksheet -- that is, the down-arrows within the column headers.  That won't work, as you've discovered.  Filtering out a vertex that way won't prevent it from showing in the graph.  The following technique will work, however.

If your graph has a subset of vertices S, and you want to show only those vertices and the edges that connect them, then do the following:

1. For each vertex in S, set its Visibility cell on the Vertices worksheet to Show if in an Edge.

2. For all other vertices, set their Visibility to Skip.

When a vertex's Visibility is Skip, NodeXL skips the vertex's row and any edge rows that use the vertex, which will give you the behavior I think you want.  The Skip option and all other Visibility options are explained in the popup message that appears when you hover your mouse over the Visibility column header.
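
As a rough illustration of the Skip behavior described above, here is a minimal Python sketch. It is only a model of the rule as stated in this post, not NodeXL's actual code; the function name `apply_visibility` and the sample document names are made up for the example.

```python
SHOW_IF_IN_EDGE = "Show if in an Edge"
SKIP = "Skip"


def apply_visibility(edges, visibility):
    """Return (visible_vertices, visible_edges) under the Skip rule.

    edges: list of (vertex1, vertex2) pairs, like rows of the Edges worksheet.
    visibility: dict mapping vertex name -> Visibility value.
    """
    # An edge row is dropped if either of its vertices is marked Skip.
    kept_edges = [
        (v1, v2)
        for v1, v2 in edges
        if visibility.get(v1) != SKIP and visibility.get(v2) != SKIP
    ]
    # "Show if in an Edge" vertices appear only if a surviving edge uses them.
    used = {v for edge in kept_edges for v in edge}
    kept_vertices = {
        v for v, vis in visibility.items() if vis == SHOW_IF_IN_EDGE and v in used
    }
    return kept_vertices, kept_edges


# Hypothetical sample data: CoreB is skipped, so its edge to Org1 disappears too.
edges = [("CoreA", "Journal1"), ("CoreA", "Org1"), ("CoreB", "Org1")]
visibility = {
    "CoreA": SHOW_IF_IN_EDGE,
    "Journal1": SHOW_IF_IN_EDGE,
    "Org1": SHOW_IF_IN_EDGE,
    "CoreB": SKIP,
}
shown_vertices, shown_edges = apply_visibility(edges, visibility)
```

Note that Org1 stays visible because CoreA still cites it, while the CoreB edge is removed along with CoreB itself.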

Given the size of your graph, I'm sure you don't want to fill in the Visibility cells by hand, and that's where Excel's formulas come in handy.  You can enter a formula in the first cell of the Visibility column that computes the Visibility based on other column values, and Excel will automatically fill the column with your formula.  I suggest looking at the IF() function, which can output either "Show if in an Edge" or "Skip", depending on a condition you specify.  (Outputting 1 and 0 will do the same thing.)
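
The IF()-style rule can also be sketched outside Excel. The Python below is a hypothetical illustration, not a NodeXL feature. It assumes that the subset S should contain both the core documents with the wanted content code and every document they cite, which is one reading of the original question. The names `visibility_for_content` and `content_code`, and the sample documents, are invented for the example.

```python
def visibility_for_content(edges, content_code, wanted):
    """Compute a Visibility value for every vertex.

    edges: list of (core_document, cited_document) pairs.
    content_code: dict mapping core document -> content code (e.g. "policy").
    wanted: the content code whose core documents should be displayed.
    """
    # Core documents that carry the wanted content code.
    wanted_cores = {doc for doc, code in content_code.items() if code == wanted}
    # Documents cited by at least one of those core documents.
    cited = {v2 for v1, v2 in edges if v1 in wanted_cores}
    keep = wanted_cores | cited

    all_vertices = {v for edge in edges for v in edge}
    # Same two outputs the IF() formula would produce.
    return {
        v: ("Show if in an Edge" if v in keep else "Skip")
        for v in sorted(all_vertices)
    }


# Hypothetical sample data.
edges = [
    ("CoreA", "Journal1"),
    ("CoreA", "Org1"),
    ("CoreB", "Org1"),
    ("CoreB", "Gov1"),
]
content_code = {"CoreA": "policy", "CoreB": "instruction"}
visibility = visibility_for_content(edges, content_code, "policy")
```

Here CoreB and Gov1 come out as Skip, while CoreA and the two documents it cites are shown, each keeping whatever color it has on the Vertices worksheet.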

Does that get you started, or did I misunderstand your post?

-- Tony

Jun 2, 2010 at 1:00 PM


Thanks for this great answer. I tried what you suggested, and it works. I should have mentioned that I was trying to filter using the Dynamic Filter. I also figured out another technique that works as well. I didn't realize when I wrote the post above that I could also filter on the edges worksheet by column. I had already labeled each edge with several different column codes depending on the origin of the core document. So, for example, if the core document from which the citation came was a "policy" document, the "policy" column has a 1 for that edge. Edges from all other documents that are not "policy" got a 0. Then I could filter out all the non-policy documents using the down-arrow on the column header. This was actually particularly useful because I could filter out all the edges I didn't want and compute new in-degree metrics for each set of core documents (e.g. for all citations coming from "policy" docs, for all citations coming from "instruction" docs, etc). I don't know if you can do that using the visibility column.

But, thanks again, for your response. I hadn't used the visibility column yet, and now I know what I can use it to do.
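
The per-category in-degree calculation described in this post can be sketched in Python as follows. This is a minimal illustration under assumed data: the row layout, the 1/0 flag columns named `policy` and `instruction`, and the helper `in_degree` are all hypothetical stand-ins for the edges worksheet, and nothing here calls NodeXL.

```python
from collections import Counter


def in_degree(edge_rows, flag):
    """Count how often each cited document appears among edges where flag == 1."""
    return Counter(row["vertex2"] for row in edge_rows if row[flag] == 1)


# Hypothetical edge rows: vertex1 is the core document, vertex2 the cited one.
edge_rows = [
    {"vertex1": "CoreA", "vertex2": "Journal1", "policy": 1, "instruction": 0},
    {"vertex1": "CoreA", "vertex2": "Org1", "policy": 1, "instruction": 0},
    {"vertex1": "CoreB", "vertex2": "Org1", "policy": 0, "instruction": 1},
    {"vertex1": "CoreC", "vertex2": "Org1", "policy": 1, "instruction": 0},
]

policy_in_degree = in_degree(edge_rows, "policy")
instruction_in_degree = in_degree(edge_rows, "instruction")
```

Each counter covers only the edges that survive the corresponding column filter, so the same cited document can have a different in-degree for each set of core documents.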