automatically hide vertices with only one connection

Oct 21, 2012 at 11:18 AM

I'm trying to generate a map of academic citations of the papers I'm reading for my PhD.  I have input about a dozen papers in Vertex 1 (Edge tab) and all their references in Vertex 2 and generated a circle graph.

I would like to hide all vertices (references) which are cited only once so I have a map of papers which have been cited two or more times. 

This will be a work in progress, so it doesn't make sense for me to do it manually because every time I add a new paper and its references I will have to search the entire list to see if there are papers which are cited twice after the addition of the new vertices (references).

I've looked through other people's questions and through help, and not found an answer yet.

Is there any solution?

Coordinator
Oct 21, 2012 at 1:53 PM

Once you enter the edge data representing connections between papers you can calculate graph metrics.

If you have set your graph to be "Directed" the calculation of graph metrics will generate an "In-degree" and an "Out-degree" column.

Using the "Dynamic Filters" feature (the button is at the top of the graph pane, in the window labeled "Document Actions"), you can now set the minimum threshold for vertices to be greater than "1" for "In-degree".
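To make the idea concrete outside of NodeXL, here is a minimal Python sketch of what the in-degree filter amounts to. It is only an illustration of the logic, not NodeXL code; the edge list and the function names `in_degrees` and `filter_by_in_degree` are made up for the example.

```python
from collections import Counter

# Hypothetical edge list: (Vertex 1, Vertex 2) = (citing paper, cited reference).
edges = [
    ("Smith", "A"), ("Smith", "B"),
    ("Jones", "B"), ("Jones", "C"),
    ("White", "B"), ("White", "C"),
]

def in_degrees(edge_list):
    """Count how many edges point at each vertex."""
    counts = Counter(target for _, target in edge_list)
    # Vertices that only cite others never appear as a target: in-degree 0.
    for source, _ in edge_list:
        counts.setdefault(source, 0)
    return dict(counts)

def filter_by_in_degree(edge_list, minimum):
    """Return the vertices whose in-degree is at least `minimum`."""
    degrees = in_degrees(edge_list)
    return {vertex for vertex, degree in degrees.items() if degree >= minimum}

print(filter_by_in_degree(edges, 2))
```

With this sample data only "B" and "C" pass a minimum of 2. The citing papers all have an in-degree of 0, so a filter on in-degree alone removes them as well.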

Regards,

Marc

Oct 22, 2012 at 6:56 AM

Thanks for your quick reply Marc. 

I set the graph to 'Directed' and went to 'Dynamic Filters' but I don't understand how to set the minimum threshold.  For Edge Filters I see "The Edges worksheet doesn't have any numeric, date or time columns that can be used for dynamic filtering".  And for Vertex Filters I have a small graphic for 'x' which has a range of 1,168.27 to 8,839.73 and a 'y' range of 197.84 to 9,801.16 - I can change these but all it does is delete vertices along the x and y axes. At the bottom I also have a check box "Show edges and vertices if cells are empty" and "Filter opacity".

So I don't understand how/where to set the minimum threshold.  Any further suggestions for what to try?

Thanks again in advance!

Coordinator
Oct 22, 2012 at 1:46 PM

It sounds like you have not yet calculated graph metrics for this network.

As a result, there are no in-degree values to use to filter.

Use NodeXL>Graph Metrics to generate results to filter on.

-

Marc

Oct 22, 2012 at 1:48 PM

Another option, although it gets at the same thing, would be to count and merge duplicate edges.

NodeXL>Prepare Data>Count and Merge Duplicate Edges
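Roughly speaking, merging duplicates collapses repeated Vertex 1/Vertex 2 pairs into a single edge and keeps a count of how many times the pair occurred. A small Python sketch of that idea follows; it is my own illustration, not how NodeXL implements it, and `merge_duplicate_edges` is a hypothetical name.

```python
from collections import Counter

def merge_duplicate_edges(edge_list):
    """Collapse repeated (source, target) pairs into one edge with a count."""
    counts = Counter(edge_list)
    # Counter keeps first-seen order, so edges come back in input order.
    return [(source, target, n) for (source, target), n in counts.items()]

# Hypothetical data: the Smith -> A citation was entered twice.
raw_edges = [("Smith", "A"), ("Smith", "A"), ("Jones", "A")]
print(merge_duplicate_edges(raw_edges))
```

Note that this counts repeated rows, not how many different papers cite a reference, so it only helps if the same edge appears more than once in the worksheet.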

Oct 22, 2012 at 4:41 PM
Edited Oct 22, 2012 at 5:09 PM

Marc - you're right that I didn't yet calculate graph metrics, so I did that and now I see the option for setting the in-degree. When I perform all the steps, many of the directed edges disappear but the vertices remain 'floating in space': vertices which have no relationship with any other vertex. I tried with a very simple dummy data set and here's the problem:  if I set the in-degree to 1 many vertices disappear, including the ones representing the journal articles from which the entire data set originates (as they should).  So if I have three original/source articles, Smith, Jones, and White, and I input all their references which total 9, unless Smith is cited by Jones and White, his vertex disappears, leaving any of the references Smith has in common with Jones and White 'floating'.

Is there some way to set it so that Vertices in Vertex 1 aren't subject to the dynamic filter, or is there some way to manually make them 'immune'?  I have the feeling that there might be a way through Groups, but I can't quite understand how to make it work.

luckystrike - I also tried your suggestion to see what happened.  Unless I misunderstood your suggestion, my problem isn't that I have duplicates that I want to get rid of, it's that I have some vertices with only one relationship/edge, and I don't want to see them in the network map.  Let's say I have 5 journal articles and among these they cite a total of 20 other articles.  So I have in the Vertex 1 column the original 5 articles and in Vertex 2 the 20 articles cited by them.  Let's say that only 9/20 articles are cited by 2 or more of the original 5 articles.   I would like to hide/exclude all vertices which are only cited by one article (in Vertex 1) and show/include the original 5 journal articles.

Sorry if I'm asking very basic questions.  I'm completely new to this!

Oct 22, 2012 at 5:07 PM
sanspm wrote:

luckystrike - I also tried your suggestion to see what happened.  Unless I misunderstood your suggestion, my problem isn't that I have duplicates that I want to get rid of, it's that I have some vertices with only one relationship/edge, and I don't want to see them in the network map.  Let's say I have 5 journal articles and among these they cite a total of 20 other articles.  So I have in the Vertex 1 column the original 5 articles and in Vertex 2 the 20 articles cited by them.  Let's say that only 9/20 articles are cited by 2 or more of the original 5 articles.   I would like to hide/exclude all vertices which are only cited by one article (in Vertex 1) and show/include the original 5 journal articles.

Sorry if I'm asking very basic questions.  I'm completely new to this!

I think I might have originally misunderstood exactly how your network is set up. Now that I re-read things, it sounds like Marc is on the best route to accomplish what you're trying to do. But I'll defer to him to answer the specifics of your question.

Ask a lot of questions! That's how we all learn!

Coordinator
Oct 22, 2012 at 5:39 PM

Glad to hear you have made some progress!

It now sounds like you have calculated edge and vertex metrics.  But your description makes me think you are using the Edge filters to remove connections from the network, rather than the Vertex filters which remove nodes from the network.

You can control Vertex and Edge Visibility in several ways in NodeXL.  Both the Vertices and Edges worksheets have columns labeled "Visibility".

** You may need to use the NodeXL>Show/Hide>Workbook Columns>Visual Properties to display these columns!

Visibility can be set to "Show" (or 4), "Hide" (or 2), "Skip"  (or 0) or "Show if in an edge" (or 1).

By default Visibility is set to "Show".

You may want to apply "Show if in an edge" to your vertices.

This can also be accomplished through the NodeXL>Visual Properties>Autofill Columns feature.  

[Screenshot: NodeXL Autofill Feature]

Look for the rows labeled "Edge Visibility" and "Vertex Visibility". These control the rule for display of a vertex or an edge.  You *could* set the rule to be that in-degree must be greater than "n".
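As a rough sketch of such a rule, the Python below assigns a visibility value to every vertex, showing it when its in-degree is greater than n. The exemption for vertices that appear in the Vertex 1 column is an assumption added here to address the question above, not a built-in NodeXL option, and `vertex_visibility` is a hypothetical name.

```python
from collections import Counter

def vertex_visibility(edge_list, n=1):
    """Map each vertex to "Show" or "Hide" using an in-degree > n rule."""
    in_degree = Counter(target for _, target in edge_list)
    sources = {source for source, _ in edge_list}
    visibility = {}
    for vertex in sources | set(in_degree):
        # Assumption: citing papers (Vertex 1) stay visible regardless
        # of their own in-degree.
        if vertex in sources or in_degree[vertex] > n:
            visibility[vertex] = "Show"
        else:
            visibility[vertex] = "Hide"
    return visibility

# Hypothetical citation data: (Vertex 1, Vertex 2).
citations = [
    ("Smith", "A"), ("Smith", "B"),
    ("Jones", "B"), ("Jones", "C"),
    ("White", "C"),
]
print(vertex_visibility(citations, n=1))
```

The resulting values could then be typed or pasted into the "Visibility" column of the Vertices worksheet. In this example "A" is hidden because only Smith cites it, while Smith, Jones, and White stay visible.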

-

Marc