Exporting a subset of edges

Feb 15, 2012 at 5:31 PM

I am working with an initial set of over 6K edges.  Having generated the metrics for this initial set, I now want to filter out the nodes whose "out centrality" is under a specific threshold as well as all the edges which include at least one of these nodes.  I can do this for the graph using the dynamic filters but how can I export the new edge data resulting from this filtering into a new file?

Thanks in advance for your help.

Feb 15, 2012 at 6:20 PM

I can't think of a way to do that with NodeXL's dynamic filters.  However, you can use Excel's AutoFilters instead to do what you want.  AutoFilters are the down-arrows you see in each column header cell.

Do the following:

1. In the Dynamic Filters dialog box, click Reset All so everything is visible again.  You won't be using dynamic filters for this.

2. Switch to the Vertices worksheet.

3. Click the down-arrow in the "out centrality" column header.  (Do you mean "Out-Degree"?)

4. In the popup menu, select Number Filters, Greater Than.

5. In the Custom AutoFilter dialog box, enter your threshold.  Now, only the vertex rows that have out centrality greater than the threshold are shown in the Vertices worksheet.

6. Select all the rows in the Vertices worksheet.  That will select the vertices and their incident edges.

7. Use NodeXL, Data, Export, Selection to New NodeXL Workbook.  You'll end up with a new NodeXL workbook that has only those vertices that have out centrality greater than the threshold, along with their incident edges.

-- Tony

Feb 15, 2012 at 8:02 PM

Tony,

Thanks for this great answer.  Is there also a way to accomplish the following variation of the export?

Filter out the nodes whose "out degree" is under a specific threshold as well as all the edges between these nodes (i.e., exclude any incident edges with only one of their nodes in the filtered node subset).

Thanks.

Feb 16, 2012 at 7:00 AM
Edited Feb 16, 2012 at 7:01 AM

I'm not sure I understand your latest request, but in reading my previous answer, I realized it was incorrect.  It happened to work on the data I tried it on, but it won't work in general.  Sorry about that.

Try this instead, and tell me if it gives you what you need:

1. In the Dynamic Filters dialog box, click Reset All so everything is visible again.  You won't be using dynamic filters for this.

2. Switch to the Vertices worksheet.

3. Sort the Vertices worksheet by Out-Degree, smallest to largest.

4. For all vertices for which Out-Degree is under the threshold, set the Visibility column to Skip.  You can do this manually or use an Excel formula.*  With a formula, step 3 isn't required.

5. Refresh the graph.  The vertices for which Out-Degree are under the threshold will be gone from the graph pane.

6. Right-click the graph pane and select Select All, Vertices and Edges.  (Or click the graph pane and enter Ctrl-A.)

7. Use NodeXL, Data, Export, Selection to New NodeXL Workbook.  You'll end up with a new NodeXL workbook that has only those vertices that have Out-Degree greater than the threshold, along with their incident edges.

-- Tony

* Sample formula for a threshold of 1:   =IF(Vertices[[#This Row],[Out-Degree]] > 1, "Show if in an Edge", "Skip")

Feb 17, 2012 at 7:10 PM

Tony,

Thanks so much for your revised instructions - they work.  The use of the formula is a very effective tool for manipulating the dataset.  Where can I find more documentation on the use of formulas in NodeXL?

-- John

Feb 21, 2012 at 7:53 PM

John:

If you like online resources, here is Microsoft's overview of Excel formulas:

http://office.microsoft.com/en-us/excel-help/overview-of-formulas-HP010081865.aspx

If you prefer books, my favorite is "Excel 2007: The Missing Manual":

http://www.amazon.com/Excel-2007-Missing-Matthew-MacDonald/dp/0596527594/ref=sr_1_1?ie=UTF8&qid=1329854002&sr=8-1

-- Tony

Feb 22, 2012 at 8:30 PM

Great.  Thanks, Tony!