Display of multiple components: NodeXL vs NetDraw

Apr 12, 2010 at 8:42 AM

Hi

Yesterday I downloaded some data on postings made to an email list that I manage. I uploaded the edited data into NodeXL, then displayed the graph using the Harel-Koren layout. The result can be seen here: http://www.mande.co.uk/temp/GraphImage1.jpg

In the background is one large component. In the centre are multiple much smaller components. In NetDraw (using the spring-embedding layout) the latter would be displayed around the periphery of the screen, outside of the main component. This is a much better way of viewing the overall graph structure.

Is there anything that can be done within NodeXL as it is, to get a NetDraw-like result as described?

If not, might future versions of NodeXL be able to address this issue?

regards, rick davies

Apr 12, 2010 at 5:02 PM
Edited Apr 12, 2010 at 5:05 PM

Rick:

If the smaller components in the center are disconnected from the rest of the graph, you can tell NodeXL to place them at the bottom of the graph pane.  To do so, go to NodeXL, Graph, Layout, Layout Options and check "Put the graph's smaller components at the bottom of the graph."

If you don't like having the disconnected components at the bottom, you can select them all by dragging a box around them while they're at the bottom, set the Layout to Fruchterman-Reingold, and select Lay Out Selected Vertices Again.  (It's under the Lay Out Again menu in the graph pane.)  That might drive the disconnected components to the periphery, although it depends on the nature of the components.  The Circle layout might also help for this second step.  You can also manually move the selected components anywhere you want by dragging a selected vertex.

-- Tony

Coordinator
Apr 12, 2010 at 5:34 PM
Edited Apr 12, 2010 at 6:04 PM

From this: [image] to this: [image] in just a few clicks.

Many network graphs contain disconnected smaller graphs, called "components", within them.

Most layout algorithms do a poor job of giving each component its own space. Instead, components are often laid over one another, suggesting connections that are not real.
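For readers who want to see what "component" means in practice, here is a minimal Python sketch that splits an edge list into connected components with a breadth-first search. It is not NodeXL code; the function name and the sample mailing-list data are made up for illustration.

```python
from collections import defaultdict, deque

def connected_components(vertices, edges):
    """Return the components (each a sorted list of vertices), largest first."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            v = queue.popleft()
            members.append(v)
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(members))
    # Largest first; the sort is stable, so ties keep their discovery order.
    components.sort(key=len, reverse=True)
    return components

# Hypothetical data: who replied to whom on a mailing list.
vertices = ["ann", "bob", "cat", "dan", "eve", "fay", "gus"]
edges = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("dan", "eve")]

components = connected_components(vertices, edges)
giant, smaller = components[0], components[1:]
print("giant component:", giant)
print("smaller components:", smaller)
```

Here "fay" and "gus" have no edges at all, so each forms a one-vertex component (an isolate), and "dan" and "eve" form a dyad that is disconnected from the main group.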

A simple solution we have implemented in NodeXL is to offer to sweep up all the smaller components in the graph and order them in neat rows at the bottom of the canvas.  This feature was mentioned in a previous post, but it may not be easy to find:

From the NodeXL network graph canvas toolbar, select the drop down menu next to the selected layout type.

This will display the following menu of layout choices and options:

Select the last option, "Layout Options...", which reveals:

Select the option "Put the graph's smaller components at the bottom of the graph".  This dialog also presents options that control how long the Fruchterman-Reingold layout runs and how strong the repulsive force that pushes nodes away from one another should be.  You may find that changing these values improves the FR layout for your data.

Here is a graph that was laid out without the component ordering feature selected.

Many components are scattered around the chart.

This image represents the connections among a population of Twitter users who mentioned the term "Cisco".  This chart was created using the Fruchterman-Reingold layout.  It is noisy and messy given the nature of the graph it has to render. The Harel-Koren layout option is better but has a significant flaw: all the isolates are jumbled on top of one another in that smear at the center of the ring in the upper left of the graph.

Here is the same graph created with the Harel-Koren layout, with the "Put the graph's smaller components at the bottom of the graph" option selected:

All the many lightly connected Twitter authors are lined up in size order (size is mapped to the number of followers that user has on Twitter).  This keeps them from getting in the way of the "giant component", the big connected group of Twitter users who both tweet the word "cisco" and follow, mention, or reply to someone else who also mentioned the word "cisco".  The core of this group is visible along with some peripheral groups or people who both mention the company and talk to other people who do as well.  The isolates mention Cisco but do not do so as part of a larger conversation (as seen at the time of this snapshot).
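To make the "rows at the bottom" idea concrete, here is a rough Python sketch of one way to assign each small component a box along the bottom of the canvas. This is only an illustration and an assumption about the general approach, not how NodeXL actually does it; the function name, the fixed square cell size, and the canvas dimensions are all hypothetical.

```python
def bottom_row_cells(component_count, canvas_width, canvas_height, cell_size):
    """Give each small component a square cell, filling rows upward from the bottom-left.

    Returns (cells, main_height): cells is a list of (left, top, width, height)
    boxes, and main_height is the height left over for the large component.
    Assumes screen coordinates, with y growing downward.
    """
    per_row = max(1, canvas_width // cell_size)
    cells = []
    for i in range(component_count):
        row, col = divmod(i, per_row)
        left = col * cell_size
        top = canvas_height - (row + 1) * cell_size  # row 0 sits on the bottom edge
        cells.append((left, top, cell_size, cell_size))
    rows_used = -(-component_count // per_row)  # ceiling division
    main_height = max(0, canvas_height - rows_used * cell_size)
    return cells, main_height

# Five small components on a hypothetical 200 x 200 canvas with 50-pixel cells.
cells, main_height = bottom_row_cells(5, canvas_width=200, canvas_height=200, cell_size=50)
for cell in cells:
    print(cell)
print("height left for the main component:", main_height)
```

Four cells fit across the bottom row and the fifth starts a second row above it, which leaves the top half of the canvas free for the giant component.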

An additional tip: nodes are plotted on the screen in NodeXL in an order governed by the "Layout Order" column in the Vertices worksheet.  If we use the "Autofill Columns" feature we can easily set the Vertex Layout Order to the same value to which Vertex Size was set.  This has the effect of lining up the nodes by size, making a kind of histogram.  All the singletons or isolates, the nodes with no connections to any other node, line up first, then the dyads, the triads, and the quads.  Each larger sized component sorts from its smallest to its largest by the size of the largest node in the component.
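The resulting order can be sketched in a few lines of Python. This is my reading of the ordering described above (components grouped by vertex count, then sorted within each group by the size of their largest node), not NodeXL's own code; the function name and the follower counts are invented for illustration.

```python
def order_components(components, vertex_size):
    """Sort components by vertex count, then by the size of their largest vertex."""
    return sorted(
        components,
        key=lambda comp: (len(comp), max(vertex_size[v] for v in comp)),
    )

# Hypothetical follower counts standing in for the Vertex Size column.
vertex_size = {"dan": 40, "eve": 900, "fay": 15, "gus": 300, "hal": 120, "ivy": 60}
components = [["dan", "eve"], ["gus"], ["fay"], ["hal", "ivy"]]

ordered = order_components(components, vertex_size)
for comp in ordered:
    print(len(comp), comp)
```

The two isolates come first (smaller node first), followed by the two dyads, with the dyad containing the largest node last.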