NodeXL performance: is it multi-threaded?

Aug 8, 2014 at 11:16 AM
Hi,

NodeXL is taking forever to show graphs when I use it on a large dataset (50,000 edges and 15,000 vertices). CPU utilization is 100% for hours.

I was wondering whether its performance would improve if I ran it on a server with multi-threading enabled in Excel. Is NodeXL multi-threaded? Which features support multi-threading?

Please let me know of any other tips and tricks to improve the performance.

Thank you,
Samatha
Aug 11, 2014 at 3:31 PM
Edited Aug 11, 2014 at 3:36 PM
Hello, Samantha:

You've encountered NodeXL's Achilles heel, which is its poor performance with large graphs. However, I'm surprised that it's taking hours in your case.

Do you have bundled edges* turned on? Bundled edges are very CPU-intensive, and turning them off should significantly speed up displaying the graph.

I've never tried it, but I would be surprised if turning on multithreading in Excel improved anything. Most of NodeXL doesn't use Excel's calculation engine, and the engine doesn't come into play at all when the graph is shown.

-- Tony

* Right-click the graph pane, select Graph Options from the right-click menu, select the Edges tab in the Graph Options dialog box, and look under "Curvature."
Aug 11, 2014 at 6:26 PM
Hello!

NodeXL does have performance limits. One way to expand its capacity is to give the system more RAM. 8 GB of RAM is often NOT enough for many large graphs, while more than 16 GB is often not fully used.

Turning off some of the more performance-intensive tasks is also a way to manage a bigger data set. For example, not using images for vertex shapes is a big saving. Turning off "Sub-graph Images" and some of the more expensive network metrics (Eigenvector? Closeness?) can also speed things up.
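
As a rough illustration of why a metric such as closeness centrality gets expensive on larger graphs, here is a minimal Python sketch. It is not NodeXL's implementation; it assumes an unweighted graph and the textbook approach of running one breadth-first search per vertex. The names `closeness_centrality` and `bfs_work_estimate` are made up for this example.

```python
from collections import deque


def closeness_centrality(adjacency):
    """Closeness per vertex: (reachable vertices - 1) / sum of BFS distances."""
    result = {}
    for source in adjacency:  # one full breadth-first search per vertex
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return result


def bfs_work_estimate(vertices, edges):
    """Order-of-magnitude count of vertex and edge visits for one BFS per vertex."""
    return vertices * (vertices + edges)


# Tiny path graph a - b - c, stored as an adjacency list (illustrative data).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = closeness_centrality(graph)

# Graph size taken from the original question: 15,000 vertices, 50,000 edges.
estimate = bfs_work_estimate(15_000, 50_000)
```

For the graph size in the original question, the estimate comes to roughly a billion vertex and edge visits for this one metric, which suggests why such metrics are worth switching off when they are not needed.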

If the bulk of the time is being spent in the rendering of the network, you might want to collapse some groups or "Skip" the display of some part of the network.

Regards,

Marc
Aug 12, 2014 at 5:30 AM
Hi Tony,

You made my day. Yes, I was using the bundled edges feature, and changing it to Curvature improved things quite a lot. With my current data set it now takes less than a minute to refresh, though calculating graph metrics is still time-consuming. The culprit is calculating "Twitter search network top items", not the eigenvector or closeness centrality metrics.

BTW, do you have any idea why some settings do not take effect when they are changed in NodeXL? For example, I wanted the vertex label to appear at "Bottom Center" of the vertex shape. I set it using Group Option -> Other -> Label -> Vertex Labels. But whatever I set here has no effect: the labels are still placed at "Middle Center". I could not find what is overriding my choice.

Cheers,
Samatha
Aug 12, 2014 at 3:50 PM
Edited Aug 12, 2014 at 3:50 PM
I'm glad to hear that there has been some improvement, Samantha.

Regarding the labels, I'm wondering if you're changing the correct option. NodeXL has both vertex labels and group box labels, and each has its own options. To reposition vertex labels, you would follow these steps:

1) Go to Graph Options.

2) In the Graph Options dialog box, select the Other tab.

3) On the Other tab, click the Labels button.

4) In the Label Options dialog box, look in the "Vertex labels" section for a drop-down called "Label annotations Position." Set that to "Bottom Center".

To reposition group box labels, you would follow steps 1 through 3, but in step 4 you would use the "Position" drop-down in the "Group box labels" section of the Label Options dialog box.

-- Tony