Importing raw data to NodeXL

Oct 25, 2011 at 11:55 PM

Hi,

I am totally new to this, and I did a fair amount of searching to see whether anyone else has brought up this topic. My question is: imagine I have a Java program that generates a large graph in terms of vertices and edges. Obviously, manual data entry in a NodeXL workbook is not an option. What is the easiest way to import this raw data into NodeXL? I considered automatically generating a .csv file, but it does not seem that NodeXL imports such files. What I need is a data format that I can use to create a NodeXL-compliant data file directly from the Java program. Is this possible?

Coordinator
Oct 26, 2011 at 1:12 AM

NodeXL imports from CSV, GraphML, open Excel workbooks, and directly from various social media network data sources including Twitter, Flickr, Facebook, YouTube, email, and the WWW.

Your Java code could write CSV edge lists which NodeXL could then import.
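
As a rough illustration of the CSV route, here is a minimal Java sketch that writes an edge list with one edge per line. The class name, the sample vertex names, and the "edges.csv" path are all made up for the example. It quotes any field containing a comma, a double quote, or a line break (the usual CSV convention), on the assumption that whatever reads the file honors quoted fields.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class CsvEdgeList {

    /** Quotes a field when it contains a comma, a double quote, or a line break. */
    static String quote(String field) {
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        // Embedded double quotes are doubled, then the whole field is wrapped in quotes.
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    /** Renders each {vertex1, vertex2} pair as one CSV line. */
    static String toCsv(String[][] edges) {
        StringBuilder csv = new StringBuilder();
        for (String[] edge : edges) {
            csv.append(quote(edge[0])).append(',').append(quote(edge[1])).append("\r\n");
        }
        return csv.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data; a real program would pass its own edges.
        String[][] edges = {
            {"Alice", "Bob"},
            {"Smith, John", "Bob"},   // a vertex name containing a comma
        };
        // "edges.csv" is a placeholder output path.
        Files.write(Paths.get("edges.csv"), toCsv(edges).getBytes(StandardCharsets.UTF_8));
    }
}
```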

-

Marc

Coordinator
Oct 26, 2011 at 1:39 AM
Edited Oct 26, 2011 at 2:00 AM

Here are a few options, in order of complexity.

1. Have your Java program write a tab-delimited file.  (I wouldn't use CSV unless you can guarantee that a vertex name will never include a comma.)  NodeXL can't read such a file directly, but you can import it into a standard Excel workbook using Excel Button, Open, and filter on Text Files.  Once you get the file into a standard Excel workbook, you can use NodeXL, Data, Import, From Open Workbook to copy the data from the standard Excel workbook to a NodeXL workbook.  It requires two steps, but if this is just an occasional task, then this is what I would do.

2. Have your Java program write a GraphML file (http://graphml.graphdrawing.org/primer/graphml-primer.html), then import it directly into a NodeXL workbook using NodeXL, Data, Import, From GraphML File.  This requires just one step, and GraphML supports edge and vertex attributes.

3. Implement a "graph data provider," which is a custom plug-in DLL that appears in NodeXL's Import menu.  See http://nodexl.codeplex.com/discussions/71182 for details.  You would have to use one of the Java-.NET bridge technologies if you wanted to do this in Java.  (I've never done that, so I don't know how hard it is.  It's easy in C# or Visual Basic .NET.)
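
For option 1, the Java side could look something like the sketch below. The class name, sample edges, and "edges.txt" path are invented for illustration, and the "Vertex 1" / "Vertex 2" header row is an assumption: drop it if you would rather import the columns without headers. Vertex names with commas are fine here, but a name containing a tab or line break would corrupt the layout, so the sketch rejects those.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class TabDelimitedEdgeList {

    /** Rejects names that would break the one-edge-per-line, tab-separated layout. */
    static String checkName(String name) {
        if (name.indexOf('\t') >= 0 || name.indexOf('\n') >= 0 || name.indexOf('\r') >= 0) {
            throw new IllegalArgumentException("Vertex name contains a tab or line break: " + name);
        }
        return name;
    }

    /** Renders a header row followed by one {vertex1, vertex2} pair per line. */
    static String toTabDelimited(String[][] edges) {
        // The header row is an assumption; remove it if it is not wanted.
        StringBuilder text = new StringBuilder("Vertex 1\tVertex 2\r\n");
        for (String[] edge : edges) {
            text.append(checkName(edge[0])).append('\t')
                .append(checkName(edge[1])).append("\r\n");
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data.
        String[][] edges = {
            {"Alice", "Bob"},
            {"Smith, John", "Carol"},   // commas are harmless in a tab-delimited file
        };
        // "edges.txt" is a placeholder path.
        Files.write(Paths.get("edges.txt"),
                toTabDelimited(edges).getBytes(StandardCharsets.UTF_8));
    }
}
```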

-- Tony

Oct 26, 2011 at 1:41 AM

Marc,

Thank you for your quick reply.

I have two related questions about using csv when importing to NodeXL:

(1) When I click on the Import drop-down menu, I cannot locate any menu item that allows importing the csv format, or at least it is not clear from the names of the menu choices. Could you show me which of them, if any, can be used for importing a csv file?

(2) When using csv, is there any way to also specify other graph properties, such as edge labels, vertex/edge colors, etc.? If so, is there documentation on what format to use to specify such properties?

 

Best,

K-

Oct 26, 2011 at 1:44 AM

Tony,

Thank you for your reply. I received your note right after I submitted my reply to Marc. I will try out what you described.

Regards,

K-

Coordinator
Oct 26, 2011 at 1:59 AM

K:

If you go the tab-delimited route (or the CSV route, if you enjoy courting trouble), it is indeed possible to import edge attributes when using NodeXL's Import From Open Workbook command.  It's also possible to import vertex attributes when you go that route, but it's convoluted and difficult to understand.  If you have both edge and vertex attributes, GraphML is the simpler, more elegant solution.
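
To give a feel for the GraphML route with both kinds of attributes, here is a minimal Java sketch that builds the XML by hand, following the key/data structure described in the GraphML primer. Everything specific in it is an assumption for illustration: the class name, the key ids, the "Color" and "Label" attribute names, the sample data, and the "graph.graphml" path. How attribute names end up as worksheet columns after import is something you would need to check for yourself. Vertex names are used directly as node ids, so the sample sticks to simple names; the GraphML schema restricts which characters an id may contain.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class GraphMlSketch {

    /** Escapes the characters that are special in XML text and attribute values. */
    static String esc(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** vertices: rows of {name, color}; edges: rows of {source, target, label}. */
    static String toGraphMl(String[][] vertices, String[][] edges) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        // Each <key> declares one attribute; <data key="..."> elements refer to its id.
        xml.append("  <key id=\"v_color\" for=\"node\" attr.name=\"Color\" attr.type=\"string\"/>\n");
        xml.append("  <key id=\"e_label\" for=\"edge\" attr.name=\"Label\" attr.type=\"string\"/>\n");
        xml.append("  <graph id=\"G\" edgedefault=\"undirected\">\n");
        for (String[] v : vertices) {
            xml.append("    <node id=\"").append(esc(v[0])).append("\">")
               .append("<data key=\"v_color\">").append(esc(v[1])).append("</data></node>\n");
        }
        for (String[] e : edges) {
            xml.append("    <edge source=\"").append(esc(e[0]))
               .append("\" target=\"").append(esc(e[1])).append("\">")
               .append("<data key=\"e_label\">").append(esc(e[2])).append("</data></edge>\n");
        }
        xml.append("  </graph>\n</graphml>\n");
        return xml.toString();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data.
        String[][] vertices = {{"Alice", "Red"}, {"Bob", "Blue"}};
        String[][] edges = {{"Alice", "Bob", "knows"}};
        // "graph.graphml" is a placeholder path.
        Files.write(Paths.get("graph.graphml"),
                toGraphMl(vertices, edges).getBytes(StandardCharsets.UTF_8));
    }
}
```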

-- Tony

Nov 16, 2012 at 12:36 PM

Hi Tony,

I am a complete beginner with NodeXL. I am generating an Excel sheet from my C# code, and I want to draw a graph from that sheet. How can I pass that Excel sheet to NodeXL programmatically? Can you please explain?

Coordinator
Nov 16, 2012 at 5:01 PM
Edited Nov 16, 2012 at 5:02 PM

Because you are using C#, as opposed to Java, the language the original poster was using, you have another option available to you.  You can use the Excel object model to create a new NodeXL workbook from the NodeXLGraph.xltx template file that gets installed with NodeXL, then populate the Vertex 1 and Vertex 2 columns on the Edges worksheet in the new NodeXL workbook.

There are many resources available that explain how to use the Excel object model to manipulate Excel workbooks.  Here is just one example:

    http://www.codeproject.com/Articles/20228/Using-C-to-Create-an-Excel-Document

And here is the Microsoft documentation for the object model:

    http://msdn.microsoft.com/en-us/library/vstudio/wss56bz7(v=vs.100).aspx

One important point is that you won't be using Application.Workbooks.Add(), without an argument, to create a NodeXL workbook.  Application.Workbooks.Add() creates a plain, empty Excel workbook, which isn't what you want.  Instead you should use the version of the Add() method that takes a template path:

    // Verbatim string (@) so that the backslash is not treated as an escape character.
    String nodeXLTemplatePath = @"SomeFolder\NodeXLGraph.xltx";
    Workbook newNodeXLWorkbook = Application.Workbooks.Add(nodeXLTemplatePath);

That creates a NodeXL workbook from the NodeXL template file.

You'll find the NodeXLGraph.xltx file in NodeXL's program folder.  The program folder's location varies, but on English 64-bit computers it's at "C:\Program Files (x86)\Social Media Research Foundation\NodeXL Excel Template".

Note that you must have NodeXL installed on the computer on which you are running your own custom C# program, or this technique will not work.  If you want your program to be able to run on any computer, you will have to revert to one of the solutions I offered earlier in this discussion.

-- Tony

Apr 15, 2013 at 12:31 PM

Hello,

I have a similar problem. My program is generating a list of edges in the network which I would like to analyse using NodeXL automatically. But my code is in Java. Is there a way I can give my .csv file to NodeXL programmatically or should I write it in C# only?

Coordinator
Apr 15, 2013 at 8:52 PM

Please clarify what you mean by "analyse using NodeXL automatically." I understand that you have a Java program that creates an edge list. If you could add a button to your program that would do something automatically, what exactly would happen if you clicked the button?

-- Tony

Apr 16, 2013 at 6:09 AM

Hi Tony

Sorry if I'm being unclear. I have Java code that generates a .csv file containing an edge list. I have manually imported this file into NodeXL using the Import From Open Workbook option and calculated the graph metrics and the groups. My question is: can this import process be done programmatically instead of manually? (As I understand from your previous post, this can be done using C#. Is it possible with Java?)

Coordinator
Apr 16, 2013 at 7:00 AM

Thanks, now I understand what you want to do.

I don't know if there is a way to access the Excel object model from Java. When I did a search for "Excel API Java," all I came up with was something called "JExcelAPI", at http://sourceforge.net/projects/jexcelapi/. Unfortunately, that's an older project that supports only Excel 2003 and earlier, so it won't be of much use for Excel 2007. NodeXL uses the workbook format introduced by Excel 2007.

That doesn't mean there isn't something more recent out there; I just haven't found it.

If it ends up that Java won't work, then C# or Visual Basic .NET might be the way to go.

-- Tony

Apr 16, 2013 at 11:44 AM
Edited Apr 16, 2013 at 11:45 AM

Thanks Tony for your help. Since I'm very new to C#, Java would be preferable. Thanks again for your time.