Dynamically feeding NodeXL with data from a Google Spreadsheet

Sep 13, 2011 at 12:36 AM

Previously I've posted Getting twitter edge data from Google Social Graph API using Google Spreadsheets which is then imported into a NodeXL sheet for analysis. I'm interested to hear what the best way is to create a dynamic link so that when a Google Spreadsheet is updated that data is automatically filtered through to NodeXL. It's possible to publish a Google Spreadsheet as a live html page (here's an example) and then link to this in Excel using the webpage option in Get External Data. The problem is that the two Edges columns in Google Spreadsheet don't start in the excel sheet until B3 and its not easy to get this to fill the Edges sheet in Excel.

I'm considering going down the IGraphDataProvider Interface route but I'm a hobbyist programmer at best and more familiar with Javascript/PHP.

Any suggestions on the best way forward? 



Sep 13, 2011 at 7:06 AM
Edited Sep 14, 2011 at 5:15 AM

Hi, Martin:

Excel’s Get External Data From Web feature is powerful, but it does have its limits. One of them is that it won’t put the data into an existing table, and NodeXL’s edge list is actually a table. You’ve probably already discovered that if you try getting data from the Google Spreadsheet into the edge table, Excel will tell you that “Query results cannot overlap a table or XML mapping.”

Here are some possible workarounds:

1. Have Excel put the data in a plain Excel workbook, then use the NodeXL, Data, Import, From Open Workbook command to copy the data from the plain workbook to the NodeXL workbook. The disadvantage here is that a second step is required.

2. Create a custom NodeXL graph data provider that fetches the data from the Google Spreadsheet, parses it and formats it into GraphML. This can be done in any .NET language, including Visual Basic .NET. It’s pretty easy in .NET to fetch a Web page and parse one of its tables. For more information, see “For Programmers: About NodeXL Graph Data Providers” at http://nodexl.codeplex.com/discussions/71182.

3. Write a VBA macro that fetches the data from the Google Spreadsheet and inserts it into NodeXL’s edge table.

Unfortunately, none of these solutions provides automatic periodic refreshing.  If you went with 3, you might be able to run the macro on a timer's tick event.

Let us know what you end up doing.

-- Tony