NodeXL doesn't recognize data

Apr 5, 2013 at 4:04 AM
Hello, I am trying to combine 7 databases into a single one, but I have had problems with two of them. Once they are added, the program simply ignores their nodes and relations: although they appear in the Excel rows, they are not shown in the graph or included in the calculations. Any help would be useful. Thanks in advance.
Coordinator
Apr 5, 2013 at 4:10 AM
Hello!

This may be hard to diagnose based on the information you have provided.

Some thoughts that may help: are the added edges duplicates of prior edges? Are there any blank rows in the edges worksheet?

What is the "Visibility" of the rows that are being skipped (you may need to select NodeXL>Show/Hide>Workbook Columns>Visual Properties to display this column)? Is it "1"?

How are you concatenating these data sets?

With some exploration I bet we can uncover the issue!
Regards,
Marc
Apr 5, 2013 at 5:04 AM
Xanat:

If you are using a simple copy/paste to merge rows from one NodeXL workbook into another, you may be running into a known Excel bug that prevents the copy/paste from working properly. Here is how you can work around that bug:
  1. Select and copy the data from the first workbook.
  2. Click the cell in the second workbook where you want to paste the data. DO NOT PASTE THE DATA.
  3. In the Excel ribbon, select Home, Clipboard, Paste, Paste Special.
  4. In the Paste Special dialog box, click Values and click OK.
  5. In the Excel ribbon, select Home, Clipboard, Paste, Paste Special.
  6. In the Paste Special dialog box, click Formats and click OK.
There is a long discussion about this problem at http://nodexl.codeplex.com/discussions/242896, if you are interested. In that discussion, I mentioned that I don't recommend attempting to merge vertex rows this way unless you are sure that no duplicate vertices will result.
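
One way to check for duplicate vertices before pasting is sketched below. It assumes you have copied the vertex names from each workbook's Vertices worksheet into two plain lists. The helper name `shared_vertices` is made up for illustration, and comparing names after trimming whitespace only is an assumption about what should count as a duplicate.

```python
def shared_vertices(vertices_a, vertices_b):
    """Return the vertex names that appear in both lists.

    Names are compared after trimming surrounding whitespace only;
    blank entries are ignored. (Assumption: exact, case-sensitive match.)
    """
    names_a = {name.strip() for name in vertices_a if name.strip()}
    names_b = {name.strip() for name in vertices_b if name.strip()}
    # Any name in the intersection would become a duplicate vertex row
    # if the second list were pasted below the first.
    return sorted(names_a & names_b)


if __name__ == "__main__":
    first = ["alice", "bob ", ""]
    second = ["bob", "carol"]
    print(shared_vertices(first, second))
```

If the result is an empty list, pasting the second set of vertex rows should not introduce duplicates under that matching rule.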

-- Tony
Apr 6, 2013 at 5:24 AM
Edited Apr 6, 2013 at 5:25 AM
MarcSmith, thanks for your response! Some of the edges are duplicates. It does not matter whether the "Visibility" column for these edges is blank or set to Skip; they simply do not show up. I experimented with putting the value 1 in one of the problematic edges, and the program showed an error: [COMException] from HRESULT: 0x800aC472.
Apr 6, 2013 at 5:24 AM
tcap479, many of the rows contain duplicates.
Apr 6, 2013 at 5:55 AM
Xanat:

I'm afraid I don't understand your latest post. If I understood your original question, you merged two workbooks and found that NodeXL ignored some of the rows in the combined workbook. I know one way that can happen--namely, if you do the merging with a simple copy/paste, Excel does the paste part incorrectly and prevents NodeXL from "seeing" the pasted rows. It's a known bug in Excel.

Did you try my alternative copy/paste technique, and did that fix the problem? I can't tell if your original problem is now fixed and you want to discuss the duplicates, or if some of the rows are still being ignored.

-- Tony
Apr 6, 2013 at 6:03 AM
Edited Apr 6, 2013 at 6:03 AM
Hi, Tony.

The thing is, if I do what you suggest, only the columns containing the vertex name and the first two columns with edges are pasted correctly. To get all the information I need (Followers, Tweet time, etc.), should I first do the steps you suggested and then copy and paste the rest of the columns?

Thanks in advance.
Apr 6, 2013 at 9:21 PM
Hello, Xanat:

Actually, I don't recommend trying to do what you're trying to do, not at all. If the columns in your workbooks are identical, and you can figure out what you need to paste where, and you can do it in such a way that you don't trigger an ugly bug in Excel that makes it look like the paste operation was successful when it really wasn't, and there are no duplicate vertices in the workbooks, or you understand the consequences of having duplicate vertices, then you might get this to work. But multiply all that by seven workbooks, and I'd say the chances of getting everything to work properly are slim to none.

NodeXL does not provide a solution for merging complex data sets. The only thing I can suggest is to export the data from the workbooks, merge them using other software (preferably something that can deal intelligently with duplicates), and then import the merged results into a new NodeXL workbook.
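
As a rough sketch of the "merge using other software" step, something like the following Python could combine edge lists exported to CSV and drop exact duplicates. This is only an illustration, not a NodeXL feature: the column names ("Vertex 1", "Vertex 2", "Tweet") are assumptions about what the exported edges contain, and the helpers `read_csv_text` and `merge_edge_rows` are hypothetical names.

```python
import csv
import io


def read_csv_text(text):
    """Parse CSV text into a list of dict rows keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_edge_rows(tables, key_columns=("Vertex 1", "Vertex 2", "Tweet")):
    """Concatenate rows from several tables, keeping the first copy of each duplicate.

    Two rows count as duplicates when all key_columns match after trimming
    whitespace. The default key_columns are assumed column names.
    """
    seen = set()
    merged = []
    for rows in tables:
        for row in rows:
            key = tuple((row.get(col) or "").strip() for col in key_columns)
            if key in seen:
                continue  # same edge with the same tweet was already kept
            seen.add(key)
            merged.append(row)
    return merged


if __name__ == "__main__":
    day1 = read_csv_text("Vertex 1,Vertex 2,Tweet\nann,bob,hi\nann,cat,yo\n")
    day2 = read_csv_text("Vertex 1,Vertex 2,Tweet\nann,bob,hi\nbob,cat,hey\n")
    for row in merge_edge_rows([day1, day2]):
        print(row["Vertex 1"], row["Vertex 2"], row["Tweet"])
```

For real files, the same function can be fed rows from `csv.DictReader(open(path, newline="", encoding="utf-8"))`, and the merged rows can be written out with `csv.DictWriter` before importing them into a new workbook.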

-- Tony
Apr 8, 2013 at 3:00 AM
Hi, Tony.

Which software do you recommend to do the trick?

Regards.
Apr 8, 2013 at 6:02 AM
So, in the future, what would you suggest? Opening an existing dataset and having the program collect the data into the same spreadsheet on successive days?
Apr 8, 2013 at 6:06 AM
I was able to append all of them! What I did was copy and paste the columns I needed, ignoring the ID column. Every time I added a new database, I erased all duplicate vertices and any edges with exactly the same tweets (no need for duplicate tweets here), and saved the result under a different name. The calculations also look right. Thanks for the help!