how to merge several Nodexl files together

Jan 23, 2011 at 5:42 AM

Hi.

I make several youtube nodexl files by import youtube and try to merge them into one file. However, when I copy all rows from 1 file to another file, Nodexl prompts a warning message to rename to column name. I rename it and the results seem to be incorrect and some functions in "dynamic filter" like adjustment of "views" are disable. I think I merge the nodexl files incorrectly.

Can anyone tell how to merge Nodexl files in a correct way or some to merge the nodexl files automatically?

Jan 23, 2011 at 6:19 AM

Dear all,

I find the error message is "A formula or sheet you want to move or copy contains" the name 'ValidVertexLablePositions', which already exists on the destination worksheet. Do you want to use this version of the name?"

Can anyone help me to solve this problem?

Jan 23, 2011 at 4:30 PM

Excel does some odd things when you try to copy table data from one workbook to another using a simple copy-and-paste.  I can't quite figure out what it's doing.

But I do know of two ways to get things to work the way you want them to work.  If you don't mind importing the YouTube networks again, here is the simple way:

1. Import the first YouTube network.

2. Uncheck this checkbox in the Excel ribbon: NodeXL, Data, Import, Clear NodeXL Workbook First.

3. Import the second YouTube network.

Because you chose not to clear the NodeXL workbook before importing the second network, the second network will get added to the first network instead of replacing it.

If you can't import the YouTube networks again, here is one way to copy-and-paste data from one workbook to another.  It requires a few more steps than usual.

1. Select and copy the data from the first workbook.

2. Click the cell in the second workbook where you want to paste the data.  DO NOT PASTE THE DATA.

3. In the Excel ribbon, select Home, Clipboard, Paste, Paste Special.

4. In the Paste Special dialog box, click Values and click OK.

5. In the Excel ribbon, select Home, Clipboard, Paste, Paste Special.

6. In the Paste Special dialog box, click Formats and click OK.

-- Tony

 

Coordinator
Jan 23, 2011 at 4:56 PM
Edited Jan 23, 2011 at 4:57 PM

Another path to append two existing workbooks is to follow Tony's first suggestion to uncheck the checkbox in: NodeXL, Data, Import, Clear NodeXL Workbook First.

Then, open both the target and source workbooks in the same Excel session.

Select the target workbook and select NodeXL, Data, Import, Import from Open Workbook.

This should append your source data to the target workbook.

-
Marc

May 31, 2012 at 1:53 PM

Ciao to all of you,

I am producing an hourly map for an online campaign linked to the earthquake that shook us all here in Italy.

Output files are scheduled to be produced in graphm and then converted to nodexl workbooks.

I am trying to merge several hourly snapshpts into 1 file so i open all the workbooks and then import them through:

NodeXL, Data, Import, Import from Open Workbook

I am just not sure about the result: it does not seem to have merged the two workbook i was trying to use. I see new columns labeled as "imported column 1, 2,...n" being added...any idea what i am doing wrong?

If I do the copy and paste instead i see my workbook increasing in size and all info under the correct column. It seems correct but i have almost 1500 ties for every graphml file so it is hard for me to be sure 100% that the operation performed. however, the number of rows corresponds to the sum of the 2 separate workbooks.  Assuming this operation went fine, i have 2 questions:

  1. if i get vertices from the edge through prepare data, the number of vertices does not change. is there something wrong i am doing or does it mean all nodes in time 2 were already there in time 1 (where time 2= time 1 + 1 hour)
  2. to avoid duplicate tweets and relations, can i lean on the imported code at the end of the workbook? is that number the unique tweet identifier? if so, i should look for duplicates and, therefore, be able to clean up my data

thanks a lot

best

elena

May 31, 2012 at 2:22 PM

Hi all,

after several attempts, i concluded with myself that perhaps the best way to append different graphml files is to import them in sequence without cleaning the workbook.

however, it would be precious if you could help me out with the issue of checking for duplicate relations based on the same tweet to clean up data :)

 

thanks 

elena

May 31, 2012 at 5:37 PM

Elena:

I assume that your problem can be summarized like this: You have two or more rows in the Edges worksheet that came from the same tweet, and you want to get rid of all but one of them.  I also assume that you are using a recent version of NodeXL.  (Earlier versions don't have the "Imported ID" column mentioned below.)

One way to do this is as follows:

1. Select a cell in the Edges worksheet.

2. In the ribbon, go to Data, Data Tools, Remove Duplicates.

3. In the Remove Duplicates dialog box, unselect all columns except "Imported ID."

4. Click OK.

Each tweet has a unique "Imported ID," so these steps will remove all but one of the rows that came from the same tweet.

As a side effect, these steps will also remove all but one of the rows that is a "Followed" relationship, because all such rows share the same Imported ID, which is an empty Imported ID.  If you have Followed rows that you need to keep, there are additional steps I can give you that should fix that.

-- Tony

Jun 13, 2012 at 7:49 PM

Dear Tony/Mark, 

I followed your advice in merging four twitter files. 

I first followed Mark's advice by 

1) uncheck the checkbox in: NodeXL, Data, Import, Clear NodeXL Workbook

2) open both the target and source workbooks in the same Excel session, and 

3) Select the target workbook and select NodeXL, Data, Import, Import from Open Workbook

while I could merge vertex 1 and vertex 2, values in other columns were lost.  I also tried by clicking "is vertex 1 the property column", which failed too in merging the rest of the columns. 

I then tried Tony's 2nd advice - copy and paste special. Tony suggested "4. In the Paste Special dialog box, click Values and click OK". I noticed that if I only clicked "values", some of the time format would be lost. So that I clicked "values and format".  While I  was able to merge all the 4 files in one, NodeXL refused to analyze the newly appended ties.  When asked to do filter or visualization, it only worked with the original ties before the merge.  

Could you please help me figure this out as I hope to keep the rest of the columns with tweets and other info? 

Also, when merging, do you also manually cut and paste not only the edge sheet but also the vertex sheet? 

many thanks - 

wendy 

Jun 14, 2012 at 12:59 AM
Edited Jun 26, 2012 at 3:56 PM

Wendy:

The Import from Open Workbook feature will not import columns from the Vertices worksheet.  It imports vertex columns that are included in an edge list, but that's a different story and it's not what you need.

When I said that Excel does some odd things when you try to copy table data from one workbook to another, I really meant it.  I hadn't tried the "values and number formats" option in the Paste Special dialog box until I read your post, and now I see that that particular option silently corrupts the table that you paste the data into.  That explains why NodeXL won't do anything with the appended data.  I don't know why Excel does this, but it's out of NodeXL's control.

The "values" option does work, although as you noted, some formatting gets lost.  That option is the best solution I can offer.  And yes, if you want to retain vertex attributes, you have to copy and paste both the Edges and the Vertices worksheets.

I don't actually recommend any of this business of appending one workbook to another, because the meaning of such an action is not at all clear and the results are unpredictable.  What if you append vertices to the Vertices worksheet and you end up with two or more rows for the same vertex?  Which row would you expect NodeXL to use?

-- Tony