How to specify the time period for data collection from Twitter using NodeXL?

Apr 5, 2013 at 12:07 AM
Is there a function in the NodeXL importer that lets users collect data from Twitter within a certain time period? For example, if I want to examine "#flu" on Twitter from Jan 1, 2013 to Feb 1, 2013, how can I do this? Thanks a lot!
Apr 5, 2013 at 12:11 AM
Twitter does not allow queries for data that is older than about a week.

You will need to connect to services that sell archival Twitter data (like Topsy, Gnip, DataSift) for data older than about a week.

NodeXL can help you schedule the regular collection of data from Twitter starting now. In a few weeks you can create a useful data set and, if you continue for long periods of time, often capture the impact of external events.
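
Once regular collection has run for a while, restricting the accumulated tweets to a window such as Jan 1 to Feb 1 is a simple filter on the tweet timestamp. The sketch below is not a NodeXL feature; the record layout (text, timestamp) and the function name `in_window` are assumptions made for illustration, and the window is treated as half-open, so Feb 1 itself is excluded.

```python
from datetime import datetime

# Hypothetical records: (tweet text, timestamp) pairs gathered over
# several scheduled collection runs. The layout is an assumption.
collected = [
    ("#flu season is here", datetime(2012, 12, 30, 9, 0)),
    ("Got my #flu shot", datetime(2013, 1, 15, 14, 30)),
    ("#flu cases dropping", datetime(2013, 2, 3, 8, 45)),
]


def in_window(records, start, end):
    """Keep records whose timestamp falls in the half-open range [start, end)."""
    return [record for record in records if start <= record[1] < end]


# Tweets from Jan 1, 2013 up to (but not including) Feb 1, 2013.
january = in_window(collected, datetime(2013, 1, 1), datetime(2013, 2, 1))
```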


Apr 5, 2013 at 5:28 AM
Hi Marc!

Thank you very much for your reply. I have a follow-up question about using NodeXL. Can I conduct statistical analyses of networks in NodeXL (such as the QAP function in UCINET)? I'd like to run some basic network analyses with correlations and would like to know whether NodeXL provides these tools, or whether I need to bring the data into UCINET to run the stats. Thank you very much for your time!


Apr 5, 2013 at 1:02 PM

QAP is not yet implemented in NodeXL.

You may need to start your data analysis in UCINET and export it to NodeXL once QAP metrics have been calculated.

NodeXL does calculate a range of network metrics; see NodeXL > Analysis > Graph Metrics for details.

Apr 5, 2013 at 9:35 PM
Hello Marc!
I see. Thanks again for your reply. I have another question about the Twitter data I collected through NodeXL. Using the Twitter search network, I collected data with "Truvada" as the keyword. As you can see in the file, however, the edges, which represent the relationship between two nodes, look a bit inconsistent to me. In most cases, when the relationship column is labeled "Tweet", it seems to represent a self-loop, meaning the person simply tweeted something on their own timeline. When it is labeled "mention" or "replied to", the person put "@XXXX" in the tweet. What confuses me is that some edges labeled "Tweet" are actually "RT"s of other people's tweets (I can see the original tweets in the "Tweet" column), so why weren't these RTs counted as "mention" or "replied to"? Can you explain how NodeXL categorizes these relationships? I am confused, and this makes it difficult for me to create different networks based on different relationships. Thanks!

Apr 5, 2013 at 11:42 PM
Edited Apr 5, 2013 at 11:54 PM
Hello, Selene:

I can explain the Relationship types in the NodeXL search network. I know it can be pretty confusing.

First, the people in the network are those who have tweeted the search term you entered. If you specify "NodeXL," for example, then NodeXL will ask Twitter for all recent tweets that contain "NodeXL." It will create one (and only one) vertex for each tweeter in that set of tweets, and then analyze the text of the tweets to determine the relationships among the tweeters.

If a tweet starts with the name of someone else in the network, which means that the tweet is a reply to that person, then a "Replies to" edge is created between the tweeter and the replied-to. Here is an example of that:

@zacklangway Identify key people in the NodeXL SNA map, be sure to connect with them

If a tweet includes (but doesn't start with) the name of someone in the network, then a "Mentions" edge is created. Example:

Let's talk about NodeXL with @marc_smith

Otherwise, a "Tweet" edge is created. That's shorthand for "this tweet contains the search term, but it wasn't a Replies-to or Mentions." Example:

Created NodeXL graphs today.

Note that a Replies-to or Mentions edge is created only if the replied-to or mentioned person also tweeted the search term. The following tweet is a reply, but no Replies-to edge is created because johndoe did not tweet "NodeXL" and so he is not part of the network.

@johndoe See NodeXL graph at

-- Tony
Apr 6, 2013 at 4:43 AM
Edited Apr 6, 2013 at 4:57 AM
I should add that as a result of the "edge creation steps" I outlined earlier, a single tweet can result in either:

Exactly one "Tweet" edge

or

Zero or one "Replies to" edges AND zero or more "Mentions" edges.

-- Tony
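
The edge creation steps described in these two posts can be sketched in a few lines of Python. This is an illustration of the rules as stated in this thread, not NodeXL's actual source code; the function name `classify_edges`, the lowercase handling of user names, and the representation of a "Tweet" edge as a self-loop are all assumptions.

```python
import re

# Matches @name tokens; a simplification of how Twitter user names appear.
MENTION = re.compile(r"@(\w+)")


def classify_edges(tweeter, text, network):
    """Return (source, target, relationship) edges for one tweet.

    `network` is the set of users who tweeted the search term.
    Hypothetical helper that follows the rules described in the thread.
    """
    members = {name.lower() for name in network}
    edges = []
    seen = set()
    for match in MENTION.finditer(text):
        name = match.group(1).lower()
        # People outside the network never get an edge.
        if name not in members or name in seen:
            continue
        seen.add(name)
        # A name at the very start of the tweet marks a reply.
        relationship = "Replies to" if match.start() == 0 else "Mentions"
        edges.append((tweeter, name, relationship))
    if not edges:
        # Assumption: a plain "Tweet" edge is a self-loop on the tweeter.
        edges.append((tweeter, tweeter, "Tweet"))
    return edges


network = {"alice", "zacklangway", "marc_smith"}
reply = classify_edges("alice", "@zacklangway Identify key people", network)
mention = classify_edges("alice", "Let's talk about NodeXL with @marc_smith", network)
plain = classify_edges("alice", "Created NodeXL graphs today.", network)
outsider = classify_edges("alice", "@johndoe See NodeXL graph", network)
```

Under these assumptions, the reply to johndoe falls through to a single "Tweet" edge because johndoe is not in the network.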
Apr 6, 2013 at 5:12 AM
Hi Tony!

Thanks for the detailed reply. It helps a lot! I understand that "RT" stands for "retweet". My earlier question was about retweets: it seems to me that NodeXL counts a retweet as a mention, because the following retweet was labeled as "mentions":

RT@selenehu: nodeXL is a good tool.

But this tweet is actually the person's retweet of @selenehu's tweet, not a mention of "selenehu". I am a bit confused because I thought retweets and mentions were different, but NodeXL treats them as identical?

Thank you very much for your clarification!


Apr 6, 2013 at 5:29 AM
The NodeXL Twitter Importer does not create an edge on the basis of the presence of "RT" in the tweet text.

It creates "Followed", "Tweet", "Reply", and "Mention" edges.

An "RT" is often a "Mention" (but not all "Mentions" are RTs).
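
If retweets need to be separated from other mentions after import, one option is to inspect the tweet text yourself. The sketch below is not part of NodeXL; the function name `retweeted_user` and the pattern are assumptions, and it only recognizes the old-style convention of a tweet that begins with "RT" followed by "@name".

```python
import re

# Assumed convention: a retweet starts with "RT" and then "@name",
# with or without a space in between (e.g. "RT@selenehu: ...").
RT_PATTERN = re.compile(r"^\s*RT\s*@(\w+)", re.IGNORECASE)


def retweeted_user(text):
    """Return the retweeted user name, or None if the tweet is not a leading RT."""
    match = RT_PATTERN.match(text)
    return match.group(1) if match else None


example = retweeted_user("RT@selenehu: nodeXL is a good tool.")
```

An "RT @name" that appears in the middle of a tweet, after added commentary, is deliberately not detected here.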

Apr 6, 2013 at 5:40 AM
I got it! That's helpful! Thank you for your clarification, Marc!


Apr 15, 2013 at 9:19 PM
Hi Marc,

I have one additional question about the data collected through NodeXL. Can I combine data that were collected on different dates? For example, I collected data on Twitter on three different days, so I have three different datasets (though some nodes may overlap across the three). Is there a function in NodeXL that I can use to combine the three datasets into one spreadsheet? I greatly appreciate your help! Thank you!

Best regards,

Apr 15, 2013 at 9:26 PM
Edited Apr 15, 2013 at 9:26 PM

NodeXL does not have an automated merge function because not all networks can be concatenated sensibly.

You may find this thread useful; it covers a similar task:

Short answer: you can append edge lists manually by copying and pasting them to the bottom of the Edges worksheet.

Caveat: this can be error-prone if you duplicate edges. Also, vertex attributes for each time slice must be adjudicated: which values will be the attributes for that vertex?
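
For anyone doing the merge outside Excel, the two caveats can be handled with a short script. This is a sketch under stated assumptions, not a NodeXL function: edges are assumed to be exported as tuples of (vertex 1, vertex 2, relationship, tweet text), vertex attributes as dictionaries, and the function names are hypothetical. "Later collection wins" is only one possible rule for adjudicating attributes.

```python
def merge_edge_lists(*edge_lists):
    """Concatenate edge lists in order, dropping exact duplicate rows."""
    seen = set()
    merged = []
    for edges in edge_lists:
        for edge in edges:
            if edge not in seen:
                seen.add(edge)
                merged.append(edge)
    return merged


def merge_vertex_attributes(*attribute_maps):
    """Combine per-vertex attributes; values from later maps override earlier ones."""
    merged = {}
    for attributes in attribute_maps:
        for vertex, values in attributes.items():
            merged.setdefault(vertex, {}).update(values)
    return merged


# Hypothetical exports from two collection days.
day1_edges = [("a", "b", "Mentions", "hi @b"), ("a", "a", "Tweet", "hello")]
day2_edges = [("a", "b", "Mentions", "hi @b"), ("c", "a", "Replies to", "@a yo")]
all_edges = merge_edge_lists(day1_edges, day2_edges)

day1_vertices = {"a": {"followers": 10}}
day2_vertices = {"a": {"followers": 12}, "c": {"followers": 5}}
all_vertices = merge_vertex_attributes(day1_vertices, day2_vertices)
```

Treating identical rows as duplicates also collapses two distinct tweets with the same author, target, and text, so including a tweet ID or timestamp in each tuple would be safer if the export provides one.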