Twitter Search - not extracting all mentions from tweets

Jan 4, 2013 at 1:24 PM
Edited Jan 4, 2013 at 1:26 PM

When doing a Twitter search, I've noticed that NodeXL does not extract all mentions and instead classifies some tweets that mention another user as self-loops. Each such tweet appears as a node with an edge pointing back to itself, when the edge should really point to another node on the graph.

In this sample workbook, I've highlighted the rows where an edge appears as a self-loop when it shouldn't. I've created a basic macro to run in a separate workbook to extract all replies and mentions, but then I lose all the excellent metadata.

I hope this is something that is easily remedied, and not just the way NodeXL retrieves the data from Twitter.




Jan 4, 2013 at 3:34 PM
Edited Jan 4, 2013 at 3:35 PM


NodeXL doesn't add an edge for each mention.  It only adds an edge if there is a vertex for the mentioned person; in other words, if the mentioned person also tweeted the search term.  This is by design.

The search network is created in two steps:

1. Add a vertex for each person whose recent Tweet contains the search term.

2. Add edges for the relationships that exist among those people.
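
To make the two steps concrete, here is a minimal Python sketch. It is not NodeXL's actual code; the function name `build_search_network`, the `(author, text)` tweet format, and the `@name` regular expression are all assumptions made for illustration. It shows why a tweet whose mentioned users did not tweet the search term ends up as a self-loop.

```python
import re

# Hypothetical, simplified pattern for @mentions (an assumption, not Twitter's rules).
MENTION = re.compile(r"@(\w+)")

def build_search_network(tweets):
    """tweets: list of (author, text) pairs that matched the search term."""
    # Step 1: one vertex per person whose tweet contains the search term.
    vertices = {author.lower() for author, _ in tweets}

    # Step 2: add edges only among the people who are already vertices.
    edges = []
    for author, text in tweets:
        source = author.lower()
        mentioned = [name.lower() for name in MENTION.findall(text)]
        kept = [t for t in mentioned if t in vertices and t != source]
        if kept:
            edges.extend((source, target) for target in kept)
        else:
            # Nobody mentioned is a vertex, so the tweet is kept as a self-loop.
            edges.append((source, source))
    return vertices, edges
```

For example, if "carol" mentions "@dave" but "dave" never tweeted the search term, this sketch produces the edge `("carol", "carol")` rather than an edge to "dave".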

-- Tony

Jan 4, 2013 at 4:31 PM

Thanks for the response, Tony.

I understand how the search network is created. However, it seems like that approach leaves a lot of data underutilized. Most of the tweets that end up as self-loops mention another user, but the mentioned user just isn't captured within the 1,500-user maximum. Instead, why not add a vertex for everyone who either tweeted the search term or is mentioned or replied to in a tweet containing the search term, and then draw the edges between those people?
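
Roughly, the alternative I have in mind could look like the following Python sketch. This is only an illustration of the idea, not anything NodeXL does; the function name `build_extended_network`, the `(author, text)` tweet format, the `@name` pattern, and the role labels are all hypothetical.

```python
import re

# Hypothetical, simplified pattern for @mentions (an assumption, not Twitter's rules).
MENTION = re.compile(r"@(\w+)")

def build_extended_network(tweets):
    """tweets: list of (author, text) pairs that matched the search term."""
    # Everyone who tweeted the search term becomes a vertex.
    roles = {author.lower(): "tweeted" for author, _ in tweets}

    edges = []
    for author, text in tweets:
        source = author.lower()
        targets = [n.lower() for n in MENTION.findall(text) if n.lower() != source]
        for target in targets:
            # Mentioned users become vertices too, tagged so they stay distinguishable.
            roles.setdefault(target, "mentioned_only")
            edges.append((source, target))
        if not targets:
            # A tweet that mentions nobody else is still kept as a self-loop.
            edges.append((source, source))
    return roles, edges
```

With this approach, a tweet from "carol" that mentions "@dave" yields the edge `("carol", "dave")`, and "dave" is added as a vertex with the role `"mentioned_only"`.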

Jan 4, 2013 at 11:52 PM
Edited Jan 5, 2013 at 6:23 PM


I can see how that might be useful in some cases.  However, it would complicate things considerably, and because the NodeXL team has limited resources, we generally strive for simple solutions.  I think the current design has proven useful for many people, but unfortunately it doesn't cover all use cases, including yours.

Here are some of the complications that would arise if we extended the search network in the way you're suggesting:

1. The simple concept of "specify your vertices, then specify how to connect them with edges" would be replaced with the cumbersome "specify some vertices, then specify some edge types, which might add more vertices, and those additional vertices would represent something different from the vertices you originally specified".  Alarm bells usually go off in my head when it takes that many words to describe a feature.

2. For consistency, shouldn't the same thing be done for "follows" relationships?  Namely, if you check "follows relationships," then shouldn't the people who follow the people who tweeted the search term also be included as vertices, even if those followers didn't tweet the search term themselves?  If so, then the graph has now gotten completely out of control.

3. If you check "mentions" and NodeXL needs to include all mentioned people as vertices, it has to make additional calls to Twitter to get information about those people, because that information isn't included in Twitter's search results.  That will slow things down, and it will slow things way down if it causes you to start running into Twitter's rate limits.  When that occurs, NodeXL has to pause for an hour before Twitter will provide additional information.

4. I'm not convinced the resulting graph would be informative.  How would you tell the difference between people who tweeted the search term and those who were merely mentioned?  (Maybe that's not important; I don't know.)
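
To give a feel for the cost described in point 3, here is a rough back-of-the-envelope sketch in Python. The function name `estimate_pauses` and its parameters are hypothetical, and the numbers passed in are placeholders rather than Twitter's actual limits; the only thing taken from the discussion above is that hitting the limit means a one-hour pause.

```python
import math

def estimate_pauses(extra_users, users_per_call, calls_per_window):
    """Estimate the extra API calls and one-hour pauses needed to look up
    mentioned users who are not in the search results.

    All three arguments are assumptions supplied by the caller.
    """
    calls = math.ceil(extra_users / users_per_call)
    # The first window of calls needs no pause; each later window costs one.
    windows = math.ceil(calls / calls_per_window)
    pauses = max(0, windows - 1)
    return calls, pauses

# Placeholder numbers only: 1000 extra users, 1 user per call, 150 calls per window.
calls, pauses = estimate_pauses(1000, 1, 150)
print(f"{calls} extra calls, about {pauses} one-hour pauses")
```

Under those placeholder numbers the import would need six one-hour pauses, which illustrates how quickly the extra lookups could dominate the run time.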

-- Tony