Importing data from Twitter: NodeXL should give up to 18,000 tweets or 7-8 days of data, whichever comes first.

Jan 13 at 3:53 PM
Hi

I am trying to collect data from the Twitter network using a certain keyword. The problems are:

1) I keyed in a search term, e.g. "amazon lang:en", and got approximately 2,300 tweets. I saved the data and ran NodeXL again with the same search term ("amazon lang:en"), but it only gave me around 228 tweets. I thought it was supposed to give me the same number of tweets or more. Isn't that right?

2) In those 2,300 tweets, I can see from the tweet date column that they were all from, say, 13/01/2016 14:14 to 13/01/2016 15:15. I'm confused about why the program didn't give me the complete data, which here means up to 18,000 tweets or 7-8 days of data, whichever comes first.

May I know how to fix this?
I'm using basic NodeXL.

Thanks in advance.
Coordinator
Jan 13 at 6:30 PM
Hello!

Thank you for the interest in NodeXL!

NodeXL Basic provides limited access to the Twitter API. Only 2000 tweets can be extracted from Twitter in each query using NodeXL Basic.

Twitter limits all data access through its API.

Twitter will not deliver more than 18,000 tweets via the REST API in any query.

Twitter will not deliver more than 7 or 8 days of data.

NodeXL Pro provides access to the full Twitter API and is limited to 18,000 tweets or 7-8 days - whichever comes first.

When you perform the same query multiple times, you may get very different results. Just as you can never enter the same river twice, Twitter may have varying levels of activity or server load that leads them to deliver different sets and volumes of tweets.

Only a commercial relationship with one of the Twitter data resellers can get closer to a "complete" data set.

Note, different platforms have different approaches to data access limits. For example, Facebook provides access to data based on a rate limit but does not impose a historical limit - you can collect Page and Group data back to the dawn of time (or 2007).

Regards,

Marc
Marked as answer by MarcSmith on 7/5/2016 at 7:05 PM
Jan 15 at 4:24 PM
Hi Marc. Thanks for your kind explanation.

I'm sorry, but I'm actually using NodeXL Pro. I just ran the query 'amazon lang:en' and NodeXL gave me only 780 tweets, from 15/01/2016 15:48 to 15/01/2016 15:50.

Before this, I could get 3,000 or more tweets per query. I wonder if something is wrong with my version, because getting only 700 tweets for 'amazon lang:en' seems very unlikely when the Pro program provides full access to the Twitter API, right? Looking forward to your reply.

Thank you in advance
Coordinator
Jan 15 at 4:40 PM
Hello!

I suspect the variation in data volumes returned by Twitter is caused on Twitter's side, not NodeXL Pro.

NodeXL Pro sends your query to the Twitter public API. It parses whatever Twitter returns.

In many cases a query with a LARGE volume of content will only return a small amount of content.

The higher the volume of content, often the smaller the volume delivered by Twitter.

One way to validate this is to try more detailed queries where the results are lower in volume.
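As an illustration of what "more detailed queries" could look like, here is a minimal Python sketch that splits one broad search into one narrower query per day. It assumes Twitter's standard `since:` and `until:` search operators; the helper name `daily_queries` is hypothetical and not part of NodeXL, and each resulting string would still have to be pasted into the import dialog by hand. Whether Twitter returns more tweets in total this way is not guaranteed.

```python
from datetime import date, timedelta


def daily_queries(base_query, last_day, days=7):
    """Build one narrower search string per day, newest day first.

    Assumes Twitter's standard `since:` / `until:` search operators,
    where `until:` is exclusive (tweets before that date).
    """
    queries = []
    for offset in range(days):
        day = last_day - timedelta(days=offset)
        next_day = day + timedelta(days=1)
        queries.append(
            f"{base_query} since:{day.isoformat()} until:{next_day.isoformat()}"
        )
    return queries


# Example: three one-day slices of the query used in this thread.
queries = daily_queries("amazon lang:en", date(2016, 1, 15), days=3)
for q in queries:
    print(q)
```

Comparing the tweet counts for each daily slice against the count for the single broad query gives a rough idea of how much Twitter is holding back on the high-volume search.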

Regards,

Marc
Mar 4 at 7:17 AM
Hi,

I also work with NodeXL Pro.

I do not understand why searching in NodeXL Pro is limited by the Twitter API while the TAGS tool by Martin Hawksey is not.
The analyses I do with NodeXL are not fully accurate because they do not include all the tweets.
It is not normal that when I use "Import from Twitter Search Network" with this term: #ConstruyendoPodemos, NodeXL extracts 150 tweets for me but TAGS extracts 1,800 tweets. That is less than 10%.
Is there any solution?

Sorry for my bad English.

Best regards,
jb
Coordinator
Mar 6 at 10:47 PM
Hello,

There is a "Limit to" option in the "Import from Twitter Search Network" dialog box. Please set that limit to the maximum number (18,000). I just tried your query and got 5,000 tweets.

Regards,
Arber
Mar 7 at 5:53 PM
After trying several times, it captured all the tweets.

Regards,
Barri
Apr 5 at 3:00 PM
I am still having trouble extracting data with NodeXL.
For the hashtag #PodemosEsMentira, TAGS gives me more than 15,000 tweets and NodeXL just over 300.
How is this possible?

Best Regards,
jb
Jun 7 at 7:49 PM
Hi,

I have a similar question regarding the completeness of the data set that is delivered when running a Twitter Search Network query. Perhaps someone can help. If I un-tick "Clear notebook before data is imported" and then run the same search multiple times, is this a way to gradually increase the completeness of the data? Or does this just duplicate the results? Do I then need to merge and weight duplicate edges?

Also, all my searches come back with ######## in the columns for relationship date and tweet date... is this fixable?

Many thanks,

Ben
Jun 7 at 8:11 PM
Just to add a bit more detail... I am trying to get data on the whole relationship network associated with the current battle in Fallujah, using this search term. But I have noticed that a number of the most important nodes in this network, people who I know have had tweets retweeted, mentioned, etc. over the last few days, don't appear in the data I get back. I am just wondering if I need to keep running the search repeatedly with "Clear notebook before data is imported" unticked in order to get a complete data set?

But I am also concerned that if this brings back duplicate edges, weighting these duplicates will ruin the data set, since some of them will exist only because I have run the search multiple times and it has returned the same data.
Coordinator
Jun 7 at 9:12 PM
Edited Jun 7 at 9:12 PM
Hello!

The Excel columns with "######" contain data that is too wide to be displayed in the column.

Please expand the width of those columns to reveal the complete data.

You can double click the column dividers to expand the column to the width needed for full display of the contents.


Marc
Coordinator
Jun 7 at 9:17 PM
Edited Jun 7 at 9:17 PM
"If I un-tick the "clear notebook before data is imported" then run the same search multiple times, is this a way to gradually increase the completeness of the data? Or does this just duplicate the results? Do I then need to merge and weight duplicate edges? "

If "Clear notebook before data is imported" is not checked, new data will be appended to the workbook.

This may import duplicate edges.

You may want to run NodeXL>Data>Prepare Data>Merge Duplicate Edges prior to analysis of the resulting composite workbook.
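To illustrate the concern about repeated runs inflating edge weights, here is a minimal Python sketch of the idea. It is not NodeXL's implementation of Merge Duplicate Edges. It assumes the edge rows have been exported with a tweet identifier per row; the names `merge_runs`, `edge_weights` and the sample data are hypothetical. Rows that repeat the same tweet across runs are dropped first, so that the remaining weight counts distinct tweets between two users rather than the number of times the search was run.

```python
from collections import Counter


def merge_runs(runs):
    """Combine edge rows from several imports, keeping each tweet's edge once.

    Each row is (source, target, tweet_id). A row seen in an earlier run
    is skipped, so re-running the same search does not add copies.
    """
    seen = set()
    unique = []
    for rows in runs:
        for source, target, tweet_id in rows:
            key = (source, target, tweet_id)
            if key not in seen:
                seen.add(key)
                unique.append(key)
    return unique


def edge_weights(rows):
    """Count distinct tweets per (source, target) pair."""
    return Counter((source, target) for source, target, _ in rows)


# Two hypothetical runs of the same search; tweet "102" comes back twice.
run_1 = [("alice", "bob", "101"), ("alice", "bob", "102")]
run_2 = [("alice", "bob", "102"), ("carol", "bob", "103")]

unique_rows = merge_runs([run_1, run_2])
weights = edge_weights(unique_rows)
print(weights)
```

With the repeated tweet removed, the alice-to-bob edge has a weight of 2 rather than 3, so the weight reflects real activity and not the number of imports.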

Regards,

Marc
Jun 7 at 9:49 PM
Hi Marc,

Thanks for these helpful tips, can't believe it was so easy to fix the time-date issue!

I am still a bit unclear about the nature of the data NodeXL is returning. Does each search basically throw out a net which captures a random portion of the total data set for a search term over the last 7 days, so that by repeating the search over and over I can gradually build up a more complete data set?

Many thanks,

Ben
Jun 7 at 9:51 PM
I've also noticed all my results are from today's date. If I keep searching, will it start to return results from further back in time? Ideally, I would want it to go back several days...
Coordinator
Jun 7 at 10:57 PM
The NodeXL Twitter Importer passes a query to the Twitter Search API and receives back all that Twitter will return.

This is the Twitter REST API documentation:

https://dev.twitter.com/rest/public

In particular the SEARCH API is documented here:

https://dev.twitter.com/rest/public/search

Where they note:

Rate limits
The GET search/tweets is part of the Twitter REST API 1.1 and is rate limited similarly to other v1.1 methods. See REST API Rate Limiting in v1.1 for information on that model. At this time, users represented by access tokens can make 180 requests/queries per 15 minutes. Using application-only auth, an application can make 450 queries/requests per 15 minutes on its own behalf without a user context.

...the Search API is focused on relevance and not completeness. This means that some Tweets and users may be missing from search results....

And:

https://dev.twitter.com/rest/reference/get/search/tweets

In short, Twitter does not return 100% of the matching tweets, nor will it deliver tweets older than about 7-8 days.
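As a rough worked example, the 18,000 figure is consistent with the rate limit quoted above. The sketch below multiplies the 180 requests per 15 minutes from the quoted documentation by 100 tweets per request, which is the documented maximum for the `count` parameter of GET search/tweets. Treating this product as the origin of the 18,000 cap is an inference, not something stated in this thread, and the helper `windows_needed` is hypothetical.

```python
REQUESTS_PER_WINDOW = 180  # user-auth limit quoted from the Twitter docs above
TWEETS_PER_REQUEST = 100   # maximum `count` for GET search/tweets (v1.1)
WINDOW_MINUTES = 15

# Upper bound on tweets retrievable in one rate-limit window.
max_tweets_per_window = REQUESTS_PER_WINDOW * TWEETS_PER_REQUEST


def windows_needed(total_tweets):
    """Number of 15-minute windows needed to page through total_tweets."""
    return -(-total_tweets // max_tweets_per_window)  # ceiling division


print(max_tweets_per_window)
print(windows_needed(18_000), windows_needed(18_001))
```

This is only an upper bound. Because the Search API favours relevance over completeness, a single query can return far fewer tweets than the bound allows.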

Regards,

Marc
Marked as answer by MarcSmith on 7/5/2016 at 7:05 PM
Jun 8 at 3:37 PM
Hi Marc,

Thanks for clearing this up for me, big help!

Ben