Collecting Twitter search results - how far back?

Aug 12, 2013 at 4:54 PM
Hi, I am trying to collect Twitter data for a hashtag used at an event that ran from June 28 to Aug 1, 2013. NodeXL only gives me the most recent handful of tweets, from 8/8/13-8/12/13. But Twitter search (online) for the hashtag gives me tweets from much further back, even from May 2013. Why does NodeXL not pick those up? And if it cannot do that automatically, is there a way to import those tweets manually? I have storified them, so I have direct links to each one. Thanks!
Aug 12, 2013 at 6:24 PM
Edited Aug 12, 2013 at 6:28 PM
Twitter provides NodeXL with tweets in reverse chronological order, with the most recent tweets first. How far back the tweets go depends partly on your "Limit to" setting in the Import from Twitter Search Network dialog box: if you set it to a low number, NodeXL will stop at that number of tweets.

But even if you enter a large "limit to" number, Twitter will filter and truncate the list it provides. Here is what Twitter says about that (from https://dev.twitter.com/docs/using-search):

"The Search API is not complete index of all Tweets, but instead an index of recent Tweets. At the moment that index includes between 6-9 days of Tweets."

[The "Search API" is what NodeXL uses to get the tweets from Twitter.] If the search term is very popular, you might not get even 6 days.

One thing you can try is the "until" operator, which you can learn about by clicking the "Advanced search help" in the Import from Twitter Search Network dialog box. Here is an example:

NodeXL until:2013-01-01

That might give you an empty graph, though, for the reason I just mentioned.
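
To make the date operators concrete, here is a minimal Python sketch (not part of NodeXL) that builds a search string using "until" and its companion operator "since", and estimates whether a given day is still likely to be inside the recent-tweets index. The function names and the 6-day default are assumptions for illustration only; the actual window is whatever Twitter decides at the time.

```python
from datetime import date, timedelta

def build_query(term, since=None, until=None):
    """Build a search string like 'NodeXL until:2013-01-01'.

    Hypothetical helper: it only formats text; it does not call Twitter.
    """
    parts = [term]
    if since is not None:
        parts.append(f"since:{since.isoformat()}")
    if until is not None:
        parts.append(f"until:{until.isoformat()}")
    return " ".join(parts)

def likely_in_index(day, today, window_days=6):
    """Rough guess: is 'day' within the last 'window_days' days?

    The 6-day default is an assumption taken from the low end of the
    "6-9 days" quoted above, not a guaranteed limit.
    """
    age = today - day
    return timedelta(0) <= age <= timedelta(days=window_days)

if __name__ == "__main__":
    print(build_query("NodeXL", until=date(2013, 1, 1)))
    print(likely_in_index(date(2013, 8, 1), today=date(2013, 8, 12)))
```

If the second function returns False for the dates you care about, a date operator in the search string will most likely still produce an empty graph.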

-- Tony
Aug 12, 2013 at 6:44 PM
Tony, thanks! That explains a lot.

I will try the search operator, of course (and it looks like "since" will be more useful to me than "until"). I wonder, though, whether this means the Twitter web search uses a different mechanism than the Search API, since the older tweets do show up if I search through the web interface.

If I do have the tweets archived in Storify, I guess the only way would be to manually enter them into the vertex table, correct?
Aug 13, 2013 at 4:27 PM
Edited Aug 13, 2013 at 4:28 PM
Can you give me the URL for the Twitter web search you're using for comparison? I want to check it out.

I don't know how you would go about merging tweets obtained from Storify with those obtained through NodeXL. NodeXL does some text processing on the tweets it gets from Twitter. It scans the tweets for mentioned and replied-to user names and adds rows to the Edges worksheet based on what it finds, so it wouldn't be a matter of just copying additional tweets into the NodeXL workbook.
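
To give a rough idea of the kind of text processing involved, here is a minimal Python sketch that turns one tweet into edge rows. It is not NodeXL's actual code: the regular expression, the function name, the lowercasing of names, and the "Replies to"/"Mentions" labels are all assumptions chosen for illustration, and a real import also records much more per tweet.

```python
import re

# Assumed pattern: '@' followed by 1-15 letters, digits, or underscores,
# not preceded or followed by another name character (so e-mail
# addresses and over-long names are skipped).
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})(?![A-Za-z0-9_])")

def tweet_to_edges(author, text):
    """Return (source, target, relationship) tuples for one tweet.

    Hypothetical helper: a name at the very start of the tweet is
    treated as replied-to; any other name is treated as mentioned.
    """
    text = text.strip()
    edges = []
    for match in MENTION_RE.finditer(text):
        relationship = "Replies to" if match.start() == 0 else "Mentions"
        edges.append((author.lower(), match.group(1).lower(), relationship))
    return edges

if __name__ == "__main__":
    for edge in tweet_to_edges("Tanya", "@tony thanks! cc @Marc #dsst2013"):
        print(edge)
```

Tweets saved elsewhere would need to go through an equivalent step, with the resulting rows added to the Edges worksheet, before they would be consistent with the imported ones.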

-- Tony
Aug 13, 2013 at 4:35 PM
Tony,

That is what I was concerned with – Storify saves the tweets and links to them, but they're probably not in a format NodeXL would read.

The Twitter hashtag search I'm doing is https://twitter.com/search?q=dsst2013&src=typd&mode=realtime (the hashtag is #dsst2013)

Thanks!

-- Tanya
Aug 13, 2013 at 5:10 PM
Tanya:

Yes, Twitter is drawing from a larger set of tweets for its own search page than it exposes via its Search API. Unfortunately, there is no way for NodeXL to get to that larger set.

-- Tony
Coordinator
Aug 13, 2013 at 6:05 PM
The data sets you are interested in may be included in these workbooks.

https://nodexlgraphgallery.org/Pages/Default.aspx?search=dsst

Regards,

Marc
Aug 13, 2013 at 7:05 PM
Marc, hi!
Thanks, I've already downloaded these datasets (workbooks). They provide some of the data, but not the complete set (nothing before July 25 or after July 31). I guess for the purposes of the exercise I'm doing, this is the best option I have.

Thanks!
-- Tanya