Record Tweets from specific search or user

Sep 12, 2013 at 4:35 AM
Hi all,

I'm a new NodeXL user and a French PhD candidate in CIS / intelligence studies. My thesis will discuss the capacity of social media for intelligence purposes, especially strategic & tactical purposes in operations (conflict prevention, surveying a particular area or group, etc.). For this research I will mainly draw on SNA theory, psychology, human & social sciences, data mining & fusion theories, and geography.

I asked Marc Smith some questions through Twitter, and he was very kind to answer. He suggested that I post further questions here.

The aim is to focus on one object of research, and I chose to survey the situation in Syria. I would like to continuously extract & record Tweets from searches, lists or specific users to demonstrate my theory. I understand that the Twitter API & NodeXL impose some limitations.

Could you explain these limitations?

@Soc_Net_Intel
University of Toulon - France
Coordinator
Sep 12, 2013 at 6:29 PM
This thread is a good place to start:
Sep 12, 2013 at 7:18 PM
Dear Marc,

I just got back home; it's 21h30 here. I will have a look tonight!
I am preparing my thesis subject and will drop you an email if you are interested.

Kind regards,

@Soc_Net_Intel
Sep 12, 2013 at 9:51 PM
Dear Marc,

Excellent link. I understand that, with a limited object of research, I can schedule network recordings at regular intervals (not continuously) using NodeXL Network Server & Windows Task Scheduler. That could be sufficient for my research.
I will work on that this weekend and write back to you.

Thanks :-)

@Soc_Net_Intel
Sep 12, 2013 at 9:57 PM
Edited Sep 12, 2013 at 9:58 PM
If you use the NodeXL Network Server program and you choose to save each network to a NodeXL workbook (as opposed to GraphML), make sure you schedule your tasks so they don't overlap in time. Here is a note from the NodeXLNetworkServerFAQ.docx file:

~~~~~~~~~~~~
IMPORTANT: If you schedule more than one task and you save each network directly to a NodeXL workbook, you should specify times that will prevent the tasks from overlapping each other. For example, if you have three such tasks and they each take about 15 minutes to run, you should not schedule all of them to run at noon each day. Instead, schedule the first to run at noon, the second at 1 PM, and the third at 2 PM, for example.

This will avoid random problems that can occur when multiple copies of Excel are simultaneously attempting to perform automated tasks.

This restriction does not apply if you save each network only as GraphML.
~~~~~~~~~~~~
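
As a rough illustration of the staggering described in that note, here is a minimal Python sketch that computes non-overlapping start times. The function name, the `gap_minutes` safety margin and the example date are assumptions made for illustration; they are not part of NodeXL or of the FAQ, and the resulting times would still have to be entered into Windows Task Scheduler by hand.

```python
from datetime import datetime, timedelta

def staggered_start_times(first_start, task_count, expected_minutes, gap_minutes=45):
    """Return one start time per task so that consecutive runs do not overlap.

    expected_minutes is a rough estimate of how long one task takes;
    gap_minutes is a hypothetical safety margin added on top of it.
    """
    step = timedelta(minutes=expected_minutes + gap_minutes)
    return [first_start + i * step for i in range(task_count)]

# Three tasks of about 15 minutes each, the first one at noon (as in the FAQ example).
starts = staggered_start_times(datetime(2013, 9, 12, 12, 0), 3, 15)
for start in starts:
    print(start.strftime("%H:%M"))
```

With a 15-minute run time and a 45-minute margin, this reproduces the noon, 1 PM and 2 PM spacing from the FAQ example. A smaller margin would pack the tasks closer together, at the risk of an overlap if one run takes longer than expected.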

-- Tony
Sep 12, 2013 at 10:08 PM
Dear Tony,

Thank you. I need to look into how long NodeXL takes to run in the different cases: specific user, list and search. I think it will certainly depend on the parameters selected. Do you know the maximum number of tweets I would be able to record from a specific user, a specific list and a specific search?

The aim would be to use R data mining and image & video mining programs to analyze the tweets (structured & unstructured data) in order to evaluate the potential of social media networks. The SNA with NodeXL would demonstrate the capacity of network theory & petrology to find interesting communities inside a larger network, as well as potentially interesting nodes (social media users) and edges (social media conversations). In this case the recording should be continuous for better results. If that is not possible, scheduled recordings (one per hour) over several weeks on a specific area or subject should be sufficient.

Thanks in advance for further details.

@Soc_Net_Intel

PS: Sorry for my bad English :-(
Sep 14, 2013 at 6:52 PM
Edited Sep 14, 2013 at 7:01 PM
With NodeXL's Twitter search network, where you get people who have recently tweeted a specified search term, NodeXL can collect several thousand tweets in a few minutes. You can ask for many more tweets than that, but the number you'll actually get is up to the Twitter servers--sometimes you'll get more, sometimes you'll get far fewer. And you're unlikely to get more than a week or so of tweets. Twitter says "between 6-9 days of Tweets"; see https://dev.twitter.com/docs/using-search.

The other two networks--the Twitter user network and Twitter list network--are severely limited by what Twitter calls "rate limiting," where NodeXL can make only so many information requests per 15-minute period. In the user network, where you want to know who is following whom, NodeXL can ask for information about only 15 users per 15-minute period before it has to pause before asking for more. Depending on the size of the network you ask for, you might have to wait hours or even days to get the entire network. And to make matters worse, Twitter sometimes refuses further requests even after NodeXL has waited the required period, and all your waiting goes for naught.
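
To get a feel for what that limit means in practice, here is a minimal Python sketch that estimates the minimum waiting time for a user network of a given size. It assumes one information request per user and uses the figure of 15 users per 15-minute period mentioned above; the function name is hypothetical, and the estimate ignores download time and the refused requests described above, so real runs can take longer.

```python
import math

def estimated_wait_minutes(user_count, users_per_window=15, window_minutes=15):
    """Lower bound on the time spent waiting because of rate limiting.

    Assumes one request per user (an assumption for illustration only).
    """
    windows = math.ceil(user_count / users_per_window)
    # The first window needs no waiting; each further window costs one pause.
    return max(windows - 1, 0) * window_minutes

for n in (15, 100, 1000, 5000):
    minutes = estimated_wait_minutes(n)
    print(f"{n} users: at least {minutes / 60:.1f} hours of waiting")
```

Under these assumptions, a network of a few thousand users already takes several days to collect.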

The list network is similarly hamstrung.

The result is that the search network is useful, but you might find the user and list networks not suitable for your needs.

-- Tony

(By the way, the user and list networks were designed before Twitter tightened their rate limits earlier this year. The networks used to be far more practical.)
Sep 14, 2013 at 8:25 PM
Dear Tony,

As always, thank you for your precise answer.

I can now refine my research strategy for using NodeXL on the Twitter network. I will certainly try some samples with the Twitter Search API to record Tweets on a specific subject for my research object. I would probably use the Twitter user network only to analyze social & human relations between specific nodes. The issue is that further analysis cannot separate social & human relations (edges) from human sources (nodes), because the reliability, authority & objectivity of a source (node) are tied to the credibility, accuracy, currency & objectivity of the information (edges between nodes).

I was using a free service to back up user tweets (http://twdocs.com from Andris Alnis), which is currently not working due to Twitter API changes. That's really a shame. The service was limited to 3200 tweets, but it was useful for my research (recording the 3200 tweets of each user located 2 or 3 edges away from a user of interest). Do you think I can find an alternative to http://twdocs.com?

Kind regards,

@Soc_Net_Intel
Sep 16, 2013 at 5:08 PM
Edited Sep 16, 2013 at 5:08 PM
I'm not familiar with http://twdocs.com or any possible alternatives, so I'm afraid I can't help you there.

-- Tony