Partial network problem in Twitter

Jun 20, 2013 at 2:34 PM
Hi. In recent weeks, I've noticed I am only getting partial networks through Twitter. This almost always happens with large networks (10,000+ vertices), but it's happened a couple times with smaller networks (~2,000) as well.

Under "details," the error message usually states too many requests sent to the server. The percentage of the network that imports varies; on a 65K network collecting only the followers of a single node I consistently get 35K or less. Some of these accounts are protected, but I don't think so many.

I'm wondering if this is related to the API change, and what, if anything, I can do to work around the problem, or if there are any plans to introduce a workaround within NodeXL (such as retrying pages that don't load).

I have shut down all other Twitter API calls from my IP address (by closing Tweetdeck etc.) and I am using a Twitter ID that is dedicated to NodeXL only, and I'm using Excel 64-bit with 32 gigs of memory and PowerPivot installed on Windows 8.

Thanks for any insights, and for the great product.
Jun 20, 2013 at 5:07 PM
Edited Jun 20, 2013 at 5:08 PM
Hello, John:

I assume that you are using the Twitter User's Network, with "followed/following relationship" checked, "levels to include" set to 1.0, and "limit to" unchecked. If that's the case, then here is how things are supposed to work:

One. NodeXL requests 5,000 followers of the specified user.

Two. It repeats this for a total of 15 requests, at which point Twitter forces NodeXL to pause for about 15 minutes.

Three. NodeXL tells you that it is pausing until such-and-such a time.

Four. NodeXL wakes up and continues requesting followers.
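
As a rough illustration of the arithmetic behind the steps above, here is a minimal Python sketch. It is not NodeXL's actual code. It models only the follower-list requests described in steps one and two, using the numbers given there (5,000 followers per request, 15 requests per window, a pause of about 15 minutes). The function name and the follower counts are made up for the example, and any other requests NodeXL makes are not counted.

```python
import math

FOLLOWERS_PER_REQUEST = 5000   # page size from step one
REQUESTS_PER_WINDOW = 15       # requests allowed before a pause (step two)
PAUSE_MINUTES = 15             # approximate pause length (step two)

def estimate_collection(follower_count):
    """Return (requests, pauses, minutes_paused) for one user's follower list."""
    requests = math.ceil(follower_count / FOLLOWERS_PER_REQUEST)
    # A pause is needed only after a full batch of 15 requests
    # when more requests still remain.
    pauses = max(0, math.ceil(requests / REQUESTS_PER_WINDOW) - 1)
    return requests, pauses, pauses * PAUSE_MINUTES

# Hypothetical account with one million followers.
print(estimate_collection(1000000))  # (200, 13, 195)
```

Under these assumptions, an account with a million followers would need 200 requests and 13 pauses, so more than three hours would be spent waiting.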

Are you seeing the "Reached Twitter rate limits. Pausing until..." message in step 3? And have you noticed if the process quits (with a partial network message) right after NodeXL wakes up in step 4? (I won't be surprised if you don't know the answer to the second question, as you are probably not sitting there staring at a paused NodeXL window!)

Thanks,
Tony
Jun 20, 2013 at 5:14 PM
Thanks for the response! Yes, those are the settings. I've encountered more severe problems with the 1.5 level. At the 1.0 level, I get a message saying "Out of 377 tries, one page was not returned" (or similar). At the 1.5 level, it seems to quit sooner and produce a smaller network (even though the seed is the same). The message then is less specific, but it does refer to too many requests.

I am getting the "Reached Twitter Rate Limits, Pausing Until" message on the status bar, and you're right, I haven't been sitting and watching, but I could probably work something out to watch for that (or record my screen, I guess).
Jun 20, 2013 at 5:48 PM
Is the clock on your computer set accurately?

-- Tony
Jun 20, 2013 at 5:51 PM
It looks fine and auto-synchronizes with time.windows.com.
Jun 20, 2013 at 7:58 PM
Please see if you can tell if the "pausing" business usually works--in other words, NodeXL pauses, then wakes up and continues, then pauses again. If it NEVER works on your computer--if NodeXL always puts up the Partial Network message after pausing--the diagnosis might be different.

-- Tony
Jun 20, 2013 at 10:30 PM
So I caught it in the act, and it did indeed trip exactly when it woke up from the pause.
Jun 20, 2013 at 10:34 PM
Edited Jun 20, 2013 at 10:34 PM
Hi,

I have the same problem here. As johndrake said, it concerns the 1.5 level with "followed/following relationship" checked and "limit to" unchecked. It gives the same error message that one page was not returned and that only the partial network data can be downloaded. Obviously that is not what I am after.

Strangely enough I asked one of my students to also download the latest version of NodeXL and to use the same settings and the same Twitter account to collect the network data and it worked without any problem on his machine.

This is quite frustrating so if anyone has any idea what is going wrong here please give a hint.
  • Remko
Jun 20, 2013 at 10:36 PM
Hi John,

How exactly did you solve it?
  • Remko
Jun 20, 2013 at 10:39 PM
I haven't solved it yet, thus I am here. :-)
Jun 20, 2013 at 10:47 PM
Here is an image of the error message (see url below) that keeps popping up. Sometimes two cycles of collecting, pausing and waking up work and then it fails; sometimes it fails right after the first cycle.

Remko

[Image: screenshot of the error message]
Jun 20, 2013 at 10:49 PM
Yes, that's what I'm getting too.
Jun 21, 2013 at 7:38 AM
In the meantime I tried it now on three different laptops with two different authorized Twitter accounts and from home and work so different IP addresses. But I am still experiencing the same problem over and over :-(

Remko
Jun 21, 2013 at 4:59 PM
I'm not sure what's causing this. I've asked for help on the Twitter programmer forums.

-- Tony

http://dev.twitter.com/discussions/18999
Jun 21, 2013 at 6:41 PM
Thanks Tony.

My clock is automatically syncing with the Windows Internet clock, so it cannot be too much out of sync. It seems to me that if the described error occurs, the program should add some slack time and try again.

Remko
Jun 21, 2013 at 6:47 PM
Thanks, Tony, I look forward to hearing what they come up with. In addition to the timing issue, I wonder if a Twitter client running on the same router IP might trigger a problem as well. I turned off all my API-using accounts on the desktop, but my phone has a tendency to reconnect on wireless even when I tell it not to, so it may be sneaking some API calls past me.
Jun 21, 2013 at 7:36 PM
Hi, Remko:

I don't want to add a "try again" mechanism here. We have that already, but it kicks in only for network errors. This is not an error; it is a legitimate response from Twitter.

I could add slack time to the original pause time to accomplish the same thing, but the question would be "how much." Ten seconds? If the client computer clock is off by fifteen seconds, this will still fail. One minute? If the client computer clock is off by 61 seconds, this will still fail.

It seems to me that there is a design problem here in that Twitter is saying "wait until this time," which depends on the client clock being set accurately, instead of "wait N seconds," which doesn't. But that's how they designed their system.
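
To make the slack-time argument concrete, here is a minimal Python sketch. It is an illustration only, not NodeXL's code. It assumes, as described above, that the client sleeps until its own clock reads the time given by Twitter plus some slack. The function names and the sign convention for the clock offset are invented for the example.

```python
def wake_error_seconds(client_clock_offset, slack):
    """Seconds by which the client wakes before (negative) or after
    (positive) the true reset time.

    client_clock_offset: client clock minus true time, in seconds
                         (positive means the client clock runs ahead).
    slack: extra seconds the client adds to the pause.
    """
    # The client sleeps until its own clock reads reset_time + slack.
    # In true time that moment is reset_time + slack - client_clock_offset.
    return slack - client_clock_offset

def resumes_too_early(client_clock_offset, slack):
    """True if the client would send requests before the true reset time."""
    return wake_error_seconds(client_clock_offset, slack) < 0

print(resumes_too_early(15, 10))   # True: clock 15 s ahead, 10 s of slack
print(resumes_too_early(61, 60))   # True: clock 61 s ahead, 60 s of slack
print(resumes_too_early(-30, 0))   # False: a clock that runs behind waits longer
```

Whatever slack is chosen, a client clock that runs ahead by more than that amount still resumes too early, whereas a "wait N seconds" instruction would not depend on the clock offset at all.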

Anyway, I'm not certain that this is the cause of the problem we're seeing in NodeXL. I, too, will be eager to see responses to my question.

-- Tony
Jun 21, 2013 at 7:40 PM
Would it be worth trying to change our clock settings? I assume from your response that NodeXL is basing its timing on the system clock. If it is running a counter, then the speed of my system might be throwing things off. I could also try the same collection from a slower machine and report back.
Jun 21, 2013 at 7:47 PM
Another possibility is that Twitter is deliberately halting further requests based on server load. In the Twitter Search Network, I've found that identical requests made a few minutes apart will return vastly different tweet counts--sometimes I'll get 500 tweets, and sometimes I'll get 100. Twitter rate limiting doesn't kick in with the Search Network, so I'm guessing that Twitter throttles requests based on various factors.

But this is all just guesswork on my part.

-- Tony
Jun 21, 2013 at 7:50 PM
You can certainly try that, John. You would want to temporarily set your clock backwards, so NodeXL will pause for a longer period.

This could cause other problems, though. If you get an error from Twitter that mentions "not authorized," it's because Twitter's authorization scheme also uses a system that is clock-dependent.

-- Tony
Jun 21, 2013 at 9:10 PM
Regarding the timing: I tried three different laptops, so that would mean that I had bad luck with the clock settings three times ...
Jun 21, 2013 at 9:34 PM
What OS are you using, Remko?

I'm running the same search on Windows 7 right now (and a slower PC as well) that was tripping up before on my Windows 8 machine. It's been going a while and hasn't hit a crash point yet. I will let you know if it gets to the end.
Jun 22, 2013 at 4:51 PM
So the collection is still running, but my previous efforts had crashed by now, so this is very encouraging. The differences between then and now:

1) I shut down absolutely everything in the house that might be making an API call (my spare cell phone was still on before).

2) Running on a slower Windows 7 machine compared to faster Windows 8

3) Running Excel 32-bit instead of 64-bit.

I'm inclined to think that issue No. 1 was causing the problem. If another application was making API calls from my router's IP address, would that cause the error we were seeing? I will continue testing after this collection completes to rule out #2 and #3.
Jun 22, 2013 at 5:54 PM
Edited Jun 22, 2013 at 5:54 PM
John:

Twitter's rate limits for reading information are on a "per-user, per-application" basis. I'll include the details for reference below my signature, but the bottom line is that I wouldn't expect your cell phone to be the culprit. (Unless your cell phone were running NodeXL, that is.)

I'm surprised that no one has answered my question yet at http://dev.twitter.com/discussions/18999. I thought the question was an obvious one, with an answer that some Twitter developer must know. I certainly hope the answer isn't "we never thought of that."

-- Tony

From https://dev.twitter.com/docs/rate-limiting/1.1:

Rate limits on "reads" from the system are defined on a per user and per application basis, while rate limits on writes into the system are defined solely at the user level. In other words, for reading rate limits consider the following scenario:

If user A launches application Z, and app Z makes 10 calls to user A’s mention timeline in a 15 minute window, then app Z has 5 calls left to make for that window

Then user A launches application X, and app X calls user A’s mention timeline 3 times, then app X has 12 calls left for that window

The remaining value of calls on application X is isolated from application Z’s, despite the same user A
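
The quoted scenario can be sketched in a few lines of Python. This is only an illustration of the documented "per user, per application" bookkeeping, not Twitter's implementation. The class name is hypothetical, the limit of 15 calls per window is inferred from the quoted numbers (10 used, 5 left), and the reset at the end of the window is not modeled.

```python
from collections import defaultdict

CALLS_PER_WINDOW = 15  # inferred from the quoted example (10 used, 5 left)

class ReadRateLimiter:
    """Hypothetical read-limit counter keyed by (user, application)."""

    def __init__(self, limit=CALLS_PER_WINDOW):
        self.limit = limit
        self.used = defaultdict(int)  # calls used per (user, application)

    def call(self, user, app, count=1):
        key = (user, app)
        if self.used[key] + count > self.limit:
            raise RuntimeError("rate limit exceeded for %s/%s" % key)
        self.used[key] += count

    def remaining(self, user, app):
        return self.limit - self.used[(user, app)]

limiter = ReadRateLimiter()
limiter.call("A", "Z", 10)          # user A, app Z makes 10 calls
limiter.call("A", "X", 3)           # user A, app X makes 3 calls
print(limiter.remaining("A", "Z"))  # 5
print(limiter.remaining("A", "X"))  # 12
```

If the limits really work this way, calls made by another application would not reduce the calls available to NodeXL, even for the same user.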
Jun 22, 2013 at 6:00 PM
Hm, my experience has been that running Tweetdeck and NodeXL at the same time would result in both getting glitchy, even if signed in as different users, but that experience was prior to their retirement of the old API. If that's not the issue, it might point to either processor speed or Excel 64 as the culprit.

Thanks again.
Jun 22, 2013 at 6:28 PM
Actually, I would trust your experience over the Twitter documentation, regardless of the API version.

I'm not ruling anything out at this point.

-- Tony
Jun 22, 2013 at 6:58 PM
Once this finishes I will try a test with Tweetdeck running, and see how that goes.
Jun 23, 2013 at 10:48 AM
Hi John,

How did your test go?

To answer your earlier question: I have been testing with two different Windows 7 machines (laptops) and one Windows XP machine (laptop). To rule out my phone making a call to the API, I already tried using a Twitter account that was not on my iPhone or iPad. In all cases I was unsuccessful. Sometimes it stopped after the first pause, sometimes after four, so I find it really hard to pinpoint the problem. Regarding my computer clock, on the Windows XP machine I have been able to synchronize with the internet time server, so that should be fine as well.

I know the timing issue is tricky because I had a student who wrote an application to do the same thing a year and a half ago. When I went to Vancouver Island for some months it was not working, and that turned out to be a timing issue, which he resolved. Unfortunately, after the API change this application stopped working. It is really a show stopper for my research, so I hope to get NodeXL working soon, and even then the question is whether I can continue, because the rate limit has been lowered.

Cheers, Remko
Jun 23, 2013 at 10:38 PM
The batch I started Friday is still running, so I guess it's working (it's medium large). As the API was explained to me, the rate-limiting is based on the IP address making the call regardless of which accounts are in use. I could be wrong about this, I am not by any means an expert, but so far so good. However this is taking so long it's going to run into a trip I had planned for tomorrow so I won't be able to do the follow-up tests until I get back.

FWIW I have also noticed Excel 64 doesn't seem to be all that stable, when working with NodeXL and somewhat otherwise, so I just installed some updates to that too and will get back into all of this later this week.
Jun 27, 2013 at 3:44 PM
So I'm back. The collection I ran on the Windows 7, 32-bit machine worked fine. I went back to 64-bit Windows 8 with all other variables equal and ran a much, much smaller network that produced the error again. I signed out of the account I'd been using and tried a different one, with the same error.
Jun 28, 2013 at 10:54 PM
Edited Jun 29, 2013 at 3:32 AM
Second update: The partial network problem continues.

Update: I ran Excel in Windows 7 compatibility mode, and it took care of the problem below. I am now running collection to see if it solves my other problem.

J

I'm starting to think the problem is that NodeXL doesn't play well with Excel 64-bit, which would be unfortunate since I actually uninstalled 32-bit and reinstalled for just this purpose. I've been having other problems, most recently trying to export, when I get the error below. It happened when trying to export a UCINET file as well as to a matrix workbook. I went back to my 32-bit Excel installation and it worked fine.

NodeXL

An unexpected problem occurred. If it occurs again, please copy the details to the clipboard by typing Ctrl-C, then post the details to http://www.codeplex.com/NodeXL/Thread/List.aspx.

Details:

[COMException]: PasteSpecial method of Range class failed

Server stack trace:

Exception rethrown at [0]:
at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
at Microsoft.Office.Interop.Excel.Range.PasteSpecial(XlPasteType Paste, XlPasteSpecialOperation Operation, Object SkipBlanks, Object Transpose)
at Smrf.AppLib.ExcelUtil.PasteValues(Range range)
at Smrf.AppLib.ExcelTableUtil.TryAddTableColumnWithRowNumbers(ListObject table, String columnName, Double columnWidthChars, String columnStyle, ListColumn& listColumn)
at Smrf.NodeXL.ExcelTemplate.DuplicateEdgeMerger.DeleteDuplicateEdges(ListObject oEdgeTable, Object[,] aoVertex1NameValues, Object[,] aoVertex2NameValues, Object[,] aoThirdColumnValues, Boolean bGraphIsDirected)
at Smrf.NodeXL.ExcelTemplate.DuplicateEdgeMerger.MergeDuplicateEdges(Workbook workbook, Boolean countDuplicateEdges, Boolean deleteDuplicateEdges, String thirdColumnNameForDuplicateDetection)
at Smrf.NodeXL.ExcelTemplate.DuplicateEdgeMerger.MergeDuplicateEdges(Workbook workbook)
at Smrf.NodeXL.ExcelTemplate.WorkbookExporter.ExportToNewMatrixWorkbook()
at Smrf.NodeXL.ExcelTemplate.ThisWorkbook.ExportToNewMatrixWorkbook()

OK

Jun 29, 2013 at 10:25 AM
John, thanks for your research. On two of the laptops I used there is definitely a 64-bit version installed. I will now also check my other laptop with an XP installation. I believe it has a 32-bit installation, but I am not sure. I will keep you posted. Remko
Jun 30, 2013 at 1:08 PM
:-( My laptop with an XP installation runs 32-bit Excel and the latest version (June 19) of NodeXL, but I am still experiencing the problems discussed before. But it's great news that your installation is at least working.
Jul 1, 2013 at 3:24 PM
To be clear, I have not solved the original problem, only the export problem that came up in the meantime.
Aug 3, 2013 at 3:07 PM
I'm just following up. I continue to have this problem, but only on the 64-bit Excel/Windows 8 machine. The issue is that this machine has the most memory, so I prefer to work on it if at all possible. Is there any update on this issue?
Aug 3, 2013 at 3:56 PM
No, there is no update.

I'm at a loss to explain how the 64- vs. 32-bitness of either Windows or Excel could affect this behavior. Perhaps that's just a coincidence, and something else on the 64-bit machine is different?

Do you get a partial network every time on the 64-bit machine?

-- Tony
Aug 3, 2013 at 4:39 PM
Edited Aug 3, 2013 at 6:13 PM
This is not a solution, but is it possible for you to collect data on the 32 bit system and analyze on the 64 bit system?

-- Marc
Aug 9, 2013 at 5:21 PM
John:

I've made a change in the latest release (1.0.1.248) that you might want to test on your 64-bit machine. From the release notes:

"NodeXL will now pause for an extra 15 seconds when Twitter rate limits are reached. This is an attempt to work around a problem where Twitter occasionally refuses to provide more information even after NodeXL pauses for the time specified by Twitter. The symptom of that problem is a message that includes the text "A likely cause is that you have made too many Twitter requests in the last 15 minutes." (This might not fix the problem, the cause of which is unknown.)"

-- Tony
Aug 9, 2013 at 5:25 PM
Tony, that's awesome, thanks. I will try it this weekend and see how it goes.

Marc, that's what I have been doing until now, which is OK most of the time, but I run into memory errors on larger networks (which I can work around on the machine that has more memory). But that is often an option. I've been testing the upper limits of what one can reasonably expect to do in NodeXL.
Aug 10, 2013 at 1:52 PM
I started a huge import last night, and it's still running today without the error (which had come up pretty quickly in the past). I am optimistic this has resolved the issue, but I will run some more stuff over the weekend and let you know if I run into the problem again.

Thank you!!!!
Aug 12, 2013 at 7:42 PM
Thanks for the follow up and update. Will test it soon on my machine as well.

Remko
Aug 13, 2013 at 8:34 PM
I ran collection all weekend, with no problems. Thanks so much for the fix!

Now who's going to point me to the signup page for the Graph Server? :-)
Aug 13, 2013 at 9:22 PM
Edited Aug 13, 2013 at 9:22 PM
John:

I'm not 100% convinced it's a fix, because that would imply that there is a fundamental design flaw in Twitter's rate limiting scheme. The flaw would be that Twitter tells NodeXL (and other such programs) to wait until a specified time without considering that the client computer's clock might be out of sync with Twitter's clock. Did they really implement such a fragile system, or am I missing something? (I asked about this at https://dev.twitter.com/discussions/18999 but didn't receive an answer.)

Anyway, I'll be pleased if this really is a fix and your positive results are more than just random. If the problem returns, please let me know.

Thanks,
Tony
Aug 13, 2013 at 9:33 PM
Thanks, I really appreciate it. Not sure what the problem is, but for whatever reason, so far so good.
Aug 18, 2013 at 2:22 AM
Hello Tony,

It appears that I'm facing the same problem and I can't seem to find a pattern.

When the account is especially big, the process quits when NodeXL wakes up. Other times, it will keep going for a bit after waking up, then come back with the partial network error. I'm using NodeXL on Office 2013 if that's any help.

Thank you,
Chris
Aug 19, 2013 at 3:40 PM
Chris:

You didn't say which version of NodeXL you're using. Version 1.0.1.248 included a change that might fix the problem you're seeing.

You can tell which version of NodeXL you have by going to NodeXL, Help, Help in the Excel ribbon.

-- Tony
Aug 19, 2013 at 5:30 PM
Thanks for the prompt reply Tony!

I'm sorry about not including the version. It was version .245.

As I couldn't find version .248, I upgraded to version .249 and unfortunately the process stopped as soon as NodeXL woke up again.

So that's Version .249 - Windows 8 - Office 2013.

Thank you for your time and help,
Chris
Aug 20, 2013 at 12:53 AM
Hello, Chris:

I'd like to make sure that I understand what's happening on your computer. Are you saying that when using version 1.0.1.249 of NodeXL, you are getting a message that includes the text "A likely cause is that you have made too many Twitter requests in the last 15 minutes"? This particular discussion has gotten quite long and complex, and I need to verify we're talking about exactly the same problem.

Thanks,
Tony
Aug 20, 2013 at 1:18 AM
Hi Tony.

That is exactly the case.

When I use version 1.0.1.249 I get the message that includes "A likely cause is that you have made too many Twitter requests in the last 15 minutes" as soon as the process wakes up.

Thanks,
Chris
Aug 20, 2013 at 1:20 AM
Just curious, but when NodeXL gets an error message and retries, does it count that against the rate limit? Although I haven't had the problem since .248 came out, I noticed the other day I was seeing a lot of error/retry messages on a collection I was running.

J
Aug 20, 2013 at 4:11 AM
Hi, John:

Good question! Unfortunately, I don't know the answer. If you're feeling ambitious, you can ask that question on the Twitter API forums at https://dev.twitter.com/discussions.

Regarding retries, I can tell you that nothing has changed in NodeXL's retry mechanism in ages, so the retries you've suddenly started seeing are likely due to new Twitter or network errors. The way it works, by the way, is that if there is a Twitter or network error, NodeXL pauses for 1 second, tries the request again, pauses for 1 second again if necessary, tries again, pauses again for 5 seconds if necessary, tries again, and then gives up.
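
The retry schedule described above can be sketched roughly as follows. This is a Python illustration under stated assumptions, not NodeXL's code. The function and parameter names are hypothetical, `IOError` stands in for "a Twitter or network error", and the `sleep` parameter exists only so the pauses can be replaced in a test.

```python
import time

RETRY_PAUSES = (1, 1, 5)  # seconds to wait before the 2nd, 3rd and 4th tries

def request_with_retries(send_request, sleep=time.sleep):
    """Call send_request(); on failure, pause and retry per RETRY_PAUSES."""
    last_error = None
    # The first try has no pause; each later try is preceded by one.
    for pause in (0,) + RETRY_PAUSES:
        if pause:
            sleep(pause)
        try:
            return send_request()
        except IOError as error:  # stand-in for a Twitter or network error
            last_error = error
    # All four tries failed, so give up and report the last error.
    raise last_error
```

With this schedule a request is tried at most four times, and about seven seconds pass between the first failure and giving up.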

-- Tony
Aug 20, 2013 at 4:16 AM
I'm no developer, but in conversations with people who work with the new API, I have noted that it seems to be more error-prone. So if Twitter is newly logging the failed requests as API calls, but NodeXL doesn't count them when it's managing the rate limit, then it could cause the error message we're seeing.
Aug 20, 2013 at 4:18 AM
Chris:

Please make sure that the clock on your computer is set accurately. One guide for doing that is "How To Synchronize Windows Clock With Internet Time Server" at http://www.guidingtech.com/3119/windows-clock-sync/ .

If the problem still occurs when your clock is set accurately, then I am stumped for the moment. I do not have a solution.

-- Tony
Aug 20, 2013 at 4:23 AM
John:

NodeXL doesn't do that kind of counting; it's Twitter that counts the requests and enforces the rate limits. NodeXL merrily makes requests until Twitter tells it to pause, at which point NodeXL pauses until the time specified by Twitter (plus 15 seconds, as of version 1.0.1.248), then resumes making requests.

The problem that Chris is seeing (and that you were seeing before) is that Twitter is refusing further requests after the pause time.

-- Tony
Aug 20, 2013 at 4:57 AM
And that's why I'm no developer. :-)
Aug 20, 2013 at 4:41 PM
That doesn't seem to work I'm afraid.

Thanks anyway Tony. :)