
Skype - Is 10,000 Page Faults per Second normal?

Written by J.A. Watson, Contributor

I am writing this as a follow-up to yesterday's Tech Talk about Skype. I mentioned this issue then, and I have received a number of requests for additional information about it. There have been some new developments today, as well. Those who want to review the complete information should read this thread in the Skype User Forums:

http://forum.skype.com/index.php?showtopic=98518

What disturbs me the most about this is not so much the actual "bug", or "problem", or "feature", whatever you decide to call it. I am much more disturbed by the way Skype has tried to ignore, suppress, deny, discredit, and avoid any discussion of it. Simply following a timeline of events, accompanied by statements from Skype and their direct or indirect representatives, is very enlightening:

Oct 3: A user discovers that Skype is producing 10,000 Page Faults per second on his computer. Assuming this is a bug of some sort, he reports it to the Skype User Forums asking for help. "Jamie Watson", who initially tried to help find the problem, is me.

Oct 10: A second user confirms the problem, reporting 9,000 Page Faults / second.

Oct 12: The user files a bug report on "Jira", the Skype Developer Zone bug tracking system.

Oct 16: Another Skype user reports the same problem, with 6,000 Page Faults / second.

Oct 17: The bug report is marked "Closed" by Raul Liive, who is listed as "Skype Staff" and apparently works in their development group. The comment he wrote was "This is by design".

Oct 30: Another Skype user files another Jira bug report on this issue, stating that it must have been a mistake or misunderstanding to close the initial report.

Nov 2: The second bug report is marked "Closed" by Raul Liive, with a statement that it is a duplicate of the original report. He gave no additional information or comment.

Nov 1-30: Numerous other Skype users report the same problem, including several who have been designated as "Super Users" in the Skype Forums. It becomes apparent that this is not an anomaly; it is happening on every Windows PC that is running Skype 3.5.0.229 or later, with a range of 700 to 10,000 Page Faults per second.

Nov 6, 05:40: Another user opened a Jira bug report about this problem, with quite a lot of additional information.

Nov 6, 08:40: THREE HOURS after the report is opened, it is marked "Closed" by Raul Liive, with the comment "We are aware of the high number of page faults and this is by design"

Nov 6, 11:40: Raul Liive posts the following comment to the Skype User Forum:

I would like to conclude some things in this topic:

* There are number of page fault generated by Skype and it is by design.
* There is no point in generating more than 1 issue report about the same thing, espesically when it is closed as designed.
* Unfortunately I can't explain things to you in detail why we have chosen such a design, but I can say that it does not have impact on computer performance.

Nov 6, 12:25: A Skype "Super User" is so incredulous at this information that he asks Liive to confirm that he did not misquote the Skype programmers.

Nov 6, 23:30: Raul Liive says "No I did not misquote anyone about this statement."

Nov 21: A user reports that the Visual Studio debugger attached to a running Skype process says that there is a steady stream of unhandled exceptions, "Access violation reading location 0xffffffff". This is starting to look like a programming error.

Nov 28: Another person reports that using Windows Process Explorer he was able to isolate the specific Skype process thread which is producing the huge number of Page Faults, stop it, and was apparently still able to use Skype without problem.

Nov 29: Two more users confirm the same information about the Skype process thread, and that stopping only that thread reduces Skype-produced Page Faults to fewer than 10 per second. It appears that these users might be starting to close in on what is causing these page faults and perhaps even why.

Nov 29, 18:35: A Skype User Forum Moderator, "Gladiator", asks a "Senior Skype Engineer" about this issue. He says the reply he got was "We actually don't understand what they are talking about". Strange, since the discussion has been going on for nearly two months, and Skype Development (in the person of Raul Liive) has closed at least three bug reports about it. The "Senior Engineer" goes on to try to pooh-pooh the whole issue, saying they don't know what tools were used for the measurement or what was considered a "Page Fault", but he can't imagine that it is anything other than normal operation.

Nov 29, 23:59: Gladiator reports that the "Senior Skype Engineer" now says "FYI, the guys indentified the thread that's causing the faults and found a way to fix this." So they have gone from "we don't know what you are talking about" to "we found and fixed it" in five hours, after two months of discussion and complaint?

Nov 30: Numerous Skype users and Super Users say that they would be glad if it were fixed, but after all this discussion, and all the scorn and disrespect from Skype Staff over it, they would like to know what it was and what it was doing.

Nov 30, 14:00: Gladiator says "it was there for a valid (simple) reason and was forgotten to be removed", but he is not allowed to say what that reason was.

Nov 30, 21:23: Gladiator says "The actual reaon the thread was implemented is actually luaghable, and it was for gotten to be removed". So the "pooh-pooh, it was nothing, you are all overreacting" campaign is well under way now, but accompanied by "we can't tell you what the real reason was".

Dec 1, 09:18: Gladiator says "the thread was put in to investigate another issue and basically forgotten about". So apparently Skype is more willing to have people believe that their programmers are careless, sloppy or incompetent? Anyway, he says, "All wells That Ends Well"??? It is difficult for me to imagine what could be construed as "ending well" in this fiasco.

Dec 1, 15:49: Another Skype Super User posts what turns out to be a prescient word of caution: "Not yet, i didn't see a new Skype client version yet"

Dec 1, 19:29: Gladiator says "engineering didn't mind me posting the reason the 'thread' was implemented, but suggested I check clearance with some one else. Frankly I couldn't be bothered". Couldn't be bothered? Something doesn't ring true about that, after two months of problems, discussion, accusations, denials...

Dec 6, 09:06: Raul Liive posts the following comment:

Giving you an update on the situation with page faults.

Thank you all for pointing out this issue and I must say that I was wrong in the beginning about it being fully by design. Some amount of page faults are indeed by design on any software application, but we are creating them lot more than we should be.

We have decided to have a second look on the issue and hopefully we will have a solution for this problem soon. --------

I am stunned. I have absolutely no idea what the truth about this entire mess is, and I strongly suspect that it will end up being exactly like the "Great Skype Outage" last August - we will never know the truth. Just to recap how the discussion has gone:

Users: There is a Page Fault Problem

Skype: It's not a problem, we designed it that way

Users: It's really a big problem

Skype: We told you already, we designed it that way

Users: Are you sure? This is really a big problem for us.

Skype: We are sure, and please stop talking about it.

Users: We have isolated the Skype process thread that is causing the Page Faults

Skype: We have no idea what you are talking about

Skype: We have found and fixed the problem

Skype: It was an oversight, something that we put in for debugging and forgot to take out

Skype: We have decided to take a second look, and hope to have a solution soon

Now, if it was "by design", as Liive said three times in closing bug reports, how did it suddenly become "unknown"? And if it was "unknown", how did it suddenly get found and fixed in less than five hours? And if it was either "found and fixed", or even if it was "an oversight that we forgot to take out", why does Skype now need to go back and "take a second look", and why would they "hope to have a solution soon" for a problem that a Senior Engineer has already said was "found and fixed"? As the prescient "Super User" said, this isn't over until we see a new release of the Skype client that is not producing thousands of Page Faults per Second.
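For readers who want to check a future release themselves, the per-second figures quoted throughout this thread come down to simple arithmetic: take two readings of a process's cumulative page-fault counter, subtract, and divide by the time between them. The following is only a minimal sketch of that calculation in Python. The names `FaultSample` and `faults_per_second` are my own, and the two readings are made-up numbers for illustration, not measurements from Skype; in practice the counter values would be read by hand or by script from a tool such as Process Explorer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultSample:
    """One reading of a process's cumulative page-fault counter (hypothetical type)."""
    timestamp_s: float  # when the reading was taken, in seconds
    fault_count: int    # cumulative page faults reported at that moment


def faults_per_second(earlier: FaultSample, later: FaultSample) -> float:
    """Average page-fault rate between two readings of the same process."""
    elapsed = later.timestamp_s - earlier.timestamp_s
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = later.fault_count - earlier.fault_count
    if delta < 0:
        # A cumulative counter should never decrease for a live process.
        raise ValueError("counter went backwards; the process may have restarted")
    return delta / elapsed


# Illustrative readings only: 50,000 new faults over 5 seconds.
before = FaultSample(timestamp_s=0.0, fault_count=1_200_000)
after = FaultSample(timestamp_s=5.0, fault_count=1_250_000)
rate = faults_per_second(before, after)
print(f"{rate:.0f} page faults per second")  # prints "10000 page faults per second"
```

Sampling over several seconds rather than one smooths out short bursts, which matters when comparing a client that idles at a few faults per second against one that sustains thousands.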

Something is rotten in the state of Estonia... what have they really been up to, and why are they trying so hard now to cover it up? As my father would have said, they are now working harder than a cat trying to cover its mess on a marble floor!

Now it's "Conspiracy Theory" time. One of the Skype "Super Users" has pointed out at least twice that starting with version 3.6, Skype is holding open an excessive number of network connections, and asked for an explanation of why that is. That has been met with complete silence from Skype. Looking at this through the prism of the recent revelations about Facebook, is it far-fetched to suspect that Skype is up to something nefarious? Their program is constantly doing something that has nothing whatsoever to do with its stated purpose (text/audio/video communication), they refuse to explain what it is doing, and they are holding open excessive, unexplained network connections that could easily be used for shipping information in or out.

I am starting to wonder if this is a company that I want to allow to put an unknown program on my computer any more, regardless of whether it is "sloppy" (Skype's current explanation) or "malicious" (my current suspicion).

Today is "Santa Claus Day" here in Switzerland, but I am no more willing to take any statement or explanation from Skype at face value than I am to believe in Santa Claus.

jw 6/12/2007
