Microsoft IE9 SunSpider JavaScript results raise questions

A Mozilla engineer has uncovered an oddity with Microsoft's Internet Explorer 9 where one test that is part of the SunSpider JavaScript benchmark gives an odd, unexpected result.
Written by Adrian Kingsley-Hughes, Senior Contributing Editor

Mozilla engineer Rob Sayre set about benchmarking Firefox 4.0 beta against a selection of other browsers and found that IE9 was about ten times faster than the other browsers on one particular test (the math-cordic test): IE9 completed it in around 1ms, while Chrome and Opera took around 10ms.

Curiosity piqued, Sayre did some further investigating:

One last issue that can crop up has to do with over-specialization for a specific test. While I was running the SunSpider tests above, I noticed that IE9 got a score that was at least 10x faster than every other browser on SunSpider's math-cordic test. That would be an impressive result, but it doesn't seem to hold up in the presence of minor variations. I made a few variations on the test: one with an extra "true;" statement (diff), and one with a "return;" statement (diff). You can run those two tests along with the original math-cordic.js file here.

All three tests should return approximately the same timing results, so a result like the one pictured above would indicate a problem of some sort.
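To make the kind of variation Sayre describes concrete, here is a minimal sketch. The function bodies are a hypothetical stand-in, not the math-cordic.js source, and the placement of the extra statements is an assumption; the names `original`, `withTrue`, `withReturn`, and `timeMs` are invented for illustration. The point is only that all three functions do the same work and discard the same result.

```javascript
// Hypothetical stand-in kernel. This is NOT the math-cordic.js source; it only
// mimics the shape: a loop whose result is never read by the caller.
function original(steps) {
  let acc = 0;
  for (let i = 0; i < steps; i++) {
    acc = (acc + (i >> 1)) | 0;
  }
}

// Same loop plus a bare "true;" expression statement, which has no effect.
function withTrue(steps) {
  let acc = 0;
  for (let i = 0; i < steps; i++) {
    acc = (acc + (i >> 1)) | 0;
  }
  true;
}

// Same loop plus an explicit "return;", equivalent to falling off the end.
function withReturn(steps) {
  let acc = 0;
  for (let i = 0; i < steps; i++) {
    acc = (acc + (i >> 1)) | 0;
  }
  return;
}

// Crude wall-clock timer: calls fn(steps) `runs` times and returns elapsed ms.
function timeMs(fn, steps, runs) {
  const start = Date.now();
  for (let r = 0; r < runs; r++) {
    fn(steps);
  }
  return Date.now() - start;
}

for (const fn of [original, withTrue, withReturn]) {
  console.log(fn.name + ": " + timeMs(fn, 25000, 100) + "ms");
}
```

On an engine that treats the three functions alike, the three timings should land close together. A large gap between the first and the other two is the kind of result that prompted the question.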

This effect shows up nicely in the raw benchmark test results I carried out last week. Notice how consistent the math-cordic test results for IE9 are.

So what could be behind this? Three possibilities spring to mind:

  • A bug in the JavaScript engine
  • Deliberate optimization for the SunSpider test
  • Accidental optimization for the SunSpider test

Can we put this down to cheating, as suggested by Digitizor (and later picked up by Slashdot)? Without access to the code, it's impossible to be sure. The effect of this one aberration is quite small: changing the math-cordic value from 1ms to 10ms in the tests I ran only worsens the SunSpider score from 394.7ms to 403.7ms per run. But this is just one result out of many. It depends on whether there are other, more subtle, optimizations in there.
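The arithmetic behind that adjustment is simple enough to check by hand. The sketch below uses only the figures quoted above; the variable names are invented for illustration.

```javascript
// Figures taken from the article; variable names are illustrative only.
const reportedTotalMs = 394.7; // SunSpider total with math-cordic at 1ms
const measuredCordicMs = 1;    // what IE9 reported for math-cordic
const typicalCordicMs = 10;    // roughly what the other browsers took

// Swap the 1ms result for a 10ms one and see how far the total moves.
const adjustedTotalMs = reportedTotalMs - measuredCordicMs + typicalCordicMs;
const relativeChange = (adjustedTotalMs - reportedTotalMs) / reportedTotalMs;

console.log(adjustedTotalMs.toFixed(1) + "ms");          // 403.7ms
console.log((relativeChange * 100).toFixed(1) + "%");    // 2.3%
```

In other words, this single test shifts the overall score by only a couple of percent.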

I'm not ready to call this a cheat yet, but it's certainly fishy. Even if there is some degree of optimization, I'm more likely to believe that it's accidental rather than deliberate. The consistency of the result on IE9 is odd: across multiple machines I get the same score of 1ms, which is not something I'd expect to see. Add to that the fact that Sayre's changes to the benchmark code should make no functional difference, yet produce wildly different results, and the picture gets odder still.

Sayre has submitted this as a "bug" to Microsoft.

The takeaway: Benchmarks are odd, fickle things. Put too much faith in the numbers and you can no longer see the wood for the trees.

[UPDATE: It seems some commentators find it hard to see the wood for the trees and wonder why anyone would be suspicious of this test result. I understand the need for nerd-rage diplomacy when dealing with anything Microsoft (pro or anti) but this is about the facts at hand.

Let me sum them up for you:

- Result is consistent across multiple runs (1ms, no variation)

- Result is consistent across multiple platforms - I've run the test on several systems and get a 1ms +/- 0.0% with each and every run.

- The changes made to the code don't functionally change the code. It's hard to put the changes into context, but it's like a car's lap time varying by a factor of ten simply because the wording on the starting line changed; that's fundamentally the difference between the JavaScript functions that Sayre tested. It's hard to understand how these changes could add 20ms to the result. Hacker News (via Digitizor) offers some insights as to how this could be a bug.

So it could be a bug, or could be a feature. Either way it's an inconsistency in the code that needs attention just in case it has implications elsewhere. There are many people who make serious business decisions based on benchmark results.]

Ed note: Changed headline of post.

Microsoft had the following to say about the issue:

One of the changes we made to the IE9 JavaScript Engine, codenamed Chakra, to improve performance on real world web sites involves dead code elimination.  Yesterday afternoon, someone posted a question (“What sorts of code does the analysis work on, other than the exact [math-cordic test] function included in SunSpider,”) on the Microsoft Connect feedback site.

Briefly, the IE9 JavaScript engine includes many different changes to improve the performance of real-world Web sites and applications. You can see this in action by visiting www.ietestdrive.com and trying the samples there with IE9 and other browsers. The behavior of the IE9 JavaScript engine is not a “special case optimization” for any benchmark and not a bug.

Some of the optimizations we’ve made to the JavaScript interpreter/compiler in IE9 are of a type known in the compiler world as dead code elimination. Dead code elimination optimizations look for code that has no effect on a running program, and removes the code from the program. This has a benefit of both reducing the size of the compiled program in memory and running the program faster.

The company said it will follow up with more on its IE blog.
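For readers unfamiliar with the term, the following is a hand-written illustration of what dead code elimination means in general. It is not a description of what Chakra actually does internally, and the function names are invented for this example.

```javascript
// Before: the loop computes `sum`, but nothing ever reads it. The work has no
// effect on the function's result or on anything outside the function.
function withDeadCode(n) {
  let sum = 0;
  for (let i = 0; i < n; i++) {
    sum += i * i; // dead: `sum` is never returned, stored, or printed
  }
  return n + 1;
}

// After: what an optimizer is allowed to reduce the function to, provided it
// can prove the removed code has no observable side effects.
function afterElimination(n) {
  return n + 1;
}

console.log(withDeadCode(5) === afterElimination(5));
```

The two functions are indistinguishable to a caller except in how long they take, which is why this kind of optimization can make a benchmark that discards its results look dramatically faster.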

[UPDATE: Microsoft has now attributed this anomaly to dead code elimination, but this explanation still doesn't account for the fact that the three functions tested by Sayre (and I've run them myself) all include the same amount of dead code. Adding true and return statements to the function doesn't in any way change the amount of dead code the JScript engine has to process.

Why the JS engine gives different results for what is functionally the same code with the same amount of dead code is still very interesting and worthy of discussion. The fact that the piece linked to on /. made wild unsupported accusations doesn't change the fact that there's something interesting going on here. Like I said, it's highly unlikely to be cheating but if dead code elimination can achieve such good results for the code used for the math-cordic test, this performance should be translated to the variants using the true and return statements too.]

[UPDATE 2: I've been experimenting with a SunSpider deadcode fork (https://github.com/cheald/SunSpider-deadcode) and the results show that IE 9 does indeed carry out dead code elimination, but that it only seems to kick in under certain circumstances.

There's a lot of dead code in JavaScript out there, and your browsers process countless lines of it daily. If Microsoft can make this work more generally, then it would be great stuff all round.

There's a lot of good discussion here on different code thrown at IE9 being handled differently ... http://apps.ycombinator.com/item?id=1913368]
