Vista SP1 vs. XP SP2 - Part Deux

Following the benchmarks I carried out last week, I decided that the PC Doc HQ lab rats needed to pull a few all-nighters and carry out some more benchmarking tests on Vista SP1 and XP SP2.  Is it possible to determine conclusively which is the faster, more responsive OS?

One of the main criticisms of my initial benchmark tests was that I overlooked the fact that, under the hood, Windows Vista SP1 handles file copies differently from XP SP2.  The difference is down to caching: Vista carries out uncached I/O, while XP caches the files.  On top of that is the further complication that the file copy progress dialogs are coded to work differently.  Vista's file copy dialog box goes away only once the data has been committed to disk, while under XP the copy dialog goes away while the committal is still pending.  In other words, XP is coded to appear fast.  Mark Russinovich has the details:

Perhaps the biggest drawback of the algorithm, and the one that has caused many Vista users to complain, is that for copies involving a large group of files between 256KB and tens of MB in size, the perceived performance of the copy can be significantly worse than on Windows XP. That’s because the previous algorithm’s use of cached file I/O lets Explorer finish writing destination files to memory and dismiss the copy dialog long before the Cache Manager’s write-behind thread has actually committed the data to disk; with Vista’s non-cached implementation, Explorer is forced to wait for each write operation to complete before issuing more, and ultimately for all copied data to be on disk before indicating a copy’s completion. In Vista, Explorer also waits 12 seconds before making an estimate of the copy’s duration and the estimation algorithm is sensitive to fluctuations in the copy speed, both of which exacerbate user frustration with slower copies.
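
The gap between "the write call returned" and "the data is on disk" can be seen in a few lines of Python. This is only a rough sketch and not how Explorer's copy engine works: it assumes that an ordinary buffered write stands in for the cached path, and that a write followed by `os.fsync` stands in for waiting until the data is committed. The 8 MB payload size and the file names are arbitrary choices for illustration.

```python
import os
import tempfile
import time


def timed_write(path, payload, wait_for_disk):
    """Write payload to path and return the elapsed time in seconds.

    wait_for_disk=False: the timer stops once the OS has accepted the data
    (it may still be sitting in the OS cache).
    wait_for_disk=True: os.fsync is called as well, so the timer stops only
    after the OS reports the data as written to the storage device.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
        if wait_for_disk:
            f.flush()
            os.fsync(f.fileno())
    return time.perf_counter() - start


payload = os.urandom(8 * 1024 * 1024)  # 8 MB of test data (arbitrary size)

with tempfile.TemporaryDirectory() as tmp:
    cached_path = os.path.join(tmp, "cached.bin")
    synced_path = os.path.join(tmp, "synced.bin")
    cached_secs = timed_write(cached_path, payload, wait_for_disk=False)
    synced_secs = timed_write(synced_path, payload, wait_for_disk=True)
    cached_size = os.path.getsize(cached_path)
    synced_size = os.path.getsize(synced_path)

print(f"cached write returned after {cached_secs:.4f}s")
print(f"synced write returned after {synced_secs:.4f}s")
```

Both files end up identical; the only difference is when the caller is told the job is done. The actual timings depend heavily on the drive and the OS, so treat any numbers this prints as illustrative only.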

OK, so if this is the case XP has tricked us into believing that file copy is fast by sacrificing reliability for perceived performance.  In that case let's put copy speed on one side for a while and try to look at this problem from a different angle.  Let's consider responsiveness during large file transfers.  OK then, let's benchmark the system while it's performing file copy operations.

The Test

This time we're going to combine synthetic benchmarks with real-world operations.  To put that in simpler terms, we're going to carry out a background file copy of multiple files (2,000 files in 40 folders, 3.8GB) while running a PassMark PerformanceTest 6.1 benchmark.

Note:  This system is not representative of the kind of PCs around when XP SP2 was released.  Also, this system predates the release of Vista by a few months.

The test system is the same system detailed here.

Note:  With PassMark ratings, the higher the number, the better the score.

First, a baseline was established by benching the systems under no load.  We used our usual number crunching method of making four separate benchmark runs, remembering to reboot, defrag, and force the processing of idle tasks between each run.  We then examined the four results, discarded the lowest score and averaged the remaining three.
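
For anyone wanting to reproduce the number crunching, the "discard the lowest, average the rest" step can be sketched as below. The function name and the four example scores are made up for illustration; they are not the raw run data from these tests.

```python
def benchmark_average(scores):
    """Drop the single lowest score and return the mean of the remaining ones."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to discard one")
    kept = sorted(scores)[1:]  # sorted ascending, so index 0 is the lowest run
    return sum(kept) / len(kept)


# Hypothetical ratings from four runs (NOT the article's raw data).
example_runs = [100.0, 104.0, 96.0, 102.0]
print(benchmark_average(example_runs))
```

Discarding the lowest run is a simple guard against a single run being dragged down by stray background activity.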

Note:  The only change that was made to Windows Vista SP1 for this test was to disable Windows Defender.  Testing indicates that the effect that Windows Defender has on file copy in Vista SP1 is negligible, but to satisfy the critics I chose to disable it for this test.

Using this method we ended up with the following PassMark ratings for the systems.

Average PassMark ratings:

  • XP SP2: 509.0
  • Vista SP1: 469.5

[Screenshots: no-load PassMark ratings, XP SP2 results on the left, Vista SP1 on the right.]

Based on these results from the synthetic benchmark, XP SP2 gets a PassMark rating that's 8.4% greater than Vista SP1.
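
The percentage lead is simply the ratio of the two averages minus one. A minimal sketch using the two no-load averages listed above (the function and variable names are my own):

```python
def percent_lead(higher, lower):
    """Return how much larger `higher` is than `lower`, as a percentage of `lower`."""
    return (higher / lower - 1.0) * 100.0


xp_no_load = 509.0      # average PassMark rating, XP SP2, no load
vista_no_load = 469.5   # average PassMark rating, Vista SP1, no load

lead = percent_lead(xp_no_load, vista_no_load)
print(f"XP SP2 lead with no load: {lead:.1f}%")
```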

The Results

With a baseline established, we carried out another set of benchmark runs while the file copy operation was running.  This is a pretty aggressive test of responsiveness, but since sluggishness would result in a drop in the final PassMark rating (and the greater the sluggishness, the greater the drop in the rating), we decided to give this a try.

Eight runs later, the data was collected. 

Here are the PassMark ratings averages for both systems under copy load.

Average PassMark ratings, both systems under copy load:

  • XP SP2: 490.1
  • Vista SP1: 384.4

Based on these results, under copy load XP SP2 achieves a PassMark rating that's fully 27.5% better than Vista SP1.

However, using the baseline that we gathered earlier, we can see the effect that the file copy load has on the benchmark rating for each OS:

  • Under load, XP SP2 achieves a PassMark PerformanceTest rating that's 3.7% less than the OS under no load.
  • Under load, Vista SP1 achieves a PassMark PerformanceTest rating that's 18.1% less than the OS under no load.
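
These drops are measured against each OS's own no-load baseline rather than against the other OS. A short sketch of that calculation, using the full-run averages listed above (the names are my own):

```python
def percent_drop(baseline, loaded):
    """Return how far `loaded` falls below `baseline`, as a percentage of `baseline`."""
    return (baseline - loaded) / baseline * 100.0


# (no-load average, under-copy-load average) for the full PassMark runs
full_runs = {
    "XP SP2": (509.0, 490.1),
    "Vista SP1": (469.5, 384.4),
}

full_drops = {name: percent_drop(base, loaded) for name, (base, loaded) in full_runs.items()}
for name, drop in full_drops.items():
    print(f"{name}: {drop:.1f}% lower under copy load")
```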

However, oddly enough, Vista SP1 felt more responsive to user input, such as opening applications and saving files, while the tasks were being performed (we tried this out on separate runs). The problem is that it's darn hard to measure this end-user responsiveness without relying more on synthetic benchmarks.

However, let's see what we can do to clear things up ...

More Tests, More Results

Looking at the overall results, it was clear that what was pulling down the scores of both operating systems the most was the disk activity.  This gave us an idea.  We carried out another set of PassMark PerformanceTest runs, but this time, rather than doing a full run, we excluded all of the hard disk tests (thereby eliminating much of the effect of the copy operation on the benchmark).

First, we established a baseline rating under no copy load with the disk test excluded.

Average PassMark rating, no copy load with all disk tests excluded:

  • XP SP2: 450.5
  • Vista SP1: 403.3

Based on these results, XP SP2 achieves a PassMark rating that's 11.7% better than Vista SP1.

We then placed the system under the same file copy loads and ran the tests again.  Number crunching these results in the usual way gives us the following results.

Average PassMark rating, under copy load with all disk tests excluded:

  • XP SP2: 389.3
  • Vista SP1: 369.0

Number crunching these results, the lead that XP SP2 has over Vista SP1 has fallen to 5.5%.

Rather than compare the two operating systems directly, let's compare these under-load results with the no-copy-load baseline, again with the disk tests excluded.

The change from the baseline score when excluding PassMark disk test scores is as follows: 

  • Under load, XP SP2 achieves a PassMark PerformanceTest rating that's 13.6% less than the OS under no load.
  • Under load, Vista SP1 achieves a PassMark PerformanceTest rating that's 8.5% less than the OS under no load.
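
As a sanity check, the same drop can be recomputed from the partial-run averages listed above, and the OS that loses the least picked out. This is a sketch with names of my own choosing; because the published averages are rounded to one decimal place, the last digit may differ slightly from a figure worked out on the raw run data.

```python
def percent_drop(baseline, loaded):
    """Return how far `loaded` falls below `baseline`, as a percentage of `baseline`."""
    return (baseline - loaded) / baseline * 100.0


# Averages from the partial runs (all disk tests excluded).
partial_runs = {
    "XP SP2": {"no_load": 450.5, "copy_load": 389.3},
    "Vista SP1": {"no_load": 403.3, "copy_load": 369.0},
}

partial_drops = {
    name: percent_drop(r["no_load"], r["copy_load"]) for name, r in partial_runs.items()
}
least_affected = min(partial_drops, key=partial_drops.get)

for name, drop in partial_drops.items():
    print(f"{name}: {drop:.1f}% lower under copy load")
print("Least affected by the copy load:", least_affected)
```

With the disk tests out of the picture, the ordering flips compared with the full runs: Vista SP1 is the OS that gives up less of its baseline score.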

For those wanting to take a more detailed look at these test results, I've uploaded detailed outputs.

Conclusions

This is far more complicated than I'd hoped it would turn out to be; however, the results are interesting.  Let me summarize the results here:

  • Tested using PassMark PerformanceTest 6.1, XP SP2 consistently achieves a higher rating than Vista SP1.
  • Under file copy load, XP SP2 consistently achieves a higher rating than Vista SP1.
  • When running a partial PassMark PerformanceTest run (a run where all disk tests are excluded), XP SP2 again achieves a higher rating than Vista SP1 when under copy load and under no copy load.
  • However, if you look at the effect that file copy has on a partial PassMark PerformanceTest run, we see that the file copy operation has less of a detrimental effect on the overall rating on the Vista SP1 system than on the XP SP2 system.

So, what this long-winded series of tests shows is that heavy file copy operations have less of an effect on overall responsiveness when running Vista SP1 than when running XP SP2 (on the test system, all things being equal).

This benchmark, along with the one I posted last week, goes to show how unsatisfying it can be to benchmark one OS against another.  Even when you're dealing with one system, there are a huge number of factors to contend with.

Later in the week I hope to have a set of results that are far more conclusive and convincing - I'll be testing each operating system and seeing which can deliver the best frame rates in some of my favorite games.

Stay tuned!

Thoughts?

Talkback

180 comments
  • BackGround Copies

    Would a test using .NET or vbscript FSO be a better test?
    This would give the times of the copies, and also provide raw transfer measurements.
    merio74
  • RE: Vista SP1 vs. XP SP2 - Part Deux

    xpsp2 - if you can get drivers -
    wait for xpsp3 to go mainstream, that will fix this :-)
    cynic
  • Use DOS if you need performance

    Put your favorite OS on a machine with an Intel 80286 processor, 640K RAM and a 1M hard drive. Then do some benchmarks of your pet OS vs. DOS to see which wins.
    pa2004
    • Only problem with that is...

      ...most of the software available for DOS anymore is DJGPP.
      John L. Ries
    • DOS - WHICH ONE ?

      but alas, d-opus shows some difference too, even on vista :P
      llval@...
      • dos which one

        yes which one and how to get a copy
    • I have done comparison with Win95

      A long time ago, I compared copy, download and other things of DOS to Win95. Win95 won big time.

      In fact DOS games actually ran faster running under Win95

      Nowadays you couldn't do this comparison. My 0 wait state 12 MHz 286 had 90 Mbytes of RLL drive interleaved 1:1, and its transfer rate would be many orders of magnitude slower than drives today.

      Applications are much larger because there is much more functionality. With that functionality is the need for more horsepower.

      Today, I can easily run 4 different compilers, firewall, AV, network share, print, download software, couple of copies of Word, and a spreadsheet, on dual monitors all at the same time.
      Also MKS, Notes, text editor. Sometimes Messenger
      DevGuy_z
    • Why not do this ...

      ... on a brand new 2GHz box with 2GB of RAM?

      DOS will, of course, win, but just look at all of the things you CAN'T do with DOS. No WYSIWYG! No Internet!

      It's not just about raw performance. It is about personal productivity.
      M Wagner
      • I don't know

        I'm not a tech but a tinkerer, and I don't know how Win95 or Win98SE, which I had on a similar new system, could utilize that much memory. I had memory popup errors all the time running 98SE with 1 gig of memory. Once I switched to XP SP1a, no more memory issues. Should I have been using a third-party memory manager?
        Joerobyte
  • Well I imagine..

    That XPSP2 will STILL be faster than VistaSP1 regardless of how modern the hardware is...
    Bozzer
  • Good point

    "I would like to see your tests done on the minimum spec systems for each OS (as you yourself suggested in your reply to me)."

    That benchmark is in the pipeline ... hopefully!

    "I do think it's interesting that responsiveness was affected less under Vista - surely this is a good thing, and the ability to keep using the PC more efficiently is definitely worth the trade off of a few seconds of file copy time."

    Yes, it is a good thing!

    "Finally, can you make it clear that you are testing XPSP2 on a system that wasn't even around when SP2 was first released."

    This system was built before Vista was released ...

    I'll add a note to the main article.
    Adrian Kingsley-Hughes
    • Thanks!

      Sorry about being pedantic - as a scientist I always get very twitchy about doing "fair testing" ;)
      Ben_E
    • Good point, really?

      This is a very interesting point, but it defeats the whole purpose of your benchmarks.

      If we follow this line of thinking we will be asking ourselves:

      "What was Vista made for?"

      I think it was security and eye candy. XP badly needed these two things.

      Microsoft owed their users a better response, so Vista was born. There is a big problem with the price tag Vista has, as there are ways to beautify XP a little more, and security may be enhanced by third-party packages without this performance sacrifice.

      A good benchmark would be XP made to look like Vista (there is a freeware program out there that does it) with a good security solution; then we might start asking Microsoft whether the six years it took to produce Vista were spent giving users something expensive, slower and worse than XP. But this is a bitter way to look at things.

      Vista has to be slower than XP because it is heavily based on XP with more things added, and Microsoft has written a dogma that says "every technological breakthrough will come with a hardware cost".

      But these benchmarks you are doing are proof that maybe we don't want to pay this time.
      TristanGrimaux
  • Minimum specs

    What a stupid comparison. The purpose of these tests is to gauge real-world
    performance, not some theoretical laboratory set up.

    In fact, Adrian should simply do some stopwatch tests of his normal workflow and
    report which OS does better, because, in the end, that's what it really boils down to.

    All that's going on here is a geeky version of FPS in a Quake 4 test.
    frgough
    • In the end...

      ...aren't these all "stupid" tests?

      File copy performance doesn't really matter to some people - I certainly don't do enough of it for it to bother me.

      What I was pointing out was that the test, as performed, was unfairly biased on the hardware side in favour of XP. By stripping the hardware down to the lowest possible for each OS we see which one scales best and takes the performance hit the least. By the sounds of things from this test, XP was ahead of Vista on hardware that's four years further on from SP2 (but still not as new as Vista), and Vista gave some ground, but not as much as I would have expected given the hardware disadvantage. I'm guessing Vista will scale better the lower things go from what Adrian said, but I'd like to find out for real!

      As with all things, if you think these tests are pointless, don't bother reading up on them. They're not hurting you, but they are quite interesting in their own way.
      Ben_E
      • that would be totally pointless

        by running the tests on a system which is newish compared to XP but oldish compared to Vista, he's testing it on a typical system. Sounds like a fair test machine to me. who cares about the minimum spec systems?
        lostarchitect
        • This is not a test about "typical spec"

          it's a test about which is faster. You can either do this as based on minimum spec systems or on the system that was the baseline standard at the time of release (thinking about it, that is probably even better). Either way, you are then testing like for like in a typical usage scenario.

          But most people will get Vista on a new machine - how many Pentium D powered machines get sold with Vista on? A quick search turns up precisely zero.
          Ben_E
          • Maybe not Pentium D powered, but...

            OEMs are still selling Celeron single-core laptops with Vista on them to people looking for the cheapest machine they can find.

            And I believe during 2007, eMachines was selling Vista pre-loaded on to Pentium M desktops, if I'm not mistaken.
            hasta la Vista, bah-bie
          • i don't get your logic.

            how would that test be valuable to any user?
            lostarchitect
          • Added-value

            It would provide a more complete picture about the scalability of the OS. Adrian's results already hint that Vista seems to multi-task during file i/o better by retaining more responsiveness. What happens when you do this on bargain basement hardware? It seems to be the general consensus on these boards that that is what Vista is being purchased on.
            Ben_E