Why all the errors in Microsoft updates lately?

Summary: September makes 3 months in a row that Microsoft has issued buggy patches, 3 of which had to be pulled from distribution. Perhaps Microsoft has too many products to have one patch cycle.

TOPICS: Security, Windows

About a month ago I wrote a column celebrating the great things that Patch Tuesday has done for customers and the industry. I still believe in it, but I couldn't have picked a worse time to write it. In the weeks that followed, Microsoft customers have experienced a reign of error under Windows Update.

A few days after my column appeared, Microsoft was forced to withdraw two August patches, beginning with a patch for Outlook Web Access in Exchange Server. The buggy code in this patch turns out, ironically, to have been written by Oracle, but that's neither here nor there: Microsoft delivered it as part of its product and it caused problems on Exchange Server 2013. The second patch they withdrew was for ADFS (Active Directory Federation Services), but they re-released it a few days later.

The Exchange Server update wasn't re-released until late in August, at which point they also re-released a separate July patch for Windows Media Services that had not been withdrawn.

Anyone can have one bad month, I guess, but it didn't end there. Yesterday Microsoft pulled a buggy non-security update to Outlook 2013. They explained the problem and what was happening in a Technet blog entry, but it's still not over.

There was at least one more buggy patch in September, described in this Microsoft support forum and this Technet thread. The problem seems to be related to the patch for MS13-074, a security update for Access. I was a victim of this one. The first thing I saw was that I couldn't load any Office (2013) apps. I got the same unhelpful "something went wrong" error message.

The problem most users report is that, even after installing the patch, Windows Update reports that it is not installed. Even if you manually install the standalone version of the patch, which appears to install correctly, Windows Update still reports that you need to install it. Go to Programs and Features and look at the installed updates and you'll see the update there (designated by its KB number, KB2810009). You can uninstall it and try again, but it won't make a difference. I wasn't able to run Office programs again until I used System Restore to revert the system back to pre-Patch Tuesday. I haven't seen a response from Microsoft on this one.

Two bad months in a row? And not too long ago, in April, Microsoft had to call on Windows 7 users to uninstall an update that was crashing systems. This level of quality is atypical.

It's not that Microsoft doesn't care. They put tremendous resources into updating their software. I asked about this latest pattern and Dustin Childs, Group Manager, Microsoft Trustworthy Computing, replied: “The quality of security updates is critical to our customers, and it is a high priority for us too. We are actively looking at where improvements can be made with the goal of reducing implementation issues, and we will remain transparent with our customers about security threats, protections and update issue resolution.” Below this article is an embedded video about Microsoft's security updating process featuring Childs.

I went to do some research on previous problematic updates in the last several years. It's not easy. Based on my fishing expedition through Technet, I'd say that clusters of errors like this have been very rare in recent years. There was one in April 2011, detailed in entries on the Microsoft Office Updates blog ("The official blog of the Office Sustained Engineering and Release team").

Without the updates and their versions in some structured database it's hard to go much further, and I haven't been able to get Microsoft to say any more on it. But I have a theory.

September was a particularly busy month for Microsoft patches. There were 13 security bulletins covering 47 vulnerabilities. In fact, there was a 14th bulletin that was withdrawn in the final days for QA reasons; this last point both adds to the concerns about quality and shows that Microsoft does put some effort into it and is willing to hold back on a patch.

But there were also numerous non-security updates; the details of all updates, security and non-security, for the year 2013 are in this support article. The list includes the monthly Windows Malicious Software Removal Tool.

Or is it really all the updates? The first botched update this month, the one that messed up Outlook 2013, isn't on the list of non-security updates, and yet it shows up in the list of installed updates under Programs and Features.

And there's even more than that. If you follow Patch Tuesday as closely as I have over the years, you notice that there is plenty of credit given to outside researchers for reporting vulnerabilities, but rarely if ever do they say that Microsoft found one internally. That's because those vulnerabilities are often patched silently.

The numbers are large not because of any general quality deficit at Microsoft, but because they have so many products. Microsoft is not the only company that patches software and not the only one that has had embarrassing outcomes from buggy patches. Firefox 16 had to be yanked off the download servers because of a really bad vulnerability. In 2010 McAfee released a virus definition update with a false positive so bad that it rendered Windows XP SP3 systems unusable. The company actually ended up paying for repairs to customer systems. Microsoft has never had anything go that wrong. But these things happen; software is really complex, nobody is perfect, and some percentage of the inevitable errors are inevitably bad ones.

I'm sure Microsoft puts enormous resources into keeping the update monster happy and well-fed, but perhaps it's just too big now. I see signs recently that the burden is too great. All of the buggy updates - and this is basically by definition - were insufficiently tested. I wouldn't be surprised if having all those products shooting for the 2nd Tuesday of the month was causing scheduling conflicts in testing and giving short shrift to some tests. I've done professional testing for a long time and I know that a proper test can be quite time-consuming.

And perhaps one day a month is not often enough. Patch Tuesday was inaugurated as a once-a-month event because IT wanted to be able to plan to dedicate resources at a specific date and time. I suspect that the update machinery is well enough defined that two per month would not be a great burden.

In fact, Microsoft already does have two Patch Tuesdays a month! The company also releases updates on the 4th Tuesday of the month, but only updates for performance, reliability and application compatibility. It's not a secret - that same support article with all the update definitions has the second Patch Tuesday updates in it too. So maybe even more are warranted, perhaps organized by product family. It's only the critical security vulnerabilities for which IT needs to be ready to move ASAP, and one of those a month is enough.

I'm guessing here, but I do think that it must be on senior management's radar as a big problem to address. I have no doubt that they're getting lots of complaints, and not just from nobodies posting on support threads, but from large corporate customers, the ones Microsoft listens to carefully. This problem can't be allowed to continue.


Behind the Curtain of Second Tuesdays: Challenges in Software Security Response (from Microsoft's Channel 9)

Talkback

102 comments
  • Yes, MS has had some go that bad...

    The one I remember caused the system to go into a continual reboot. The only recovery was reinstallation.

    Things have gotten better since then, but they still have a lot.

    Partly, I think, it is caused by too many patches - which prevents them from doing a system-wide validation. That means having every software combination checked. Doing that would just take too long, and they don't have the support staff needed to do that.

    There is also the combination of patches and drivers. MS again hasn't the staff to produce/validate all drivers... and how they interact with all applications.

    They try. And I'm sure the staff gets blamed for a lot of failures that they have no control over.
    jessepollard
    • Wasn't the reboot due to already being infected with a root kit?

      I seem to remember one like that.
      ye
    • I suspect the root cause is poor coding in the

      parent software. If your main program is full of spaghetti, it becomes really, really difficult to patch it without breaking something. I've always thought MS made a big mistake when they didn't take the opportunity with Windows 7 of pulling an OS X: Completely re-write from scratch using modern programming tools and methodologies and run older software in a "classic" XP environment.
      baggins_z
      • Microsoft could never do what Apple did...

        Apple only had a half-dozen software titles that were really *really* important to keep working during the transition between Classic Mac OS and Mac OS X, and none of those titles were enterprise-level software. They also did it at a time when the majority of their customers weren't using their machines on a corporate network for anything other than file-sharing. Software breaking during their transition wasn't a big deal.

        Microsoft, on the other hand, had (and has) hundreds or even thousands of titles that *need* to work on every version of Windows, and upgrading many of these titles can cost hundreds of thousands of dollars to their customers. The result of a tear-out like that would be suicide.

        Couple that with the fact that the most significant improvements Apple gained with the move to Mac OS X (preemptive multitasking, multi-user support, protected memory, software portability) were already present in Windows NT. Many of the "tools and methodologies" that you allude to were also already present there.
        daftkey
    • @Jessie

      Totally agree with this, they also have 4 OS's to deal with. The techs do all they can but the problem is, MS is releasing too many OS's at once without dropping support. They should have dropped XP support when W7 came out.
      spineshank155
  • Quality Problem

    "I wouldn't be surprised if having all those products shooting for the 2nd Tuesday of the month was causing scheduling conflicts in testing and giving short shrift to some tests."

    If the quote is reasonably accurate then the problem is MS's insistence on holding to "Patch Tuesday" ready or not. This is easily fixed by releasing critical patches when they are ready, not on a schedule that must be obeyed. MS needs to realize they are dealing with security patches, bug fixes, and enhancements. Security patches should come out when ready and never be delayed when ready or pushed out until ready, period. Bug fixes are trickier but since security is not involved they could be released when ready on the next "Patch Tuesday". The exceptions are those bugs that appear to affect many users; this would be a judgment call. These bugs are released when ready. Enhancements should only be released when ready on the next scheduled "Patch Tuesday."

    Broken systems caused by bad patches probably cost most companies and users more than the mal-ware exploiting the vulnerability. Each Outlook 2013 user who needs to have the system rolled back or possibly even needing a fresh install of Outlook 2013 costs money.
    Linux_Lurker
    • Are you still arguing this?

      "This is easily fixed by releasing critical patches when they are ready not on a schedule that must be obeyed."

      Patch Tuesday was the result of CUSTOMER feedback. Deal with it. With that said a patch doesn't have to be released on any specific Patch Tuesday. If a patch isn't ready it can be released the following month.
      ye
      • "If a patch isn't ready it can be released the following month."

        The slippery slope with this (and possibly a good reason that we're seeing the problems we're seeing now) is that letting a patch slide into the following Patch Tuesday is also not something you want happening too often. I suspect that this is exactly what's been happening with some patches, and some product managers have been getting fed up with critical patches missing Patch Tuesday release windows.

        I think this is where the culture-clash comes in at Microsoft. We, as customers, expect fixes to - you know - fix things. Not break 'em. So we're okay with delays in a fix (especially a non-critical fix) if it means that a patch being released will actually fix things properly. But we also still expect that fix to be released in a fairly timely manner.

        The problem at Microsoft is that, like any other software development company, there are deadlines and budgets and management. There are also incentives (both positive and negative) for meeting deadlines and budgets. Teams that "do well" get the carrot. Those that "do poorly" get the stick. It is likely that a few of the teams involved in these patches have been getting "reminded" of those incentives recently, which has put pressure on the developers to "get it done" rather than "get it done right".
        daftkey
        • We don't really know how long it takes to release a patch.

          We don't know when a vulnerability is discovered by Microsoft (or reported to them if a third party made the discovery) and how long it took to patch it. We occasionally see public announcements to this effect so we have some limited data. But not enough to know on average how long it takes for Microsoft to develop and test a patch.

          I'm in agreement with Larry. The product matrix is very large. As I said, testing just four different operating systems (Windows XP, Windows Vista, Windows 7, and Windows 8, and soon to be Windows 8.1) is very complex given the huge amount of hardware / software Windows alone supports. Then throw in various application suites (such as Office 2003, Office 2007, Office 2010, and Office 2013) and you've got a huge amount of testing.
          ye
          • That's true, we don't know...

            ..so I was only speculating that this is at least part of the issue.

            Microsoft does know internally how long it takes to release patches. I do agree their large product matrix also means that there is a lot of work to go around at Redmond. I was basing my assumption on the facts that a) we haven't seen a string of problems caused by patches to this extent in a long time, so something must have changed and the problem is likely very acute, and b) Microsoft is a big company, and big companies have managers and employees with budgets and deadlines that need to be met. In any project when budgets and deadlines are becoming an issue, testing is often the first place where corners are cut.
            daftkey
          • Cutting corners

            If MS is cutting corners with testing that is ultimately "penny wise; pound foolish" because bad quality eventually alienates customers.
            Linux_Lurker
          • Very true...

            ..I never said it was a good idea - I just said that's what happens. It isn't just a Microsoft problem - this is the most common trend among all development companies.
            daftkey
        • You are NOT their customer!

          Not when it comes to things like this. It's those Fortune 500 companies that have thousands of Windows systems, which makes out-of-cycle releasing a nightmare. It takes testing and ensuring the patches won't stop mission-critical programs before they can be rolled out, and Patch Tuesday gives IT a set schedule to work with, whereas Joe and Jane Normal will get theirs automatically and frankly not know what day they come on.

          The ONLY excuse I can see for running an out-of-cycle patch is for a zero-day targeting the system in the wild; otherwise those of us that have to work on the systems would spend all our time doing tests instead of our main jobs.
          PC builder
          • And one more thing

            On top of that, you, the end consumer, are actually the beta tester. Indeed, this is the system working as it is meant to. Test your patches heavily, hope they are 100%, and release them. Big companies will not apply them right away, but the rest of us will go ahead and update ASAP. If there are bugs, we will experience them and complain to MS. MS will then *actually* get those patches to 100%, with all of our feedback, and then a few weeks later the big guys will go ahead and install them.
            x I'm tc
          • You know, there is a grain of accuracy there..

            "On top of that, you, the end consumer, are actually the beta tester. Indeed, this is the system working as it is meant to."

            It sucks to think of it like that, but you are probably right about that actually being the plan. With all its resources, Microsoft still couldn't test all possible combinations of hardware, software, and drivers with every patch - meaning that yes - bugs after patching will inevitably be found and fixed, prior to larger IT shops getting around to installing the patch.

            Having said that, these last couple of issues appear to be a case of not even testing a reasonable minimum of configurations before release to the public. That needs to change.
            daftkey
      • Customer Feedback

        MS took the easy route of not fixing the underlying problem: lousy code. Who was really complaining - the end users or the system admins? Patch Tuesday appears to cater to the latter, not the end users. Reboots are a pain but they are not the end of the world to an end user.
        Linux_Lurker
  • I'd speculate that employee turnover has caused some discontinuity and gaps

    Or it could be fatigue setting in. With all the reorgs and changes, MS's typical approach is ... Here's another load of work on top of what you're already doing so just work 90 hours a week. Of course they won't ever say work 90 hours because that would trigger labor laws... It's just, get it done now!!!
    greywolf7
  • thoughts

    "About a month ago I wrote a column celebrating the great things that Patch Tuesday has done for customers and the industry."

    Humm, I missed that.

    Honestly, monthly isn't fast enough. Patches *should* be released ASAP. If enterprises want to make their own schedules, fine, but don't screw over the non-enterprise customers in favor of the enterprise customers.
    CobraA1
    • The problem with having your own schedule is...

      ...once the patch is released malware authors will reverse engineer it and develop exploit code with knowledge learned from the patch, therefore putting one at risk.
      ye
      • Not sure how patch Tuesday mitigates this

        "once the patch is released malware authors will reverse engineer it and develop exploit code with knowledge learned from the patch."

        But doesn't releasing on a schedule still provide malware authors with the same tool? I guess going back to the "release on your schedule but let companies update at their own schedule" idea can create this kind of problem, since it is likely the patch will be released before companies' update schedules can catch up. On the other hand, even with Patch Tuesday, I am aware of many companies that still keep their own schedule - often letting server updates sit for weeks or months before they are installed (in the accounting world, it seems like every week of the month is month-end - except when it is year-end).
        daftkey