Microsoft officials posted an apology on the evening of September 8 for three recent outages that affected customers of Microsoft-hosted cloud applications.
The first of the North American outages hit Business Productivity Online Suite (BPOS) customers in late August. Two more occurred in early September. (The most recent, which happened at the start of this week, appeared to center on Exchange Online, based on the customer reports I received.)
Morgan Cole, a director with Microsoft's Online Services team, posted the apology on Microsoft's Online Services Team blog and shared additional details about what caused some of these outages. Cole explained:
"Specific to the August 23 event: our proactive efforts to upgrade to next generation network infrastructure caused unforeseen problems that affected access to some services. Operations and Engineering quickly identified a design issue in the upgrade that caused unexpected impact, but the issue resulted in a 2-hour period of intermittent access for BPOS organizations served from North America.
"The August 23 event was remediated, but the solution did not resolve another underlying issue which created subsequent problems on September 3rd and 7th. BPOS customers experienced brief periods of service degradation, primarily affecting the sign-in service and administrative portals. The impact during the afternoon of September 7th had more widespread customer impact, although the duration was relatively short. We performed emergency maintenance to isolate suspect traffic, which has proven successful in stabilizing the service. We continue to monitor the network and all services to ensure stable operations. Needless to say we, like you, find the events unacceptable and have 24/7 efforts underway to ensure we do not have a repeat of these events."
Microsoft has scheduled maintenance for Exchange Online and SharePoint Online in North America this coming Saturday, September 11. The planned maintenance period begins at 4 a.m. GMT and may last through 10 p.m. GMT, company officials have told customers.
Cole's post did not say whether Microsoft plans to compensate users affected by the three outages. Microsoft Small Business Specialist Guy Gregory raised the question in the comments section of the blog post:
"Given the 2 hour outage equates to 99.7% for August, will you be honoring your pledge to refund affected users? My understanding was that the 99.9% uptime promise was backed by a money-back guarantee."
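Gregory's arithmetic is easy to check. The snippet below is a rough sketch, assuming uptime is measured as simple calendar-hour availability across the 31 days of August and that the full two hours counts as downtime. Microsoft's actual service-level terms may measure this differently (for example, per service, or excluding scheduled maintenance), and the function names are mine, used only for illustration.

```python
def uptime_percent(total_hours: float, downtime_hours: float) -> float:
    """Share of the period the service was available, as a percentage."""
    return 100.0 * (total_hours - downtime_hours) / total_hours


def allowed_downtime_minutes(total_hours: float, sla_percent: float) -> float:
    """Downtime budget, in minutes, that a given uptime target permits."""
    return total_hours * 60.0 * (1.0 - sla_percent / 100.0)


# Assumption: August is counted as 31 full days of 24 hours each.
AUGUST_HOURS = 31 * 24  # 744 hours

# Assumption: the whole 2-hour period of intermittent access counts as downtime.
august_uptime = uptime_percent(AUGUST_HOURS, 2.0)

# Downtime that a 99.9% target would tolerate over the same month.
budget = allowed_downtime_minutes(AUGUST_HOURS, 99.9)

print(f"August uptime: {august_uptime:.2f}%")
print(f"Downtime allowed at 99.9%: {budget:.1f} minutes")
```

Under those assumptions, the August 23 outage alone works out to roughly 99.73 percent uptime for the month, in line with Gregory's 99.7 percent figure. A 99.9 percent target would leave room for only about 45 minutes of downtime over the same period.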
I've also heard directly via e-mail from a few other customers who are worried about the effect of these kinds of outages on their businesses and those of their customers. One partner mentioned "a huge hit to our credibility from the various outages" in the eyes of his customers, leading him to question the wisdom of migrating to hosted Exchange.
Customers said they want and need more communication from Microsoft about service interruptions -- both while they are happening and afterward. Commenter David Girdner noted:
"What is being done to improve communication when there are issues? On 9/7, on the Online Services admin site, the Service Status showed services were 'Healthy' during a time when the services were not accessible. Additionally, the information provided by the RSS feeds is frustratingly vague and not timely."