Jha said the back-to-back Lync Online and Exchange Online service issues were "unrelated" to one another.
The Lync Online issue left a number of users unable to log in to Microsoft's Lync Online unified communications service. Microsoft is attributing the inability to connect to "external network failures."
"Even though connectivity was restored in minutes, the ensuing traffic spike caused several network elements to get overloaded, resulting in some of our customers being unable to access Lync functionality for an extended duration," Jha explained.
The Exchange Online issue resulted in "prolonged email delays for externally bound email (email coming inside & going outside the company) for some customers," Jha acknowledged. In addition, "a small subset of customers" could not access Exchange email at all. At the same time, the Service Health Dashboard didn't notify all customers of the service issues, instead indicating that all was well.
"In the case of the Exchange Online issue, the trigger was an intermittent failure in a directory role that caused a directory partition to stop responding to authentication requests," Jha said.
Jha maintained that a "small set of customers" lost email access, but their loss of access was "prolonged." However, Jha noted, "the nature of this failure led to an unexpected issue in the broader mail delivery system due to a previously unknown code flaw leading to mail flow delays for a larger set of customers."
The team ended up partitioning the mail delivery system away from the failed directory partition and then addressing the root cause of the directory partition failure. Microsoft is "working on further layers of hardening for this pattern," Jha said.
There's no word so far on what Microsoft is planning to do, if anything, to financially compensate those subscribers affected by this week's Lync Online and Exchange Online issues. I've asked a spokesperson if there's more to come on that front. No word back yet.
Update (June 28): A Microsoft spokesperson sent me the following regarding financial compensation for the outages this week:
"Microsoft guarantees 99.9% uptime as part of the Office 365 SLA (Service Level Agreement), so if it’s determined that the service didn’t meet that bar in a particular month, we’ll work with customers to credit them appropriately. This is on a case by case basis given the impact of service issues can vary among customers."
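For a rough sense of what a 99.9% monthly uptime bar allows, here is a minimal Python sketch. It assumes a 30-day month and a simple minutes-based calculation. The spokesperson's statement does not describe how Microsoft actually measures downtime under the SLA, so the function names and the method below are illustrative assumptions only.

```python
# Hypothetical helpers: a back-of-the-envelope view of an uptime guarantee.
# Assumption: downtime is counted in whole-service minutes over a 30-day month.

def allowed_downtime_minutes(uptime_pct: float, days_in_month: int = 30) -> float:
    """Minutes of downtime a month can absorb while still meeting uptime_pct."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - uptime_pct) / 100


def monthly_uptime_pct(downtime_minutes: float, days_in_month: int = 30) -> float:
    """Uptime percentage for a month with the given minutes of downtime."""
    total_minutes = days_in_month * 24 * 60
    return 100 * (total_minutes - downtime_minutes) / total_minutes


if __name__ == "__main__":
    # A 99.9% guarantee over 30 days leaves roughly 43.2 minutes of slack.
    print(round(allowed_downtime_minutes(99.9), 1))
    # A hypothetical two-hour outage would put the month below the 99.9% bar.
    print(monthly_uptime_pct(120) < 99.9)
```

Under these assumptions, any outage lasting much longer than about 43 minutes in a month would push a customer below the 99.9% mark, which is why the length of each customer's disruption matters for credits.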