Azure global outage: Our DNS update mangled domain records, says Microsoft

Azure, Microsoft 365, Dynamics, Power BI, DevOps, all down for nearly three hours.
Written by Liam Tung, Contributing Writer

Microsoft says a mishap during a DNS migration was behind a nearly three-hour Azure outage on May 2, between 19:43 and 22:35 UTC.

The global incident impacted a whole range of Microsoft cloud services, causing connection problems for core services like Azure, multiple services under the Microsoft 365 umbrella, Dynamics, and DevOps. 

The incident had a knock-on effect for Azure compute, storage, App Service, Azure AD identity services, and SQL Database. 

Microsoft was mid-way through migrating its legacy domain name system (DNS) to its own hosted Azure DNS, when "some domains for Microsoft services were incorrectly updated", it explains on the Azure status history page.  

Microsoft updated the page several times during the incident and as services were gradually restored. 

The company assures customers that none of their DNS records were impacted during the event and that Azure DNS itself remained up throughout. 

"The problem impacted only records for Microsoft services," it said, noting it was caused by a "nameserver delegation issue".

A possible reason for the slower than expected restoration time is that affected Microsoft apps and services may have cached the incorrect domain records, according to Microsoft. Those apps and services would have recovered as the cached records expired.
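
The caching effect Microsoft describes can be illustrated with a toy resolver cache. This is a minimal sketch, not Microsoft's implementation: the class name `TtlDnsCache`, the hostname `service.example`, the documentation-range addresses and the 300-second TTL are all assumptions chosen for illustration. It shows a client continuing to receive a bad record after the source has been corrected, until the cached entry's time-to-live runs out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CachedRecord:
    address: str
    expires_at: float  # clock value (seconds) at which the entry goes stale


class TtlDnsCache:
    """Toy resolver cache: answers from cache until the record's TTL expires."""

    def __init__(
        self,
        lookup: Callable[[str], Tuple[str, float]],  # hypothetical upstream: name -> (address, ttl)
        clock: Callable[[], float],                  # injected clock so the demo is deterministic
    ) -> None:
        self._lookup = lookup
        self._clock = clock
        self._entries: Dict[str, CachedRecord] = {}

    def resolve(self, name: str) -> str:
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.address  # still within TTL: serve the cached answer
        address, ttl = self._lookup(name)  # expired or missing: ask upstream again
        self._entries[name] = CachedRecord(address, now + ttl)
        return address


def simulate() -> List[str]:
    """Cache a bad record, fix it upstream, and watch when the client sees the fix."""
    now = [0.0]
    # Assumed values: an incorrect record with a 300-second TTL.
    upstream = {"service.example": ("203.0.113.10", 300.0)}
    cache = TtlDnsCache(lambda name: upstream[name], lambda: now[0])

    seen = [cache.resolve("service.example")]  # t=0: bad record gets cached

    upstream["service.example"] = ("198.51.100.7", 300.0)  # record corrected at the source

    now[0] = 120.0
    seen.append(cache.resolve("service.example"))  # t=120: cache still serves the bad record

    now[0] = 301.0
    seen.append(cache.resolve("service.example"))  # t=301: TTL expired, corrected record fetched
    return seen
```

In this sketch the second lookup still returns the incorrect address even though the upstream record has already been fixed. That is the pattern Microsoft suggests may have stretched out recovery across its apps and services.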

The company plans to publish a detailed root cause analysis within the next 72 hours. 

The Microsoft 365 service health status currently reports no issues. However, per The Register, yesterday it stated that affected services included SharePoint Online, OneDrive for Business, Microsoft Teams, Stream, Power BI, Planner, Forms, PowerApps, Dynamics 365, Intune and Office Licensing.

DNS issues were also behind a global Azure outage in January that affected Office 365, Azure and Dynamics 365 services. Microsoft pinned the problem on Level 3's managed DNS service. And late last year the Azure AD multi-factor authentication outage left Office 365 users around the world unable to sign into their accounts. 

A quite different global DNS problem also hit Windows 10 users in February. A data corruption problem at an external DNS provider resulted in incorrect DNS entries for Windows Update, leaving Windows 10 users unable to download security and software updates through Windows Update for days.

The latest Azure outage comes days ahead of Microsoft's huge Build conference for developers, where the headline topic is expected to be Azure and AI.
