The Windows 10 version 1809 release didn't quite go as smoothly as Microsoft hoped, prompting lots of questions about how it tests the quality of its builds before release. To answer these questions, Microsoft has now revealed one of the key tools it's developed to measure the health of Windows updates.
These days Microsoft is heavily reliant on diagnostics data to ensure Windows 10 is humming along nicely across the world, and key to this ability is Release Quality View (RQV), an internal dashboard Microsoft uses to assess the ongoing health of millions of systems before and after updates roll out.
According to Jane Liles and Rob Mauceri from the Windows Data Science team, the company approaches every new release with the question: "Is this Windows update ready for customers?"
"This is a question we ask for every build and every update of Windows, and it's intended to confirm that automated and manual testing has occurred before we evaluate quality via diagnostic data and feedback-based metrics," they write in a blog post.
So while Microsoft does depend heavily on diagnostic data, the pair of data scientists emphasize the internal checking by engineers that takes place even before Windows Insider testers receive a build.
First, a build that goes out must pass initial quality testing. Then Microsoft engineers who "aggressively self-host Windows" give it a thorough once-over for potential problems. Diagnostics data is involved at all stages.
"We look for stability and improved quality in the data generated from internal testing, and only then do we consider releasing the build to Windows Insiders, after which we review the data again, looking specifically for failures," explain Liles and Mauceri.
They're responsible for ensuring that Microsoft's metrics are "reliable, repeatable, precise, true and unbiased" before the big "Is Windows ready" question is broached.
The data science team always aims for a release's current metrics to exceed the quality levels of the previous release.
To understand what 'quality' looks like through data, the team counts the unique 'active' monthly devices (a number that Microsoft doesn't publish) and then looks at how the upgrade process went, as well as the "general health of the user experience".
Then they look at certain user scenarios that indicate success for Windows, which include "success rates for connecting to Wi-Fi, or opening a PDF file from Microsoft Edge, or logging in using Windows Hello".
Central to all this data-driven analysis is the RQV dashboard, which includes more than 1,000 other measures beyond these scenarios. This dashboard is used to assess the customer experience while a build is still in the hands of engineers and Insiders, and after an update becomes generally available.
The dashboard is also critical to Windows managers' 'readiness sessions', where engineering teams review RQV measures and decide which bugs to fix.
"Starting with the lowest scoring problem areas, we run down the list of areas whose measures are proportionally farthest from their targets. The engineering owners for those areas are then called on to explain what is causing the problem, who is on point to resolve it, and when they expect the quality of that area, as represented by the measures, to be back within target."
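The ordering described in that quote can be illustrated with a minimal Python sketch. It is not Microsoft's implementation: the `Measure` type, the area names and every number are hypothetical, and it assumes that each measure is a success rate where higher is better and the target is positive.

```python
from dataclasses import dataclass


@dataclass
class Measure:
    area: str      # hypothetical area name
    value: float   # current reading, e.g. a success rate
    target: float  # assumed positive; higher values are assumed better


def proportional_gap(m: Measure) -> float:
    """Shortfall as a fraction of the target; 0.0 when the target is met."""
    return max(0.0, (m.target - m.value) / m.target)


def review_order(measures):
    """Areas below target, farthest from target (proportionally) first."""
    below = [m for m in measures if proportional_gap(m) > 0]
    return sorted(below, key=proportional_gap, reverse=True)


# Made-up readings for illustration only.
measures = [
    Measure("wifi_connect", 0.97, 0.99),
    Measure("pdf_open_edge", 0.90, 0.98),
    Measure("hello_login", 0.995, 0.99),
]

for m in review_order(measures):
    print(f"{m.area}: {proportional_gap(m):.1%} below target")
```

Ranking by proportional rather than absolute distance lets measures with very different targets share one list, which is one plausible reading of "proportionally farthest from their targets".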
RQV developers also built the capability to see whether a bug fix actually resolves the problem, which can be seen by a particular measure returning to a healthy range.
"Fixes that engineers check into future builds are tracked through the system, so reviewers can see when a fix will be delivered via a new build and can monitor impact as the build moves through its normal validation path: through automated quality gates, to self-hosted devices in our internal engineering 'rings', and to our Windows Insiders."
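As a rough illustration of checking whether a fix has brought a measure back into a healthy range, the sketch below scans readings in build order. The function name, the build labels, the range and the readings are all hypothetical, and the blog post does not say how RQV actually makes this determination.

```python
from typing import Optional, Sequence, Tuple


def recovered_since(
    readings: Sequence[Tuple[str, float]], low: float, high: float
) -> Optional[str]:
    """Return the first build from which every later reading stays in
    [low, high], or None if the latest reading is out of range."""
    recovered = None
    for build, value in readings:
        if low <= value <= high:
            if recovered is None:
                recovered = build  # start of a healthy streak
        else:
            recovered = None       # streak broken; measure regressed
    return recovered


# Made-up readings for one measure across four hypothetical builds.
readings = [
    ("build_a", 0.91),
    ("build_b", 0.93),
    ("build_c", 0.985),  # build assumed to carry the fix
    ("build_d", 0.99),
]

print(recovered_since(readings, low=0.98, high=1.0))
```

Requiring the measure to stay in range through the latest build, rather than touch it once, guards against counting a fix as successful when a later build regresses.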
The company acknowledges there are gaps and specifically notes customer feedback as an area it is investing in "to help us identify gaps or inconsistencies in our diagnostic data-based measures".
It's also looking at how to "provide insight into the experiences you have on your actual devices". And of course the company is creating new machine-learning models for "earlier detection through text analytics".