Google's 'moonshot' fix for the hardest-to-solve of the three Meltdown and Spectre CPU attacks seems to have paid off.
That fix, called Retpoline, addresses Variant 2, the Spectre attack known as 'branch target injection'. Microsoft and Google consider Variant 2 the trickiest speculative execution vulnerability to fix, as it is the only one whose mitigation causes a significant hit on CPU performance.
It is also the scariest threat to virtualized environments in the cloud due to its potential to be used to hop between different instances on the same CPU.
The other way of fixing Variant 2 is via a blend of OS/kernel fixes and silicon microcode from Intel and AMD, but Google contends its software-based Retpoline answer is superior and should be adopted universally.
Google last week said Retpoline generally had "negligible impact on performance" and has now outlined the specific impact for Google Cloud Platform services.
Ben Treynor Sloss, the VP of Google's 24x7, said that for several months it looked as though the only way to fix Variant 2 would be to disable speculative execution, the performance-enhancing CPU feature, which in turn would result in slower cloud applications.
Google had already patched Variant 1, also a Spectre attack, and Variant 3, aka Meltdown, by September, with Variant 2 remaining outstanding until December. The fixes for Variants 1 and 3 had "no perceptible impact" on GCP or services like Gmail, Search, and Drive, but the initial fix for Variant 2 did.
Intel initially denied reports that its Meltdown and Spectre fixes would cause a major hit on CPU performance, but yesterday admitted "impact on performance varies widely, based on the specific workload, platform configuration and mitigation technique".
Sloss says that in tests at Google, disabling the vulnerable CPU enhancements -- that is, speculative execution -- did result in "considerable slowdowns".
"Not only did we see considerable slowdowns for many applications, we also noticed inconsistent performance, since the speed of one application could be impacted by the behavior of other applications running on the same core. Rolling out these mitigations would have negatively impacted many customers," he wrote.
Microsoft's analysis of the patches' impact on PC, server, and cloud performance came to a similar conclusion.
"In general, our experience is that Variant 1 and Variant 3 mitigations have minimal performance impact, while Variant 2 remediation, including OS and microcode, has a performance impact," wrote Terry Myerson, executive vice president of Microsoft's Windows and Devices Group.
"Retpoline sequences are a software construct which allow indirect branches to be isolated from speculative execution. This may be applied to protect sensitive binaries (such as operating system or hypervisor implementations) from branch target injection attacks against their indirect branches," said Turner.
Retpoline is a stable fix too, according to Sloss, who says that since Google finished rolling out its Meltdown and Spectre fixes for Google Cloud Platform in December, the company hasn't received a single support ticket related to the updates.
"This confirmed our internal assessment that in real-world use, the performance-optimized updates Google deployed do not have a material effect on workloads," he wrote.
"We believe that Retpoline-based protection is the best-performing solution for Variant 2 on current hardware. Retpoline fully protects against Variant 2 without impacting customer performance on all our platforms. In sharing our research publicly, we hope that this can be universally deployed to improve the cloud experience industry-wide."