There are more hindrances to AMD’s ability to penetrate the market with its Opteron CPUs, and Intel’s not at fault this time. In an earlier blog post on the AMD-Intel settlement I brought up an example of a type of incompatibility between the two CPU makers that isn’t covered by the settlement: live migration of virtual machines. There’s more to this story.
Today, virtual infrastructure administrators cannot live migrate server images from an Intel-based server to an AMD-based server and back. The reason isn’t that Intel is blocking AMD, but that cross-vendor migration requires special software. This software provides a layer of abstraction and translation for the specific CPU instructions that differ across the two CPU types (this also applies to different CPU types from the same vendor, BTW). Most hypervisors and virtual machines check what CPU type they are running on only at boot time and thus become rather shaky when live migration moves them to another CPU type. AMD has published a white paper detailing this and has done its part by building and providing this software, but the hypervisor providers haven’t added it to their distributions. Until they do, AMD will remain shut out of many virtual environments.
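To make the boot-time problem concrete, here is a minimal Python sketch of one common way to think about it: expose to the guest only the CPU features that every host in the pool supports. This is an illustration under stated assumptions, not AMD's software or any hypervisor's actual API. The host names, the feature sets, and the functions `common_baseline` and `can_migrate` are all hypothetical, and feature masking is a simpler technique than the abstraction-and-translation layer described above.

```python
# Hypothetical per-host CPU feature sets (flag names in /proc/cpuinfo style).
# The data is made up for illustration; real hosts report many more flags.
HOSTS = {
    "intel-host": {"sse2", "sse3", "ssse3", "sse4_1"},
    "amd-host": {"sse2", "sse3", "sse4a", "3dnowext"},
}


def common_baseline(hosts):
    """Return the feature flags supported by every host in the pool."""
    if not hosts:
        return set()
    return set.intersection(*(set(flags) for flags in hosts.values()))


def can_migrate(guest_features, target_features):
    """A guest is safe to move only if the target offers every feature
    the guest detected when it booted."""
    return set(guest_features) <= set(target_features)


baseline = common_baseline(HOSTS)

# A guest that booted seeing the full Intel feature set cannot safely land
# on the AMD host, while a guest restricted to the baseline can.
print(sorted(baseline))
print(can_migrate(HOSTS["intel-host"], HOSTS["amd-host"]))
print(can_migrate(baseline, HOSTS["amd-host"]))
```

The trade-off in this sketch is that the guest gives up every vendor-specific instruction, which is part of why qualifying heterogeneous pools is more than a packaging exercise.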
You would think that, given the contentious (and litigious) nature of the CPU market, the hypervisor vendors would want to stay out of this fray and would want as large a market available to them as possible. Turns out, the answer isn’t that simple and comes down to the usual ISV concern around any new feature they might add: business justification.
We spoke with Citrix, the flag bearer of open source XenServer; Red Hat, which is pushing the alternative open source hypervisor, KVM; Microsoft; and VMware. The answer from each was relatively consistent: there isn’t enough market demand for this to justify the expense of qualifying the software.
A Citrix spokesperson said, “Honestly, we haven’t been asked for this by anyone — customers tend to have relatively homogeneous server roll-outs.” No surprise. When the hypervisor can only do live migration within a homogeneous server family, why would you roll out anything else? Sounds like a self-fulfilling prophecy. However, Citrix added, “it is high on our priority list.”
Red Hat acknowledged at least some market demand for this. AMD and Red Hat posted a video demonstrating this capability to YouTube in November 2008. Red Hat’s spokesperson said that this code is still evolving and characterized it as “nowhere close for production use.” Even if Red Hat started the quality assurance process on this software today, it would be “6 - 12 months” before this could be added to its official KVM distribution.
Microsoft and VMware echoed these comments.
VMware added that this isn’t a simple QA exercise. As the market leader in virtualization, its key objective is to reinforce the enterprise readiness of server virtualization by proving that it is ready for mission-critical workloads and delivers high fault tolerance. This higher bar means more corner cases that have to be tested, and heterogeneous live migration just adds another risk factor. “The last thing we want to happen is a fatal error across the [CPU] instruction sets that results in an outage for the customer,” a VMware spokesperson stated.
To address the technical side, VMware said it will take a three-way partnership between it, AMD, and Intel jointly testing, qualifying, and then supporting this solution. Such a partnership is certainly feasible; it would be similar to the joint integration efforts by Microsoft and Novell to ensure Linux and Windows integration.
The bigger issue, however, may be business justification. As any ISV will tell you, they have significantly more requests for new features, integrations, and solution certifications than they have resources and time, and thus have to prioritize these efforts against a business justification. They need to see that customers want this and how much larger a market they will be able to address by doing this work. With most enterprises deploying homogeneous infrastructure for their virtual environments, cross-CPU certification is a tough sell.
All of this makes the challenge look very much like a chicken-and-egg situation for AMD. It can’t raise its market share in virtual environments without this software qualified and available from the hypervisor vendors, and it can’t get them to qualify it without market share justification. AMD could potentially buy its way onto these ISVs’ priority lists but, as VMware stated, there is prevailing concern that if it isn’t a three-way effort with AMD and Intel both at the table, the likelihood of success would be severely diminished. Clearly such an effort would be in AMD’s best interest; it is hard to see how Intel would see it as such. And one source suggested that bringing Intel to the table might be a very tall order given the amount of money it’s already expended appeasing AMD, “…of course, there is a certain other chip vendor who will likely have an opinion on the matter, so we’re piggy-in-the-middle!”
AMD plans in 2010 to further advance its server CPU and motherboard capabilities by increasing both the memory bandwidth per core and the core count per CPU. These enhancements seem well suited to virtual environments, but will there be an addressable market for them?
What’s your opinion? Should virtualization vendors adjust their priorities so you have CPU choice? Clearly they need to hear from you on this issue in order to adjust their priorities.