AMD set to Bulldoze the datacenter

AMD targets Bulldozer for the next generation of datacenter tasks.

Despite concerns that the new AMD Bulldozer CPUs would trail Intel at launch, AMD had good company for this week's announcement of its new Opteron 6200 and 4200EE CPUs. Dell, HP, and IBM were all on hand to announce their Bulldozer-based servers and showed little concern over the architectural changes that shift the CPUs' focus from ultimate performance to better virtualization. To a certain extent, it could be said that AMD is betting its server business on the cloud.

While the tendency is to compare the 16-core AMD Opteron to the 10-core Intel Xeon CPUs, the question is no longer simply how fast a multi-core CPU is. Much as AMD once tried to separate performance results from clock speeds, it is now trying to shift CPU selection for the datacenter from simply choosing the fastest processor to choosing the CPU most suitable for the task. And with the Bulldozer architecture, the focus isn't on absolute performance but on providing a better architecture for virtualization.

In a major shift from the usual "our processors can support multiple VMs per CPU" approach, AMD has taken a more density-focused approach. It emphasizes that you can build very dense racks (HP's announcement included a full-size rack that can support 2048 Bulldozer cores) and dedicate a core to each VM (not that you can't support multiple VMs per core).
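The density arithmetic behind that claim can be sketched in a few lines. This is only an illustration: the 2048-core rack and the 16-core and 10-core part sizes come from the figures quoted in this article, while the function names and the one-VM-per-core default are assumptions made for the example, not anything AMD or HP published.

```python
# Back-of-the-envelope density math. The core counts are the figures quoted
# in the article; the helper names are hypothetical, chosen for illustration.

CORES_PER_RACK = 2048      # HP's full-size rack, per the announcement
OPTERON_CORES = 16         # 16-core Opteron
XEON_CORES = 10            # 10-core Xeon, used only for comparison


def sockets_needed(total_cores: int, cores_per_cpu: int) -> int:
    """Number of CPU sockets required to reach total_cores (rounded up)."""
    return -(-total_cores // cores_per_cpu)  # ceiling division


def vm_capacity(total_cores: int, vms_per_core: int = 1) -> int:
    """VMs a rack can host at a given VM-per-core ratio (assumed default: 1)."""
    return total_cores * vms_per_core


if __name__ == "__main__":
    print("16-core sockets per rack:", sockets_needed(CORES_PER_RACK, OPTERON_CORES))
    print("10-core sockets for the same core count:", sockets_needed(CORES_PER_RACK, XEON_CORES))
    print("VMs at one VM per core:", vm_capacity(CORES_PER_RACK))
```

At one VM per core, the rack's VM count simply equals its core count, which is why the pitch centers on cores per rack rather than on per-core speed.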

Part of the optimization for VM use has to do with the actual physical architecture of the CPU. In an Intel 10-core Xeon, you get a single CPU with 10 cores. In the 16-core Opteron, you get two 8-core dies combined in a single-socket package, and each die is built from two-core Bulldozer modules. The Intel architecture gives each core its own FPU; in a Bulldozer module, the two cores share the FPU, resulting in lower floating point performance. But in the world of virtualization, FPU performance rarely has an impact on overall system performance, and users looking for optimal floating point operation wouldn't elect to use the Bulldozer architecture anyway. The cores in a Bulldozer module share more than the FPU, though: they also share the L2 cache and the L1 instruction cache (each core keeps its own L1 data cache), and that shared cache architecture should result in improved performance.

Rather than taking on Intel performance head-on, a strategy that no longer works for AMD, the company has chosen to take advantage of the direction the industry seems to be heading: higher-density systems, task-specific servers, and VM optimization for cloud computing.