Analysis of Sun's 'Niagara 2' UltraSPARC T2

Sun Microsystems launched its brand new "Niagara 2" UltraSPARC T2 8-core CPU last Tuesday, August 7th, 2007, as the successor to the UltraSPARC T1 8-core CPU launched in November of 2005.  During the launch, Sun's executive VP David Yen took a few shots at big iron competitors IBM and Intel by noting that Sun didn't try to crank the clock speed to 4.7 GHz (referring to IBM Power6) or shove in 24 MB of cache (referring to Intel Itanium 2).  That was a bit strange since the UltraSPARC T2 isn't really aimed at the big iron market at all.  Of course, Sun's launch page also takes a shot at Intel's x86/x64 quad-core CPUs by saying that Sun didn't resort to "packaging gimmicks, such as MultiChip Modules (MCMs)".  During questions and answers at the end of the launch event, one of the highlights came when someone said that Linux has a "checkered past" with scalability and CEO Jonathan Schwartz came to the defense of Linux by replying that Solaris has a "checkered past" with usability.

Market for UltraSPARC T2

The UltraSPARC T2 processor is designed to go into a single-CPU server from Sun Microsystems, and it should not be compared to Itanium 2 or Power6, which are designed to scale to 32 or 64 CPU "big iron" servers.  Sun's Niagara series chips are designed to consolidate up to 64 slower legacy SPARC-based servers or host up to 64 lightly-loaded logical servers using Solaris 10 containers or LDoms (Logical Domains).  But because container and LDom technology only supports Solaris 10, legacy servers that are likely running older versions of Solaris will need to be validated to run on Solaris 10.  While that validation process should usually go smoothly, the cost and time required is non-zero, and it isn't as simple as virtualization technology such as VMware or Microsoft Virtual Server, which allows you to basically import legacy machines as is without changing the OS version or OS vendor.  But once the migration is complete, there can be massive annual savings just in the hardware maintenance cost, rack space, and power consumption of the old legacy SPARC servers.

UltraSPARC T2 Architecture

The UltraSPARC T2 is an 8-core CPU built on a 65nm process and is based on a single 342 square mm die.  That's relatively small considering the fact that eight Intel Core 2 65nm cores (two processors) would measure 572 square mm and eight 65nm Barcelona cores (two processors) would measure 566 square mm.  While that might sound like it makes the UltraSPARC T2 cheap to make, reality hits when you realize that 342 square mm is a huge single die.  Intel's 65nm dual-core dies are only 143 square mm and AMD's upcoming 65nm Barcelona die is 283 square mm, which is already big.  Any tiny flaw in that massive 342 square mm die means the whole chip is bad or you have to give up two or four of the cores.  It's not known if Sun will offer 4- or 6-core versions of the T2 like they did for the T1.
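The die-size comparison above is simple multiplication, and a short Python sketch makes the bookkeeping explicit.  The areas are the ones quoted in this article; the helper name silicon_for_cores is purely illustrative, and the sketch says nothing about yield or cost, only total silicon area per eight cores.

```python
# Die areas in square millimetres, as quoted in the article.
T2_DIE_MM2 = 342            # one 8-core UltraSPARC T2 die
INTEL_DUAL_CORE_MM2 = 143   # one 65nm Intel dual-core die
BARCELONA_MM2 = 283         # one 65nm AMD Barcelona quad-core die


def silicon_for_cores(die_mm2, cores_per_die, cores_needed=8):
    """Return (dies required, total area in square mm) to reach cores_needed."""
    dies = -(-cores_needed // cores_per_die)  # ceiling division
    return dies, dies * die_mm2


t2_dies, t2_area = silicon_for_cores(T2_DIE_MM2, 8)
intel_dies, intel_area = silicon_for_cores(INTEL_DUAL_CORE_MM2, 2)
amd_dies, amd_area = silicon_for_cores(BARCELONA_MM2, 4)

for name, dies, area in [
    ("UltraSPARC T2", t2_dies, t2_area),
    ("Intel Core 2", intel_dies, intel_area),
    ("AMD Barcelona", amd_dies, amd_area),
]:
    print(f"{name}: {dies} die(s), {area} square mm total for 8 cores")
```

The T2 uses the least total silicon for eight cores, but it is the only one of the three that needs all of that area to be defect-free on a single die.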

The UltraSPARC T2 has eight 1.4 GHz cores with 2 pipelines per core and 4 threads per pipeline, giving it a total of 64 threads that it can support.  The T2 CPU has eight crypto processors designed to offload symmetric and asymmetric encryption as well as hashing functions.  The UltraSPARC T2 also has eight fully pipelined floating point units, making it substantially better at floating point than its T1 predecessor.  The T2 die also houses four memory controllers that can support up to 512 GB of fully buffered DIMMs with an aggregate memory bandwidth of 64 GB/s.  Even a 10 gigabit Ethernet controller and a PCI-Express controller are included in the UltraSPARC T2 processor.

Note: Surprisingly, Sun actually shortchanged their own T2 processor at the launch event by saying that it would shave $190 from the cost of the motherboard because you don't need a separate 10 gigabit Ethernet controller and crypto offload engine.  $190 might be the cost to the motherboard manufacturer to integrate those components, but the end user will see a multiple of that cost when they buy the actual server.  However, even 8-core UltraSPARC T1 servers have been known to be twice as expensive ($30K from Sun and $15K from HP) as 8-core x86/x64 servers based on Intel processors, so the true savings may be dubious.  Given the steep pricing on the UltraSPARC T1, I don't expect the T2 to be cheaper.

UltraSPARC T2 performance and power consumption

The UltraSPARC T2 8-core server set two new records as the "fastest" microprocessor in a single socket for SPECint_rate2006 (multi-threaded integer performance) and SPECfp_rate2006 (multi-threaded floating point performance).  While the numbers generated for SPEC are very respectable, it's only a symbolic victory because two-socket 8-core servers based on two Intel "Clovertown" processors are cheaper and faster.  For example, a system based on two 2 GHz Intel Xeon E5335 processors can deliver a SPECint_rate2006 score of 85 while the UltraSPARC T2 server scores 78.3.  Intel's dual X5365 can get a blistering score of 107 on SPECint_rate2006.

Intel just launched a new low-voltage 2 GHz quad-core 50 watt variant of the E5335 called the L5335, so two L5335 quad-core CPUs have a combined TDP (Thermal Design Power) of 100 watts.  Taking into account the need for a 32 watt North Bridge memory controller on an Intel chipset motherboard, this is roughly in line with the 123 watt peak power consumption of the UltraSPARC T2 processor in terms of performance/watt, though a dual L5335 2.0 GHz based server is superior in terms of performance and price.  Most of the market is probably more concerned with performance when trying to consolidate multiple servers, so those buyers will use a dual X5365 3.0 GHz server.  Furthermore, Intel's 45nm Penryn, which will be a monster, should make for a much better comparison within 2 months because the launch dates of Penryn and the UltraSPARC T2 are so close together that the product generations are much closer in sync.
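To show where "roughly in line" comes from, here is a minimal Python sketch of the performance/watt arithmetic using only the figures quoted in this article.  It assumes a dual L5335 server would match the dual E5335 score of 85, since both run at 2 GHz, and it mixes Intel's TDP with Sun's peak power figure, which are not strictly the same kind of measurement.  The variable names are illustrative.

```python
# Figures quoted in the article.
L5335_TDP_W = 50         # per quad-core L5335 processor
NORTH_BRIDGE_W = 32      # Intel chipset memory controller
T2_PEAK_W = 123          # UltraSPARC T2 peak power consumption

DUAL_2GHZ_XEON_SCORE = 85.0   # SPECint_rate2006, dual E5335 (assumed equal for dual L5335)
T2_SCORE = 78.3               # SPECint_rate2006, UltraSPARC T2 server

# Two processors plus the North Bridge the T2 doesn't need.
intel_power_w = 2 * L5335_TDP_W + NORTH_BRIDGE_W

intel_perf_per_watt = DUAL_2GHZ_XEON_SCORE / intel_power_w
t2_perf_per_watt = T2_SCORE / T2_PEAK_W

print(f"Intel platform power: {intel_power_w} W")
print(f"Dual L5335 score per watt: {intel_perf_per_watt:.3f}")
print(f"UltraSPARC T2 score per watt: {t2_perf_per_watt:.3f}")
```

Under these assumptions the two platforms land within a couple of percent of each other on score per watt, which is why the comparison ends up turning on raw performance and price instead.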

Another major problem with Sun's UltraSPARC T2 architecture is its poor single-threaded performance.  Since Sun hasn't answered my question about how poor it is, I asked CPU analyst David Kanter of Real World Technologies what he thought the single-thread performance of a T2 would be like.  Kanter replied that it would likely be better than 1/16th of the total performance of the T2 chip.  The number 1/16th was derived by looking at the number of pipelines in the UltraSPARC T2 (2 per core with 8 cores is 16 pipelines) and assuming that a single pipeline is the smallest execution engine per thread.  Since a single-threaded application would have a monopoly on the CPU cache, it is reasonable to postulate that single-threaded performance would be better than 1/16th of the total aggregate performance of the UltraSPARC T2.

Intel's architecture has a worst-case single-threaded performance of 1/8th of an 8-core server.  But the fact that a single Intel core has full monopoly on the cache, memory controller, and FSB (Front Side Bus) means single-threaded performance is as high as 1/5th of the total 8-core server.  In fact, a single-core test on a 3 GHz Xeon 5160 gets a whopping score of 21 on SPECint_2006.  If we hypothesize that an UltraSPARC T2 1.4 GHz processor might get 1/14th of its best 8-core score, we're looking at a best-case SPECint_2006 score of 5.6 per thread, which isn't very good at all.
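The fractions above can be laid out in a few lines of Python.  This is only a sketch of the back-of-the-envelope estimate, not a measurement: it divides SPECint_rate2006 throughput scores to guess at single-thread scores, and applying the 1/8th and 1/5th fractions to the dual X5365 score of 107 is my reading of the comparison.  The function name is illustrative.

```python
# Scores quoted in the article.
T2_RATE_SCORE = 78.3        # SPECint_rate2006, UltraSPARC T2 server
X5365_RATE_SCORE = 107.0    # SPECint_rate2006, dual Xeon X5365 (8 cores)
XEON_5160_SINGLE = 21.0     # single-core score on a 3 GHz Xeon 5160


def per_thread_estimate(rate_score, divisor):
    """Estimate a single-thread score as a fraction of an aggregate score."""
    return rate_score / divisor


t2_floor = per_thread_estimate(T2_RATE_SCORE, 16)       # one of 16 pipelines
t2_hypothesis = per_thread_estimate(T2_RATE_SCORE, 14)  # the 1/14th guess
intel_worst = per_thread_estimate(X5365_RATE_SCORE, 8)  # one of 8 cores
intel_best = per_thread_estimate(X5365_RATE_SCORE, 5)   # core owns cache and FSB

# How far ahead the measured Xeon 5160 result is of the T2 guess.
gap = XEON_5160_SINGLE / t2_hypothesis

print(f"T2 per-thread floor (1/16th): {t2_floor:.2f}")
print(f"T2 per-thread guess (1/14th): {t2_hypothesis:.2f}")
print(f"Intel worst case (1/8th): {intel_worst:.2f}")
print(f"Intel best case (1/5th): {intel_best:.2f}")
print(f"Xeon 5160 vs T2 guess: {gap:.2f}x")
```

Note that the 1/5th estimate for Intel lands close to the measured single-core score of 21, which lends the same method some credibility when it is applied to the T2.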

Single threaded performance is so important in the server world because it essentially allows co-hosting logical or virtual servers to borrow each other's idle time and complete tasks so much faster.  Unfortunately, single-thread performance is where Sun's UltraSPARC T2 architecture fails badly.  That doesn't mean the UltraSPARC T2 is a failure by any means because it fills its niche role in the proprietary SPARC world very well.  Just don't expect it to steal market share from Intel or AMD any time soon.

Note: More than a week after I asked, Sun Microsystems has not responded to my questions on the TDP rating of the UltraSPARC T2 processor and the best single-threaded performance of the T2.  Therefore I've published this analysis without the benefit of their official answer and used what resources were available to me.  If Sun wants to offer more precise data, I will be happy to update this post.
