Benchmarks: Intel's first 45nm Penryn chip

The 3GHz Core 2 Extreme QX9650, Intel's first 45nm processor, has a total of 12MB of Level 2 cache at its disposal. This benchmark test shows what else the new chip has to offer.
Written by Kai Schmerer, Contributor

On 29 October 2007, Intel took the wraps off the first of its Penryn family of processors. The quad-core Intel Core 2 Extreme QX9650, codenamed Yorkfield, replaces the QX6850 and will be available from 12 November. The new CPU runs at 3GHz and has a total of 12MB of Level 2 (L2) cache.

The Penryn architecture is a development of Intel's current Core technology. To complement the new QX9650 chip for high-end desktops, a quad-core server counterpart, codenamed Harpertown, is due to appear later this quarter. Whereas the previous generation of Core 2 processors was built on a 65-nanometre (nm) process, Penryn chips are manufactured with a 45nm process.

Dual-core Penryn processors, which are codenamed Wolfdale, have 6MB of L2 cache (2 x 3MB); the quad-core Harpertown/Yorkfield chips are essentially two Wolfdale dies in the same package, and therefore have a larger 12MB (2 x 6MB) pool of L2 cache. The Penryn microarchitecture also offers more efficient execution and an upgrade to SSE4. The 47 new SSE4 instructions are mostly aimed at improving the performance of multimedia applications such as video compression. Although Intel offers quad-core Penryn processors for servers and desktops, notebooks will have to get along with dual-core chips for now.

Intel will announce further Penryn chips on 12 November. However, it's likely to take until the beginning of 2008 for most of these to become widely available.

Intel's first processor in the 45nm Penryn family (see table below) is a 3GHz quad-core desktop chip with 12MB of L2 cache, codenamed Yorkfield.

 

Intel's 45nm Penryn processors

            Server      Desktop                                 Mobile
Name        Xeon        Core 2 Extreme / Core 2 Quad / Core 2   Core 2 Extreme / Core 2 Duo
Quad core   Harpertown  Yorkfield                               n/a
L2 cache    2 x 6MB     2 x 6MB                                 n/a
Dual core   Wolfdale    Wolfdale                                Penryn
L2 cache    2 x 3MB     2 x 3MB                                 2 x 3MB


Test setup & power consumption

The 3GHz Core 2 Extreme QX9650 has a 1,333MHz frontside bus (FSB) and is supported by Socket 775 motherboards based on Intel's X38 chipset. The new chip will also work with most other Socket 775 motherboards that support a 1,333MHz FSB, although this will depend on the manufacturer: a BIOS update will certainly be necessary for older boards.

Test setup
To evaluate the performance of the new 45nm quad-core QX9650 with 12MB of L2 cache, we compared it to Intel's previous desktop performance leader, the 65nm quad-core QX6850 with 8MB of L2 cache. Both chips have the same 3GHz clock speed and connect to the rest of the system over a 1,333MHz FSB. The benchmark results should therefore provide a direct comparison between the old and the new microarchitecture.

Intel leaves the clock frequency multiplier unlocked on its high-end Extreme processors, making overclocking straightforward. We therefore also tested the new quad-core CPU overclocked to 3.33GHz. For comparison, we added a 3GHz Pentium 4 with 1MB of L2 cache to the benchmark test. This three-year-old 90nm Prescott processor has only a single CPU core, but can process two threads in parallel courtesy of Hyper-Threading technology.
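As a rough illustration of why an unlocked multiplier makes overclocking simple, the core clock is the base clock multiplied by the multiplier. The sketch below assumes a quad-pumped 1,333MHz FSB (a base clock of about 333MHz) and a stock multiplier of 9 for the 3GHz part; neither figure is stated in this article, and the function name is purely illustrative.

```python
# Illustrative only: core clock = base clock x multiplier.
# Assumptions (not quoted in the article): the 1,333MHz FSB is quad-pumped,
# so the base clock is FSB / 4, and the stock multiplier at 3GHz is 9.

def core_clock_ghz(fsb_mhz: float, multiplier: float, pumps: int = 4) -> float:
    """Return the core clock in GHz for an effective FSB rate and a multiplier."""
    base_clock_mhz = fsb_mhz / pumps
    return base_clock_mhz * multiplier / 1000.0


FSB_MHZ = 4000 / 3  # the nominal "1,333MHz" FSB, i.e. a 333.33MHz base clock

for mult in (9, 10, 11):
    print(f"multiplier {mult}: {core_clock_ghz(FSB_MHZ, mult):.2f}GHz")
```

Under these assumptions, raising the multiplier from 9 to 10 gives the 3.33GHz setting used in our tests, with no change to the FSB or memory clocks.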

The new Intel platform also supports DDR3 memory, so our tests will examine how this compares to the previous DDR2 memory technology.

Power consumption
One of the advantages of an improved manufacturing process is lower power consumption. At idle, the system with the new 45nm processor consumes significantly less power than the one with its 65nm predecessor: 190W compared to 227W. When the CPU is fully loaded, the trend is even stronger: the system with the 65nm QX6850 uses 337W, while the QX9650 system uses only 264W. Particularly striking is the comparison with the single-core 3GHz Pentium 4 machine, which uses 210W at idle and 277W under full load, more than the new quad-core chip in both cases.
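To put the wattage figures above into relative terms, the following short sketch works out the percentage saving of the QX9650 system over the QX6850 system. The readings are those quoted in the text; the helper name is only illustrative.

```python
# System power draw in watts, as quoted in the text above.
power_w = {
    "QX6850": {"idle": 227, "load": 337},
    "QX9650": {"idle": 190, "load": 264},
    "Pentium 4": {"idle": 210, "load": 277},
}


def saving_percent(old_w: float, new_w: float) -> float:
    """Percentage reduction in power draw going from old_w to new_w."""
    return (old_w - new_w) / old_w * 100.0


for state in ("idle", "load"):
    old = power_w["QX6850"][state]
    new = power_w["QX9650"][state]
    print(f"{state}: {old}W -> {new}W, saving {saving_percent(old, new):.1f}%")
```

These are whole-system figures, so the saving attributable to the processor alone will be proportionally larger than the percentages printed here.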

System specifications for power consumption tests

CPUs              Intel Core 2 Extreme QX6850/QX9650, Pentium 4 E530
Motherboards      ASUS Maximus Formula (DDR2), Gigabyte GA-X38T-DQ6 (DDR3)
Chipsets          Intel X38 Express
Graphics card     ATI Radeon HD 2900XT (Catalyst 7.10)
Memory            2 x 1024MB DDR2-800 (Qimonda), 2 x 1024MB DDR3-1333 (Aeneon)
Hard drive        4 x Western Digital RE in RAID-0 configuration
Power supply      Tagan TG480_U22
Operating system  Windows Vista Ultimate 32-bit

 

 

Memory performance

The basic result from the memory tests describes how quickly the processor can communicate with the memory subsystem. Beyond pure throughput, the time it takes to access particular memory cells is also important: the fewer CPU clock cycles required to access a memory cell (i.e. the lower the latency), the quicker the data can be read. Low latency is a particularly important factor in determining database performance, for example.

With 1,333MHz DDR3 memory, the Core 2 Extreme QX9650 has the highest throughput and lowest latency. Because of its slower 800MHz FSB, the Pentium 4 cannot keep up with the new 1,333MHz FSB Core 2 processors. The P4's NetBurst architecture also suffers from high latency when accessing 256-byte and 512-byte memory blocks.
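The gap between the two bus speeds can be estimated with a back-of-the-envelope calculation. The sketch below assumes a 64-bit (8-byte) wide FSB data path, which is not stated in this article, and it yields theoretical peak figures rather than the throughput measured in our tests.

```python
# Theoretical peak FSB bandwidth = transfer rate x bus width.
# Assumption (not from the article): the FSB data path is 64 bits wide.

def fsb_peak_gb_s(transfers_mhz: float, bus_width_bits: int = 64) -> float:
    """Peak bandwidth in GB/s (10^9 bytes) for a given effective transfer rate."""
    bytes_per_transfer = bus_width_bits / 8
    return transfers_mhz * 1e6 * bytes_per_transfer / 1e9


for label, rate in (("800MHz FSB", 800.0), ("1,333MHz FSB", 4000 / 3)):
    print(f"{label}: {fsb_peak_gb_s(rate):.2f}GB/s peak")
```

Real-world throughput is lower than these ceilings, but the ratio between them gives a sense of why the 800MHz FSB system falls behind.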

 
 

 

VMware 6.0: virtualisation performance

Virtual machines are becoming ever more common in enterprises. Our tests with VMware Workstation 6 and the Winstone suite of application benchmarks examine a processor's efficiency in virtualised IT environments.

The fact that Winstone is a somewhat elderly test suite is not a problem in this case: we're not testing application performance, but the efficiency of the processors in handling the VMware virtualisation.

Compared to the older single-core Pentium 4 processor, the quad-core chips show clear advantages when running virtual machines (VMs). In the first test, two Windows XP-based VMs are started and the Content Creation Winstone (CCWS) benchmark suite is run on each one. In each case, two CPU cores are available to the virtual machines. The quad-core chips are optimally employed, and therefore have clear advantages over single-core chips. The fastest quad-core system, with the QX9650 clocked at 3.33GHz, is about five times faster than the single-core 3GHz Pentium 4 system in this test. In the second test, which uses the CPU-intensive image-processing software Paint.Net, the overclocked quad-core system is seven times faster than the single-core system.

 
 

Image editing performance

To evaluate performance when running image-editing applications, we use the popular Panorama Factory program and the free Paint.Net tool. We also use the Java-based JAlbum to generate an HTML picture gallery containing 289 photos.

In these tests, the superiority of quad-core processors compared to the older single-core technology is evident. For example, with Paint.Net, the 3GHz Pentium 4 takes 136.7 seconds to complete a task that the 3GHz quad-core QX9650 manages in just 21.3 seconds. With Panorama Factory, the new 45nm QX9650 delivers similar performance to its 65nm QX6850 predecessor.
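The Paint.Net timings above translate into a speedup factor as follows. The two times come from the text; dividing the result by the core count is only a crude way of showing how much of the gain cannot come from the extra cores alone, and assumes the workload scales perfectly across all four.

```python
# Paint.Net completion times in seconds, as quoted in the text above.
PENTIUM4_S = 136.7
QX9650_S = 21.3
CORES = 4  # the QX9650 is a quad-core chip


def speedup(baseline_s: float, new_s: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_s / new_s


total = speedup(PENTIUM4_S, QX9650_S)
# Crude assumption: perfect scaling over four cores. Whatever remains must
# come from per-core improvements (architecture, cache, memory subsystem).
residual = total / CORES

print(f"Overall speedup: {total:.1f}x")
print(f"Residual per-core factor: {residual:.1f}x")
```

Since both chips run at 3GHz, a speedup above four cannot be explained by core count alone; the remainder reflects the newer microarchitecture.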

 

Video and sound editing

The video and sound editing tests involve converting a video file into the iPod format (H.264) and creating a 320Kbps MP3 file from a 450MB WAV file of Pink Floyd's Dark Side of the Moon.

At present, VirtualDub is the only software for creating DivX 6.7-encoded video that's optimised for the SSE4 instructions in the new quad-core chip. As a result, the QX9650 performs around 34 percent better than its predecessor, the QX6850, in this test. In all of the remaining tests, the performance of the two chips is almost identical.

 

Rendering performance

3D rendering software makes good use of multi-core CPUs. Compared to the single-core 3GHz Pentium 4, the quad-core chips are up to seven times faster. Under Cinema 4D, the larger cache in the new 45nm QX9650 also makes its presence felt, delivering an eight percent performance improvement over the 65nm quad-core QX6850.

 
 

3D gaming performance

Quad-core processors do not deliver their full potential with current 3D games: in practice, any theoretical advantage of the latest high-end technology (as seen in 3DMark 06) is masked by the performance of the graphics card. As the tests with F.E.A.R. and Call of Juarez show, even the older single-core Pentium 4 system can almost keep up with the quad-core Core 2 Extreme processors. None of our other tests show the Pentium 4 as close to the quad-core chips as this.

 
 

Conclusion

The $999 Intel Core 2 Extreme QX9650 running at 3GHz is worth considering despite its high price. This quad-core chip delivers excellent performance in the areas of image, video and sound editing; it also shines when running 3D rendering software or hosting several virtual machines. Our benchmarks incidentally show that quad-core chips running modern software offer big performance gains compared to single-core processors. In some tests, the quad-core chips are seven times faster than a Pentium 4 running at the same clock speed.

If 3D games are your thing, there's currently little advantage to be had from a quad-core processor. However, if developers build in more quad-core optimisation in the future, the picture will change fast, as the 3DMark 06 test suggests. Generally speaking, the more a program is parallelised (that is, the more threads it uses), the bigger the advantage a quad-core chip will deliver. Performance-hungry users will also appreciate the fact that Intel ships its Extreme series CPUs with the clock frequency multiplier unlocked, making them straightforward to overclock. The QX9650 functions happily at 3.33GHz (see benchmark results), although to get stable operation at 3.66GHz, the voltage must be increased and an efficient cooling solution implemented.
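The relationship between parallelisation and quad-core benefit can be illustrated with Amdahl's law, a standard textbook model that was not part of our benchmark methodology. The parallel fractions in this sketch are hypothetical values chosen for illustration, not measurements from any of the tested applications.

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n), where p is the fraction of
# the work that can run in parallel and n is the number of cores.
# The values of p used below are hypothetical, for illustration only.

def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup over one core for a given parallel fraction."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


for p in (0.0, 0.5, 0.9, 1.0):
    print(f"{p:.0%} parallel on 4 cores: {amdahl_speedup(p, 4):.2f}x")
```

In this model, a program that is only half parallelised gains far less than the fourfold ceiling, which is consistent with games that lean on one or two threads seeing little benefit from four cores.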

Compared to the 65nm QX6850, the new 45nm QX9650 chip delivers up to 15 percent better performance with conventional applications. It also consumes significantly less power. For potential buyers, then, the picture looks straightforward: the new quad-core chip costs the same, while performing slightly better and using considerably less energy.

Until AMD's Phenom arrives, which will probably be at the beginning of 2008, Intel remains the sole supplier of quad-core chips for desktop systems.

 
