Intel unveils Alder Lake hybrid architecture with efficient and performance cores

Covering desktop, laptop, and ultra mobile form factors, Alder Lake processors will combine cores optimised for lower power consumption with cores optimised for raw performance.


Intel has taken the wraps off its Alder Lake system-on-a-chip, which is the company's first processor to incorporate its efficient and performance cores.

Previously known as Gracemont, the efficient core in Alder Lake is designed to do what it says on the tin. Intel said that, compared to its late-2015 Skylake chips, the efficient core delivers 40% better single-threaded performance at the same power level, and 80% better performance when four efficient cores running four threads are compared to two Skylake cores running four threads.

With the new Intel 7 process, previously known as 10nm Enhanced SuperFin, Intel is able to fit four efficient cores in the same die space as a single Skylake core.

Intel is boasting that the core will have more accurate branch prediction thanks to its 5,000 entry branch target cache, and a 64KB instruction cache to "keep useful instructions close". The core will also have Intel's first on-demand instruction length decoder to generate pre-decode information, and a clustered out-of-order decoder that can decode six instructions per cycle.

The core also has 17 execution ports that include four integer ALUs, and can support up to 4MB of L2 cache.

The Alder Lake performance core, previously known as Golden Cove, arrives with six decoders, 12 execution ports, improved branch prediction, and quicker L1 cache. All up it performs around 19% better than the 11th generation Cypress Cove.

To help with matrix multiplication, which is useful when handling machine learning workloads, Intel has introduced its advanced matrix extensions.

To get the efficient and performance cores working together with operating systems, Intel has created a new scheduler it is calling Thread Director, and has been working with Microsoft to optimise it for Windows 11.

"Built directly into the hardware, Thread Director provides low-level telemetry on the state of the core and the instruction mix of the thread, empowering the operating system to place the right thread on the right core at the right time," Intel states.
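
The idea of using per-thread hints to choose a core type can be illustrated with a toy placement routine. This is only a sketch: the article does not describe Thread Director's actual interface or how Windows 11 uses it, so the `demand` score, the `Thread` class, and the greedy policy below are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    demand: float  # hypothetical 0..1 score standing in for hardware telemetry


def place_threads(threads, p_slots, e_slots):
    """Greedy toy policy: the most demanding threads get performance slots first."""
    placement = {}
    for thread in sorted(threads, key=lambda t: t.demand, reverse=True):
        if p_slots > 0:
            placement[thread.name] = "P"
            p_slots -= 1
        elif e_slots > 0:
            placement[thread.name] = "E"
            e_slots -= 1
        else:
            placement[thread.name] = "waiting"
    return placement


# Hypothetical workload: two heavy threads and two light background threads.
workload = [
    Thread("game", 0.9),
    Thread("backup", 0.2),
    Thread("encode", 0.8),
    Thread("mail", 0.1),
]
result = place_threads(workload, p_slots=2, e_slots=1)
print(result)
```

With two performance slots and one efficient slot, the two heaviest threads land on performance cores, the next one on an efficient core, and the last one waits. A real scheduler would also weigh priorities, power state, and thread migration cost.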

When the two types of cores are brought together for Alder Lake, the processor will support up to 16 cores, made up of eight performance cores and eight efficient cores, and up to 24 threads, with one thread per efficient core and two per performance core.
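
The thread count follows directly from the core mix, since only the performance cores run two threads each. A minimal sketch of the arithmetic, using the figures reported here (the function and constant names are illustrative):

```python
# Threads per core type, as reported for Alder Lake.
THREADS_PER_PERFORMANCE_CORE = 2
THREADS_PER_EFFICIENT_CORE = 1


def total_threads(performance_cores, efficient_cores):
    """Return the hardware thread count for a given mix of cores."""
    return (performance_cores * THREADS_PER_PERFORMANCE_CORE
            + efficient_cores * THREADS_PER_EFFICIENT_CORE)


# Top configuration: eight performance cores plus eight efficient cores.
total_cores = 8 + 8
threads = total_threads(8, 8)
print(total_cores, threads)  # 16 24
```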

Alder Lake will support DDR5-4800, LP5-5200, DDR4-3200, and LP4x-4266 memory, as well as 16 lanes of PCIe Gen 5. Intel said the compute fabric of Alder Lake can handle 1000GBps, I/O fabric can do 64GBps, and memory fabric can hit 204GBps.
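
The numbers in the memory names are transfer rates in megatransfers per second, so a rough peak bandwidth can be worked out from them. The sketch below assumes a 64-bit data path for the memory figure and uses the PCIe 5.0 rate of 32 GT/s per lane with 128b/130b encoding; the number of memory channels on any given Alder Lake chip is not stated here, so the memory result is per 64 bits of bus width only.

```python
def memory_peak_gb_per_s(mega_transfers_per_s, bus_width_bits=64):
    """Theoretical peak bandwidth for a given transfer rate and bus width."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer / 1e9


def pcie5_raw_gb_per_s(lanes):
    """Raw PCIe 5.0 rate per direction: 32 GT/s per lane, one bit per transfer."""
    return lanes * 32e9 / 8 / 1e9


def pcie5_usable_gb_per_s(lanes):
    """Rate per direction after 128b/130b line encoding overhead."""
    return pcie5_raw_gb_per_s(lanes) * 128 / 130


ddr5 = memory_peak_gb_per_s(4800)   # about 38.4 GB/s per 64-bit channel
raw_x16 = pcie5_raw_gb_per_s(16)    # 64 GB/s per direction
usable_x16 = pcie5_usable_gb_per_s(16)  # about 63 GB/s per direction
print(round(ddr5, 1), round(raw_x16, 1), round(usable_x16, 1))
```

The raw x16 figure lines up with the 64GBps quoted for the I/O fabric, although Intel did not say how that number was derived.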

The new architecture will power chips that draw from nine watts up to 125 watts of power.

Intel also provided more detail on its Arc consumer GPUs, which are set to arrive next year. The first chip, dubbed Alchemist, will have cores with 16 vector and 16 matrix engines, support for DirectX and Vulkan ray tracing, and be fabricated on TSMC's N6 process.

The GPU will have upscaling technology, called XeSS, that uses neural networks to reconstruct frames. Intel has said the technology will allow 4K frames to be rendered on integrated graphics.

Sapphire Rapids and data centre accelerators

Intel has also taken the wraps off its next Xeon Scalable processor which was formerly known as Sapphire Rapids. Making use of only performance cores, the chip is also built on the Intel 7 process.

"Sapphire Rapids provides a single balanced unified memory access architecture, with every thread having full access to all resources on all tiles, including caches, memory and I/O. The result offers consistent low-latency and high cross-section bandwidth across the entire SoC," the company said.

The chip also uses advanced matrix extensions and has a number of accelerators, including an accelerator interfacing architecture for attached devices; a data streaming accelerator (DSA) to offload data movement tasks that cause overhead, which Intel said makes 39% more CPU cycles available for compute functionality when enabled; and quick assist technology for cryptography and data compression, which Intel said allows 98% of the workload for such tasks to be offloaded.

When running microservice workloads, Intel said that, compared to its 2018 Cascade Lake, this year's Ice Lake had 24% better performance, and Sapphire Rapids delivered a nice 69% performance boost over the 2018 Xeon.
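
Both figures are quoted against Cascade Lake, so the implied gain of Sapphire Rapids over Ice Lake has to be derived by dividing the two ratios. The sketch below assumes both percentages come from the same workload and baseline; the result is a derived estimate, not a number Intel stated.

```python
# Gains over the Cascade Lake baseline, as quoted by Intel.
ICE_LAKE_GAIN = 0.24
SAPPHIRE_RAPIDS_GAIN = 0.69


def relative_gain(newer_gain, older_gain):
    """Gain of the newer chip over the older one, given gains over a shared baseline."""
    return (1 + newer_gain) / (1 + older_gain) - 1


gain = relative_gain(SAPPHIRE_RAPIDS_GAIN, ICE_LAKE_GAIN)
print(f"{gain:.1%}")  # roughly 36%
```

Simply subtracting the percentages (69 minus 24) would overstate the generational improvement, because each percentage is measured against the older baseline.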

On the infrastructure processing unit (IPU) front, covering chips that take on infrastructure workloads offloaded in cloud systems, Intel introduced its first ASIC IPU, called Mount Evans, which was "developed hand-in-hand with a top cloud service provider". Mount Evans has up to 16 Arm Neoverse N1 cores for compute, can support four host Xeons, and has a hardware-accelerated NVMe storage interface that was "scaled up from Intel Optane technology" to emulate NVMe devices.

The company also unveiled its Oak Springs Canyon IPU to handle offloaded network and storage virtualisation functions, and the Arrow Creek SmartNIC for packet processing.