A team of researchers at the University of California, Davis has built what it describes as the world's first chip with 1,000 processors, using a design that overcomes power constraints common to multi-core processors.
The chip, which was unveiled last week, is capable of performing 1.78 trillion instructions per second and contains 621 million transistors, according to the university.
In 2010, researchers at the University of Glasgow and the University of Massachusetts Lowell produced a 1,000-core computer processor, but that design was based on a field-programmable gate array (FPGA). An FPGA is a chip whose core logic is reconfigurable using software.
Also that year, Intel produced an experimental 48-core processor, which it said could theoretically be scaled to 1,000 cores. Even before that, in 2006, Rapport said it was planning to deliver 1,024-core chips based on IBM technology.
Now, not only does the new UC Davis chip pack 1,000 cores, but the researchers also claim it is the most energy-efficient many-core processor ever built.
The chip can be tuned to run on the equivalent of a single AA battery. It can, for example, perform 115 billion instructions per second while dissipating 0.7W.
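To put those two figures in perspective, dividing power by throughput gives the energy spent per instruction. The short Python sketch below does that arithmetic using only the numbers quoted above; the variable names are illustrative and the result is a back-of-the-envelope estimate, not a figure published by the researchers.

```python
# Figures quoted in the article
instructions_per_second = 115e9  # 115 billion instructions per second
power_watts = 0.7                # power dissipated at that rate

# Energy per instruction (joules) = power (J/s) / throughput (instructions/s)
energy_per_instruction_j = power_watts / instructions_per_second
energy_per_instruction_pj = energy_per_instruction_j * 1e12  # joules -> picojoules

print(f"{energy_per_instruction_pj:.2f} pJ per instruction")  # prints 6.09 pJ per instruction
```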
The 1,000-core chip was built by IBM using its 32nm CMOS technology with funding from the Department of Defense. As noted in a research paper, the dimensions of the entire array are 7.94mm by 7.82mm, with 18 processors taking up about 1 square millimeter of space.
The chip also has 1,000 packet routers and 12 memory modules that handle data and instruction requests. Each core can operate at an average maximum clock frequency of 1.78GHz.
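The headline throughput of 1.78 trillion instructions per second lines up with the core count and clock speed. The Python sketch below multiplies the two, under the assumption (not stated in the article) that each core completes one instruction per clock cycle.

```python
cores = 1000
max_clock_hz = 1.78e9  # average maximum clock frequency per core, from the article

# Assumption, not from the article: each core completes one instruction per cycle
instructions_per_cycle = 1

peak_instructions_per_second = cores * max_clock_hz * instructions_per_cycle

print(f"{peak_instructions_per_second / 1e12:.2f} trillion instructions per second")
# prints 1.78 trillion instructions per second
```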
Bevan Baas, professor of electrical and computer engineering at UC Davis, explained that the design of the KiloCore chip is also more efficient and flexible than GPUs since each of the 1,000 processor cores can be independently programmed.
GPUs use "single instruction, multiple data" (SIMD) operations for parallel computing, while the KiloCore uses "multiple instruction, multiple data" (MIMD) operations.
"The idea is to break an application up into many small pieces, each of which can run in parallel on different processors, enabling high throughput with lower energy use," Baas said.
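The difference between SIMD and MIMD can be shown with a toy example. The Python sketch below is purely conceptual: it runs sequentially on an ordinary CPU, and the function names `simd_step` and `mimd_step` are hypothetical, not part of any KiloCore or GPU toolchain. In the SIMD case one instruction is applied to every data element; in the MIMD case each simulated core runs its own program on its own data.

```python
def simd_step(instruction, data):
    """Apply one shared instruction to every data element (SIMD-style)."""
    return [instruction(x) for x in data]


def mimd_step(programs, data):
    """Let each simulated core run its own program on its own element (MIMD-style)."""
    return [program(x) for program, x in zip(programs, data)]


data = [1, 2, 3, 4]

# SIMD: every lane doubles its input
simd_result = simd_step(lambda x: x * 2, data)

# MIMD: four independently programmed "cores"
programs = [lambda x: x + 1, lambda x: x * x, lambda x: -x, lambda x: x // 2]
mimd_result = mimd_step(programs, data)

print(simd_result)  # [2, 4, 6, 8]
print(mimd_result)  # [2, 4, -3, 2]
```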
The researchers have also created a compiler and mapping toolset for programming the chip.
"Programming is accomplished by a multi-step process including a mapping step that assigns programs to processors," the researchers note.
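The article does not describe how the researchers' mapping step works. As a rough illustration of what assigning programs to processors can mean, the Python sketch below places each piece of a hypothetical application on its own core index. The function `map_tasks_to_processors`, the task names and the one-task-per-core policy are all assumptions made for this example, not the researchers' toolset.

```python
def map_tasks_to_processors(tasks, num_processors):
    """Assign each task to its own processor index, in order (hypothetical policy)."""
    if len(tasks) > num_processors:
        raise ValueError("more tasks than processors")
    return {task: core for core, task in enumerate(tasks)}


# Hypothetical pieces of an application, each small enough for one core
tasks = ["read", "filter", "transform", "write"]
mapping = map_tasks_to_processors(tasks, num_processors=1000)

print(mapping)  # {'read': 0, 'filter': 1, 'transform': 2, 'write': 3}
```

A real mapper would also have to account for communication between neighboring cores and for the shared memory modules, which this sketch ignores.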
They've also developed several applications for the chip, including wireless coding and decoding, video processing, encryption, datacenter record processing, and scientific data applications.