VITA news By Ray Alderman

Multicore disparities

The last time the CPU makers got into a flap, it was Intel's Complex Instruction Set Computers (CISC) versus the CPU makers (HP-PA, Sun SPARC, DEC Alpha, Motorola 88000, and the MIPS processors) and their Reduced Instruction Set Computers (RISC) machines. Intel won the battle because of all the CISC code running on low-end PCs. The installed base of UNIX code running on those machines at the time just couldn't hold any significant market share at the desktop. But Intel did adopt some RISC techniques in their x86 cores after the dust settled.

From my reading lately, it looks like the CPU makers are starting their flap over multicore concepts: homogeneous cores versus heterogeneous cores. In a homogeneous core architecture, all the cores in the CPU are exactly the same, and they have the same instruction set. Of course, this is the Intel approach to multicore. They won the CISC versus RISC battle because of the huge installed base of CISC-based code running on those cheap and slow PCs, so they are playing the same card again in this flap.

One myth about homogeneous-core processors is that you can run the application code through a parallelizing compiler, find those segments that are independent, and run them on different cores simultaneously. That should speed up the code execution dramatically. Well, yes, it would in a perfect world. Gene Amdahl studied this problem for many years, and he discovered that very little applications code had any parallelism that can be exploited. His discoveries resulted in Amdahl's Law, which showed the computing world that chasing parallelism in code was a great waste of time. So using parallelism as a justification for homogeneous-core architectures is, by definition, another great waste of time.

For many decades, we have been building multiprocessor systems using homogeneous CPU chips. This is the same concept as the new homogeneous multicore CPUs. We discovered the knee in the curve early on, in which each additional processor contributes less than 100 percent increase in performance. We have seen four-processor servers outperform six-processor servers hands down. The overhead associated with interprocessor communications and shared resources exposes such machines to the Law of Diminishing Returns after four CPUs, particularly when parallel buses, shared memory, and cache-coherency protocols are used.

A heterogeneous multicore architecture would consist of highly specialized Application Specific Processor cores (ASPs) with their own unique instruction sets. Consider this: The primary core would be a generic processor for running basic programs. Next, you would have a graphics core to handle all graphics and display algorithms. Additionally, you would have a Protocol Offload Engine (POE), a communications processor core that will handle all communications tasks. Plus, you could have a math-oriented processor core (such as a DSP core) to handle all the heavy calculations that some of the applications programs may have. Now, you have an architecture that can show some serious increases in performance over homogeneous core concepts without all the voodoo parallelism promises.

Yes, you can write all that graphics, communications, and math code and run it on those homogeneous cores. But that code and those cores will show terrible performance compared to a heterogeneous group of ASPs. However, I expect to see Intel maintain their position on homogeneous multicores and either write that code or get Microsoft to write it.

The companies supporting heterogeneous multicore concepts are AMD and IBM, with more joining the party every day. AMD bought ATI (a graphics chip company) a year or so ago, so AMD already has a graphics core. Additionally, AMD has licensed that ATI graphics core to Freescale Semiconductor and will probably license it to more companies. AMD and IBM have packet-thrashing cores, for example, those protocol and communications offload engines from their Ethernet and other communications chips. The DSP cores can be found in many places to handle math-oriented applications, so combining a very efficient and cost-effective heterogeneous multicore processor is relatively easy. And the heterogeneous core concept lets the CPU makers choose best of class cores and software for the other functions, rather than shoehorn some slow, kludgy code into a clumsy and inelegant solution on a homogeneous multicore processor chip. Additionally, I suspect great power and heat dissipation savings are associated with heterogeneous multicore chips.

It is clear to me that the heterogeneous multicore concept makes a lot of sense. But similar to the RISC versus CISC battles of the ’80s, my opinion ignores all that kludgy CISC code running on all those poorly performing PCs available today. To compete with AMD's heterogeneous core model, Intel would have to buy NVIDIA for big bucks. Or they could take the cheap route and just write some graphics-thrashing code for one of their x86 cores and pawn it off on the market.

This homogeneous versus heterogeneous multicore flap will continue for a while and will surely be interesting to watch. I think if you examine the homogeneous multicore model, you will discover a lot of shortcomings.

For more information, contact Ray at [email protected].
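
To put rough numbers on the Amdahl's Law and diminishing-returns arguments in this column, here is a minimal Python sketch. The first function is the usual statement of Amdahl's Law, where speedup on n cores is 1 / ((1 - p) + p / n) for a parallelizable fraction p. The second function adds a communication-overhead term; that term, its pairwise form, and the 0.01 coefficient are assumptions invented for illustration, not measurements from any real server or chip.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's Law: speedup when only `parallel_fraction` of the work scales."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def speedup_with_overhead(parallel_fraction: float, cores: int,
                          pair_cost: float) -> float:
    """Amdahl's Law plus a hypothetical interprocessor-communication cost.

    Assumption (not from the article): every pair of cores adds `pair_cost`
    to the normalized run time, standing in for bus contention, shared
    memory, and cache-coherency traffic.
    """
    serial = 1.0 - parallel_fraction
    overhead = pair_cost * cores * (cores - 1) / 2  # number of core pairs
    return 1.0 / (serial + parallel_fraction / cores + overhead)


if __name__ == "__main__":
    p = 0.9        # assumed parallel fraction, for illustration only
    cost = 0.01    # assumed per-pair overhead, for illustration only
    for n in (1, 2, 4, 6, 8):
        ideal = amdahl_speedup(p, n)
        real = speedup_with_overhead(p, n, cost)
        print(f"{n} cores: ideal {ideal:.2f}x, with overhead {real:.2f}x")
```

With these made-up inputs, the overhead curve peaks and then falls as cores are added, which is the knee described in the multiprocessor paragraph. Different assumptions about the parallel fraction or the overhead model would move that knee.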