The point of these articles is to paint the high-level picture of trends in computer processing. I hope this bigger picture will help summarize things for those who do not breathe computer processors and technical software on a daily basis.
Over the last 20 years, big gains in computer processing have been defined by increases in CPU clock speeds, then by increases in the number of CPU cores. The next 10+ years will be defined by heterogeneous computing.
So let’s start with a definition: Heterogeneous computing is the coordination of 2 or more different processors, of different architecture types, to perform a computational task. “Architecture type” is defined below.
In practice, that means that you actually have more than one processor in your computer. You have the CPU that you already know about (of x86 architecture type), but you also have another processor (of a different architecture type). We’ll call that other processor an “accelerator,” because it accelerates computations by assisting the CPU to get stuff done.
Stop and be amazed. Did you realize that your computer actually already has an accelerator in it? Yes it does. You may even have several accelerators already in your computer. Some of those may even be more powerful in terms of ability to do processing than your CPU. However, it is very likely that you are not using those accelerators. They are probably just sitting there unused, like an expensive gym membership in March.
OK, let me explain that in more detail.
The Rise of GPU Computing
In your computer, you have a GPU (graphics processing unit), which is an accelerator to the CPU. You probably know of the GPU (which typically resides on a video card) as the thing that drives your computer monitor. You plug your monitor into your video card/GPU.
Innovations in GPUs over the last 20 years have been primarily driven by the demand for more awesome video games. Video games have advanced tremendously, going from Mario Brothers (which simply had some pre-computed pixel patterns that are the same every time you play it) to the games today, which are extraordinarily complex in how they perform physics calculations of wind blowing, water flowing, trees swaying, and all sorts of other physical phenomena. In games, GPUs can actually do all those physics calculations on the fly to determine the color of the pixels to be sent to the monitor. It still blows my mind that there is a company purely dedicated to creating physical models for trees in video games (those physics calculations are extremely complex, and the GPU has to be a beast of a processor to handle them).
Anyway, in order to support all of those physics calculations, GPUs have advanced from merely displaying to the monitor to actually having incredible capabilities to do math computations. The processing capability of the GPU is not limited to video games. In fact, any software program can use the GPU to do computations. You can do math on GPUs. You can do financial calculations on GPUs. You can do genomic sequencing on GPUs. Radiologists can find tumors in MRI scans using GPUs. Here are some examples of things our users have done with GPUs.
In terms of sheer capacity to crunch numbers, GPUs can perform more arithmetic operations per second than CPUs, as long as the work can be split into many independent pieces. They have thousands of cores for number crunching. They also typically use less energy per computation than CPUs. Note that a single GPU core is not nearly as capable as a CPU core in terms of the kinds of things it can do, but there are many more GPU cores available.
GPUs are also ubiquitous. Nearly every computer, smartphone, and tablet has a GPU in it. It is fitting that this article is published today, on the eve of NVIDIA’s GPU Technology Conference.
Software for Heterogeneous Computing
So it is really cool that these computational powerhouse accelerators are in all of our computers.
What is not so cool is that most software is incapable of actually running on those accelerators. In order to use accelerators, software must be rewritten.
Rewriting software requires an understanding of parallelism, which means splitting a job into pieces that can be worked on at the same time. I will write a future post in this series to address parallelism for dummies. The take-home message is that software written for parallel computing runs circles around other software, because it is able to use the multiple cores of the CPU as well as the many cores of accelerators, like GPUs.
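To make "written for parallel computing" a little more concrete, here is a minimal Python sketch of the basic idea: cut the data into independent chunks, hand each chunk to a separate worker, then combine the partial answers. The function names (`split_into_chunks`, `sum_of_squares`, `parallel_sum_of_squares`) are made up for this illustration, and the sum-of-squares job is just a stand-in for real work. One caveat: Python threads show the structure of the approach, but they generally will not speed up pure-Python arithmetic. Real programs would hand the chunks to separate processes or to an accelerator.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(values, n_chunks):
    """Split a list into n_chunks contiguous pieces of nearly equal size."""
    size, extra = divmod(len(values), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one leftover element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(values[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The work for one worker. It needs nothing from the other chunks."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(values, workers=4):
    """Split the job, run the pieces concurrently, then combine the results."""
    chunks = split_into_chunks(values, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = list(pool.map(sum_of_squares, chunks))
    return sum(partial_sums)


if __name__ == "__main__":
    data = list(range(1000))
    # Both approaches must give the same answer; only the route differs.
    print(sum_of_squares(data), parallel_sum_of_squares(data))
```

The key property is that each chunk can be processed without knowing anything about the others. Work with that shape is what spreads well across the many cores of a CPU or a GPU.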
The Next Decade: A Tidal Wave of Heterogeneous Computing
The rise and success of GPU computing over the last 5 years, with NVIDIA as the leading hardware vendor, has solidified the validity of accelerators and provoked a tidal wave of oncoming heterogeneous computing systems. Following NVIDIA, here is a list of other companies and their accelerators pushing heterogeneous computing as the primary path to computational performance increases over the coming decade:
- Intel Xeon Phi – released this year as a 60+ core accelerator (they prefer the term “co-processor”). It is a new processor architecture in which many cores, derived from an older x86 CPU design, are connected in a ring.
- Intel integrated graphics – this is Intel’s GPU. It is not as capable as NVIDIA’s or AMD’s GPUs, but it comes on the same chip as many of Intel’s CPUs and is probably the most ubiquitous GPU today for that reason.
- AMD FirePro and Radeon – these are the only other first-rate GPUs for desktops and servers.
- AMD APUs – this is AMD’s merger of Radeon technology onto the same chip as the AMD CPU (i.e. APU = CPU + GPU). The GPU AMD is putting on APUs today is not as powerful as the full Radeon GPU, but it can still be used as an accelerator.
- Altera FPGAs – these used to be restricted to very niche markets, but with the tidal wave towards heterogeneous computing, Altera’s FPGAs and FPGAs from other vendors will be considered as a viable option for many more applications.
In addition to those, all smartphones and tablets have CPUs and GPUs. All these heterogeneous computing concepts apply equally well to getting more performance out of mobile apps. I will discuss mobile heterogeneous computing in a later post in this series.
With all of these major companies pushing accelerators, heterogeneous computing is easily the biggest trend in computing for the coming decade.
What indications of heterogeneous computing trends have you noticed?
- Architecture type refers to the blueprint used to create the processor. Software that will run on one architecture will not run on another architecture without modification, either to the code itself or to the way it is compiled. AMD and Intel use the same blueprint, called “x86”, so software compiled for AMD will generally run on Intel and vice versa. Heterogeneous computing involves many different processors of different architecture types. Developing software for all those different types is complicated and will be described in a future post in this series.
- When I have described this to people in the past, they often get overexcited about accelerators and ask, “When will accelerators overtake CPUs, and when will we no longer need CPUs?” That question misses the point. If your computer were an army, CPUs would be the generals – highly capable and extremely efficient at command and control. Accelerators would be the foot soldiers: massive numbers of production units, but not as capable at decision-making.
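If you are curious which "architecture type" your own CPU reports, a few lines of Python will tell you. This is only a rough sketch: `platform.machine()` is part of Python's standard library, but the `X86_NAMES` set and the two helper functions are my own illustrative guesses at common spellings, not an official list.

```python
import platform

# Assumption: common names that operating systems report for x86-family CPUs.
X86_NAMES = {"x86_64", "amd64", "x86", "i386", "i586", "i686"}


def is_x86(machine_name):
    """Return True if the reported machine name looks like an x86 variant."""
    return machine_name.lower() in X86_NAMES


def describe_cpu_architecture():
    """Report the CPU architecture name that the operating system exposes."""
    machine = platform.machine()
    if not machine:
        # platform.machine() returns an empty string when it cannot tell.
        return "unknown architecture"
    family = "x86 family" if is_x86(machine) else "not x86"
    return f"{machine} ({family})"


if __name__ == "__main__":
    print(describe_cpu_architecture())
```

Note that this reports only the CPU. Accelerators such as GPUs have their own architecture types, which is exactly why software has to be built differently for them.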
Posts in this series:
- CPU Processing Trends for Dummies
- Heterogeneous Computing Trends for Dummies
- Parallel Software Development Trends for Dummies