NVIDIA has launched its next-generation GPU based on the Kepler architecture, and followed it up with a rather quick update to the CUDA toolkit. Since we have access to three generations of GTX cards (480, 580 and 680), we thought we would showcase how performance has changed over the generations.
It can be seen that the GTX 680 comfortably breaches the 1 teraflop mark for single precision, while the GTX 580 barely scratches it. However, the 680's performance peaks around 2048 x 2048 and then falls off to match the GTX 580 at larger sizes. The high-end Tesla C2070 finishes last for single precision, behind the third-placed GTX 480.
For double precision, the C2070 is, as expected, well ahead of the pack. The most interesting result here is that the GTX 680 finishes dead last, behind both of its predecessors. At about 1/10th of its single precision performance, the 680 is about half as fast as the 580, which settles at roughly 1/5th of its own single precision performance.
Fast Fourier Transform:
The performance gain moving from the 480 to the 580 is significant (~20%), while the 680 does not show a big win over its immediate predecessor. The Fast Fourier Transform is an interesting benchmark in that these cards run out of memory before peak performance is reached. At 2GB, the 680 can hold two 8192×8192 single precision complex matrices, but the scratch space required by the algorithm is more than the free space that remains. All the transforms were 2D, real-to-complex.
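To make the memory argument concrete, here is a rough back-of-the-envelope sketch of the footprint. It assumes 4 bytes per single precision value and a nominal 2 GiB card, and it ignores whatever the driver, the CUDA context and the display already occupy. The helper name `complex_matrix_bytes` is ours, for illustration only, and the FFT workspace itself is not modelled because its size depends on the library and the plan.

```python
# Rough memory footprint of single precision complex matrices on a 2 GiB card.
# Assumption: 4 bytes per float, 2 floats (real + imaginary) per complex element.

GIB = 1024 ** 3


def complex_matrix_bytes(rows, cols, bytes_per_real=4):
    """Bytes needed to store a rows x cols complex matrix."""
    return rows * cols * 2 * bytes_per_real


n = 8192
one_matrix = complex_matrix_bytes(n, n)   # a single 8192 x 8192 complex matrix
two_matrices = 2 * one_matrix             # e.g. an input and an output buffer
card_memory = 2 * GIB                     # nominal capacity, not actual free memory
remaining = card_memory - two_matrices    # upper bound on space left for scratch

print(f"one matrix:   {one_matrix / GIB:.2f} GiB")
print(f"two matrices: {two_matrices / GIB:.2f} GiB")
print(f"remaining:    {remaining / GIB:.2f} GiB (before driver and workspace overhead)")
```

The two matrices alone take half of the nominal capacity, so the space left for scratch buffers is at most 1 GiB, and less in practice once other allocations are counted.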
Here the GTX 680 starts off strong before losing out to the GTX 580, and eventually to the 480 as well. We are using the same radix sort algorithm for all the benchmarks, so it is really astonishing that the 680 is more than 20% slower at peak.