The GTX 780 Ti is unleashed on PowerColor’s 290X OC
Last autumn, the U.S. Department of Energy’s Oak Ridge National Laboratory (ORNL) launched a new era of scientific supercomputing with Titan, which took the title of the then-fastest supercomputer in the world. Titan was made possible by using Nvidia’s Tesla K20X accelerators to handle ninety percent of the computing load. The same basic GK110 GPU is in the GTX Titan and now also in the GTX 780 Ti.
Using 18,688 Nvidia Tesla K20X GPU accelerators, the Titan supercomputer took the world’s top spot with a performance record of 17.59 petaflops as measured by the LINPACK benchmark. Best of all, the Tesla K20X accelerator is energy efficient: Titan achieved 2,142.77 megaflops of performance per watt, which surpassed the energy efficiency of the number one system at the time on the Green 500 list of the world’s most energy-efficient supercomputers.
The GTX Titan and the GTX 780 Ti use the same GK110 GPU that is also used for Nvidia’s Kepler architecture K20 family of accelerators, the K20 and the K20X. The K20X provides the highest computing performance ever available in a single processor, surpassing all other processors on two common measures of computational performance: 3.95 teraflops single-precision (SP) and 1.31 teraflops double-precision (DP) peak floating point performance.
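The LINPACK score and the efficiency figure above together imply a total power draw, which is a useful sanity check on the two numbers. The short Python sketch below works it out; the function and variable names are ours for illustration, and only the 17.59 petaflops and 2,142.77 megaflops per watt figures come from the text.

```python
# Figures quoted above for the Titan supercomputer.
LINPACK_PETAFLOPS = 17.59
MEGAFLOPS_PER_WATT = 2142.77


def implied_power_megawatts(petaflops: float, mflops_per_watt: float) -> float:
    """Return the power draw (in MW) implied by a score and an efficiency rating."""
    flops = petaflops * 1e15                  # petaflops -> flops
    flops_per_watt = mflops_per_watt * 1e6    # megaflops/W -> flops/W
    watts = flops / flops_per_watt
    return watts / 1e6                        # watts -> megawatts


titan_power_mw = implied_power_megawatts(LINPACK_PETAFLOPS, MEGAFLOPS_PER_WATT)
print(f"Implied power draw during LINPACK: {titan_power_mw:.2f} MW")
```

By this arithmetic the system would have drawn roughly 8.2 megawatts during the benchmark run.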
Differences between the GTX 780 Ti and Titan
The GeForce Titan can also run double-precision compute at one-third of single-precision speeds, giving over 1 teraflop double-precision peak floating point performance. Nvidia is actually looking to grow their CUDA ecosystem by providing Titan GeForce cards to programmers on a budget, since the K20X costs over $3,000. In contrast, there is no significant double-precision performance available on the GTX 780 Ti for compute; all of its resources are devoted to performance gaming.
The K20 family also includes the Tesla K20 accelerator, which provides 3.52 teraflops of single-precision and 1.17 teraflops of double-precision peak performance. The K20X is clocked at 732MHz for the core clock and 5.2GHz for the memory clock, while the K20 is clocked slightly lower at 706MHz with the same memory clock. The GTX Titan is clocked at 837MHz on the core clock with a Boost to 876MHz, and the memory is clocked at 6GHz. In contrast, the GTX 780 Ti is clocked higher at 875MHz, with a more aggressive 928MHz boost backed up by a significantly higher TDP. The GTX 780 Ti is a pure gaming card!
The new GTX 780 Ti is upgraded over the K20 as it has all 15 instead of 13 SMX units enabled, which is also one more than Titan. It is the gaming card equivalent of the latest Quadro K6000, which costs in the $5,000 range.
Pictured here is Nvidia’s GK110 Kepler GPU, which at 7.1 billion transistors is the most complex piece of silicon anywhere. Although we are primarily gamers, we realize that the very same GPUs that power supercomputers as Tesla are used in our GeForce video cards. ABT has been following GK110 closely since we covered Nvidia’s GTC 2012, where the new architecture was unveiled.
A completely functional GK110 GPU such as the GTX 780 Ti is made up of 15 SMXes for a total of 2880 CUDA cores (192 x 15). However, for yield purposes and to differentiate faster, more complex, and more expensive processors from less expensive, slower and less complex products, parts are often disabled and clockspeeds lowered. In both Titan and the K20X, 14 SMXes are enabled for a total of 2688 CUDA cores. The K20 has 13 SMXes enabled and the GTX 780 has 12 SMXes. The top Quadro K6000 is also fully enabled.
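The core counts above, combined with the core clocks quoted earlier, are enough to estimate peak single-precision throughput. The following Python sketch is only a back-of-the-envelope check: it assumes each CUDA core retires one fused multiply-add (two floating point operations) per clock, and it uses base clocks rather than Boost clocks. The table and function names are hypothetical.

```python
CORES_PER_SMX = 192            # from the article: 192 CUDA cores per SMX
FLOPS_PER_CORE_PER_CLOCK = 2   # assumption: one fused multiply-add per clock

# (enabled SMXes, base core clock in MHz), as quoted in the article.
PARTS = {
    "Tesla K20": (13, 706),
    "Tesla K20X": (14, 732),
    "GTX Titan": (14, 837),
    "GTX 780 Ti": (15, 875),
}


def cuda_cores(smx_count: int) -> int:
    """CUDA cores available with the given number of enabled SMXes."""
    return smx_count * CORES_PER_SMX


def peak_sp_teraflops(smx_count: int, clock_mhz: float) -> float:
    """Theoretical peak single-precision teraflops at the given core clock."""
    flops = cuda_cores(smx_count) * FLOPS_PER_CORE_PER_CLOCK * clock_mhz * 1e6
    return flops / 1e12


for name, (smx_count, clock_mhz) in PARTS.items():
    cores = cuda_cores(smx_count)
    tflops = peak_sp_teraflops(smx_count, clock_mhz)
    print(f"{name:<11} {cores} cores  ~{tflops:.2f} TFLOPS SP at base clock")
```

Under these assumptions the K20 works out to about 3.52 teraflops, matching the quoted figure, while the K20X comes to about 3.94 teraflops, slightly under the quoted 3.95, so the published number presumably reflects rounding or a marginally different clock. Dividing the Tesla results by three reproduces the quoted double-precision figures of 1.17 and 1.31 teraflops.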
The memory subsystem of GeForce GTX 780 Ti consists of six 64-bit memory controllers (384-bit) with 3GB of GDDR5 memory rated for 7000MHz, an upgrade over Titan and the GTX 780’s 6000MHz data rate.
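To put the memory upgrade in perspective, peak memory bandwidth follows directly from the bus width and the effective data rate. The Python sketch below shows the arithmetic; the function name is ours, and it assumes that Titan and the GTX 780 use the same 384-bit bus as the GTX 780 Ti.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_mhz: float) -> float:
    """Peak memory bandwidth in GB/s for a bus width and effective data rate."""
    bytes_per_transfer = bus_width_bits / 8           # bits -> bytes per transfer
    return bytes_per_transfer * data_rate_mhz * 1e6 / 1e9


BUS_WIDTH_BITS = 6 * 64  # six 64-bit memory controllers = 384 bits

gtx_780_ti_bw = memory_bandwidth_gb_s(BUS_WIDTH_BITS, 7000)
# Assumption: Titan and the GTX 780 share the same 384-bit bus width.
titan_and_780_bw = memory_bandwidth_gb_s(BUS_WIDTH_BITS, 6000)

print(f"GTX 780 Ti:        {gtx_780_ti_bw:.0f} GB/s")
print(f"Titan and GTX 780: {titan_and_780_bw:.0f} GB/s")
```

On these numbers the faster memory gives the GTX 780 Ti 336 GB/s against 288 GB/s, an increase of about 17 percent.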
The base clock speed of the GeForce GTX 780 is 863MHz. The typical Boost Clock speed is 900MHz. The Boost Clock speed is based on the average GeForce GTX 780 card running a wide variety of games and applications. In contrast, the GTX 780 Ti has a base clock of 875MHz and boosts to a minimum guaranteed 928MHz.
The GeForce GTX 780 reference board measures 10.5” in length. Display outputs include two dual-link DVIs, one HDMI and one DisplayPort connector. One 8-pin PCIe power connector and one 6-pin PCIe power connector are required for operation. The GeForce GTX 780 Ti will be taking the flagship place of the GeForce GTX 780 in Nvidia’s lineup.
The Specifications
Here are the GTX 780 specifications as released in Nvidia’s chart.
Now compare to the GTX Titan and notice that although the GTX 780 is lacking in CUDA cores, it has a higher clockspeed, boost and TDP:
Now here are the specifications for the GTX 780 Ti:
Features
We have covered Boost 2.0, Adaptive Vsync, and FXAA/TXAA anti-aliasing in all of our previous Kepler GPU coverage. New are GeForce Experience, which includes ShadowPlay; G-Sync, which syncs the display to the GPU; a more balanced power delivery for the GTX 780 Ti; and much more.