Nvidia’s GTX 650 Ti arrives to round out the Kepler line-up
Architecture and Features
We have covered Kepler’s GK architecture in a lot of detail previously. You can read our GTX 680 introductory article and its follow-up. We also covered the launch of the GTX 690, the launch of the GTX 670, the launch of the GTX 660 Ti and the launch of the GTX 660 and GTX 650. The new Kepler architecture builds on the Fermi architecture with some important improvements and refinements that we will briefly cover here before we get into performance testing.
SMX architecture
As Nvidia’s slide indicates, the new Kepler architecture is called SMX and it emphasizes 2x the performance per Watt over Fermi. Their multi-threaded engine handles all of the information using three graphics processing clusters including the raster engine and two streaming multi-processors.
The Fermi SM is now called the SMX cluster. Each SMX cluster includes a Polymorph 2.0 engine, 192 CUDA cores, 16 texture units and a lot of high-level cache. In the GTX 680, four raster units and 128 texture units comprise 32 ROPs; eight geometry units each have a tessellation unit, and there is more lower-level cache. The GTX 670 and the GTX 660 Ti each have 4 graphics engines but one fewer SMX unit and only 24 ROPs.
The GeForce GTX 650 Ti ships with 768 CUDA Cores and four SMX units. The memory subsystem of the GeForce GTX 650 Ti consists of two 64-bit memory controllers (128-bit) with 1GB of GDDR5 memory.
The other main difference between the GTX 670/680 and the GTX 660 Ti, GTX 660 and GTX 650/Ti is the memory bus: the 660 bus is much narrower at 192-bit, cut down from 256-bit, and the GTX 650 Ti is narrower still at 128-bit. One main difference between the GTX 650/Ti and the higher GTXes is that the memory speed is lower at a 5400MHz data rate and that there is no GPU Boost.
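As a quick sanity check on the figures above, the core count, bus width and theoretical peak memory bandwidth can be worked out from the quoted numbers. This is only a sketch: the function names are our own for illustration, and the bandwidth is the usual theoretical peak (bus width in bytes times data rate), not a measured result.

```python
# Figures quoted in the text for the GeForce GTX 650 Ti.
SMX_UNITS = 4
CUDA_CORES_PER_SMX = 192
MEMORY_CONTROLLERS = 2
BITS_PER_CONTROLLER = 64
DATA_RATE_MHZ = 5400  # effective GDDR5 data rate


def cuda_cores(smx_units, cores_per_smx):
    """Total CUDA cores across all SMX units."""
    return smx_units * cores_per_smx


def bus_width_bits(controllers, bits_per_controller):
    """Total memory bus width in bits."""
    return controllers * bits_per_controller


def peak_bandwidth_gb_s(bus_bits, data_rate_mhz):
    """Theoretical peak bandwidth: bytes per transfer times transfers per second."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * data_rate_mhz / 1000  # MB/s -> GB/s


cores = cuda_cores(SMX_UNITS, CUDA_CORES_PER_SMX)
bus = bus_width_bits(MEMORY_CONTROLLERS, BITS_PER_CONTROLLER)
bandwidth = peak_bandwidth_gb_s(bus, DATA_RATE_MHZ)

print(f"{cores} CUDA cores, {bus}-bit bus, {bandwidth:.1f} GB/s peak")
```

Real-world throughput will be lower than the theoretical peak, but the figure is useful for comparing cards with different bus widths and memory speeds.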
From top to bottom, the GeForce lineup now consists of:
- GeForce GTX 690
- GeForce GTX 680
- GeForce GTX 670
- GeForce GTX 660 Ti
- GeForce GTX 660
- GeForce GTX 650 Ti
- GeForce GTX 650
- GeForce GT 640
- GeForce GT 630
- GeForce GT 620
- GeForce GT 610
- GeForce 210
Under load, the GeForce GTX 650 Ti typically draws around 80W of power in most gaming apps. There is no adjustment and no power target slider available. At maximum power, the GTX 650 Ti will draw around 110W in non-TDP apps. There is no GPU Boost available on the GTX 650 or 650 Ti.
This is a very brief overview of Kepler architecture as presented to the press at Kepler Editor’s Day in San Francisco a few months ago. We also attended Nvidia’s GPU Technology Conference (GTC) and you can find a lot more details about the architecture in our GTC 2012 report.
Adaptive VSync
Traditional VSync is great for eliminating tearing until the frame rate drops below the target – then there is a severe drop from usually 60 fps down to 30 fps if it cannot meet exactly 60. When that happens, there is a noticeable stutter.
Nvidia’s solution is to dynamically adjust VSync – to turn it on and off instantaneously. In this way VSync continues to prevent tearing, but when the frame rate drops below 60 fps, the driver shuts off VSync to reduce stuttering instead of drastically dropping frame rates from 60 to 30 fps or even lower. When the minimum target is again met, VSync kicks back in. In gaming, you never notice Adaptive VSync is happening; you just notice less stutter (especially in demanding games).
Adaptive VSync is a good solution that works well in practice. We spent more time with Adaptive VSync by playing games and it is very helpful although we never use it when benching.
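The 60-to-30 fps drop described above can be illustrated with a small model. This is a simplified sketch, not Nvidia's driver logic: it assumes double buffering, a steady render rate and a 60Hz display, and the function names are hypothetical. With traditional VSync each frame waits for the next refresh, so the displayed rate snaps to 60, 30, 20 and so on; the adaptive variant simply lets the frame rate float once it falls below the refresh rate.

```python
import math


def vsync_fps(render_fps, refresh_hz=60):
    """Displayed frame rate with traditional double-buffered VSync.

    Each frame is held until the next refresh, so a frame that takes
    slightly longer than one refresh interval occupies two of them.
    """
    if render_fps >= refresh_hz:
        return float(refresh_hz)
    intervals_per_frame = math.ceil(refresh_hz / render_fps)
    return refresh_hz / intervals_per_frame


def adaptive_vsync_fps(render_fps, refresh_hz=60):
    """Displayed frame rate when VSync is switched off below the refresh rate."""
    if render_fps >= refresh_hz:
        return float(refresh_hz)  # VSync on: capped at the refresh rate
    return float(render_fps)      # VSync off: frame rate floats freely


for fps in (75, 59, 45, 29):
    print(fps, vsync_fps(fps), adaptive_vsync_fps(fps))
```

In this model a card rendering at 59 fps is shown at only 30 fps with traditional VSync, while the adaptive approach shows the full 59 fps at the cost of possible tearing below 60.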
FXAA & TXAA
TXAA
There is a need for new kinds of anti-aliasing as many of the modern engines use deferred lighting, which suffers a heavy performance penalty when traditional MSAA is applied. The alternative, to have jaggies, is unacceptable. TXAA – Temporal Anti-Aliasing is a mix of hardware multi-sampling with a custom high quality AA resolve that uses temporal components (samples that are gathered over micro-seconds are compared to give a better AA solution). Its main advantage is that it reduces shimmering and texture crawling when the camera is in motion.
There is TXAA 1, which exacts a performance cost similar to 2xMSAA and which under ideal circumstances gives similar results to 8xMSAA. Of course, from what little time we have spent with it, it appears to be not quite as consistent as MSAA but works well in areas of high contrast. TXAA 2 is supposed to have a similar performance penalty to 4xMSAA but with higher quality than 8xMSAA.
TXAA was the subject of a short IQ analysis of The Secret World – the first game to use it. So far, it appears to be a great option for situations where MSAA doesn’t work efficiently and it almost completely eliminates shimmering and texture crawling when the camera is in motion. It works particularly well for The Secret World as the slight blur gives the game a cinematic look.
FXAA
Nvidia has already implemented FXAA – Fast Approximate Anti-Aliasing. In practice, it works well in some games (Duke Nukem Forever/Max Payne 3), while in other games text or other visuals may be a bit blurry. FXAA is a great option to have when MSAA kills performance. We plan to devote an entire evaluation to comparing IQ between the HD 7000 series and the GTX 600 series as well as comparisons with the older series video cards.
Specifications
Here are Nvidia’s specifications for the reference GTX 650 Ti:
As discussed, the GTX 650 Ti is very similar to the GTX 660 but with fewer CUDA cores. The GeForce GTX 650 Ti was also designed from the ground up to deliver exceptional tessellation performance. Tessellation is the key component of Microsoft’s DirectX 11 development platform for PC games.
Tessellation allows game developers to take advantage of the GeForce GTX 650 Ti’s tessellation ability to increase the geometric complexity of models and characters and deliver far more realistic and visually rich gaming environments. Needless to say, the new GTX 650 Ti brings a lot of features to the table that Nvidia’s current customers will appreciate, including improved CUDA PhysX; 2D and 3D Surround plus the ability to drive up to 3 LCDs from a single GTX 650 Ti (plus a 4th accessory display with some models); superb tessellation capabilities; and a really fast and power-efficient GPU in comparison to their previous GTS 450 and GTX 550 Ti.
3-panel Surround (plus an Accessory display with some GTX 650 Ti models) from a single card
One of the criticisms of Fermi that Kepler has addressed was that two video cards in SLI were required to run 3-panel Surround or 3D Vision Surround. From a single card, the GTX 670, GTX 680, the GTX 660/660 Ti and now the GTX 650 Ti can run three displays (plus an accessory display with some models). Interestingly, Nvidia has changed their taskbar from the left side to the center screen. We now prefer the taskbar in the center; it might be more convenient for some users rather than clicking all the way over to the left for the start menu as with Eyefinity.
One thing that we did notice: Surround and 3D Vision Surround are now just as easy to configure as AMD’s Eyefinity. And AMD has no real answer to 3D Vision or 3D Vision Surround – HD3D lacks basic support in comparison.
One new option with the GTX 650/650 Ti/660/660 Ti/670/680/690 is in the bezel corrections. In the past, the in-game menus would get occluded by the bezels and it was annoying if you use the correction. Now with Bezel Peek, you can use hotkeys to instantly see the menus hidden by the bezel. However, this editor does not ever use bezel correction in gaming.
One thing that we are still noting – Surround suffers from less tearing than Eyefinity, although AMD appears to be working on a solution with their latest drivers. The only true solution to tearing in Eyefinity is to have all native DisplayPort displays or opt for the much more expensive active adapters. And you will need two adapters for most HD 7770s to run Eyefinity (below right), whereas you only need one for Surround with the GTX 650 Ti, GTX 660, GTX 660 Ti, GTX 670 and the GTX 680.
Nvidia also claims a faster experience with the custom resolutions because of a faster center display acceleration.
A look at the GTX 650 Ti
The GTX 650 Ti is also on a short PCB at 5.65″, especially compared to the GTX 680.
Display outputs include two dual-link DVIs and one HDMI (and one DisplayPort connector for some models). One 6-pin PCIe power connector is required for operation. If a user fails to connect the power connector properly, a brief message is displayed at boot-up instructing them to plug in the power connector.
SLI
The GTX 650 Ti is not set up for SLI at all.
Super-Widescreen 5760×1080, Surround, 3D Vision Surround, and PhysX
The GTX 650 Ti is set up exactly the same way as the more expensive GTX 660/660 Ti, GTX 670 and GTX 680. Since the GTX 650 Ti is considerably slower than the GTX 670 overall, one can reasonably expect performance to be much lower for super-widescreen resolutions as well as for Surround, 3D Vision Surround and for PhysX than in our last evaluation of the GTX 670 in May.
For 3D Vision and for Surround, many games need to have their settings reduced. Just remember that you are playing across three screens and are also rendering each scene twice for 3D Vision! And although turning on PhysX on a GTX 650 Ti affects the frame rate, the card still has enough performance to play the game with fully maxed out details and FXAA or AAA, compared to the GTX 550 Ti that it replaces.
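To put the extra load of three screens and stereo rendering in perspective, here is a rough pixel-count comparison. It is only a sketch: the helper name is our own, and raw pixels per frame are a crude proxy for rendering cost, since real performance does not scale linearly with resolution.

```python
def pixels_per_frame(width, height, panels=1, stereo=False):
    """Pixels rendered per displayed frame.

    Stereo 3D renders every scene twice, once for each eye.
    """
    views = 2 if stereo else 1
    return width * height * panels * views


single = pixels_per_frame(1920, 1080)
surround = pixels_per_frame(1920, 1080, panels=3)
surround_3d = pixels_per_frame(1920, 1080, panels=3, stereo=True)

print(f"Single 1080p:       {single:>10,}")
print(f"Surround 5760x1080: {surround:>10,} ({surround // single}x)")
print(f"3D Vision Surround: {surround_3d:>10,} ({surround_3d // single}x)")
```

By this measure, 3D Vision Surround asks the card to fill six times as many pixels as a single 1080p display, which is why reduced settings are usually needed.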
Overclocking
Our GTX 650 Ti edition is reference clocked at 925/1350MHz. We were able to overclock a further +175MHz with complete stability even though we did not adjust the voltage or our fan profile. We also managed +250MHz on the memory clocks and could have gone higher, although performance did not scale further.
Even with the further overclock, temperatures generally stayed below 60C and the fan rarely exceeded 40%. The reference GTX 650 Ti is a very quiet card and it could easily be used for a home theater PC (HTPC).
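For reference, the overclock can be expressed as resulting clocks and percentage gains. This sketch assumes the offsets are added directly to the quoted 925/1350MHz reference clocks; overclocking tools may report memory clocks on a different scale, so treat the memory figure in particular as an assumption.

```python
def overclock(base_mhz, offset_mhz):
    """Return the resulting clock in MHz and the percentage increase."""
    new_clock = base_mhz + offset_mhz
    percent_gain = 100 * offset_mhz / base_mhz
    return new_clock, percent_gain


core_clock, core_gain = overclock(925, 175)
memory_clock, memory_gain = overclock(1350, 250)

print(f"Core:   {core_clock}MHz (+{core_gain:.1f}%)")
print(f"Memory: {memory_clock}MHz (+{memory_gain:.1f}%)")
```

Under those assumptions, both the core and the memory end up a little under 19% above the reference clocks.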
PhysX and Borderlands 2
We did not get to add Borderlands 2 to our benchmarking suite yet. It will be game benchmark number 25 in our next evaluation. However, we did spend some time playing the game at 1920×1080 with all details set to high including PhysX and FXAA with our new GTX 650 Ti. Playing with the GTX 650 Ti in the most demanding areas with mass action, explosions, debris and liquid PhysX effects taking place all at once, we averaged 29 fps in the Caustic Caverns and never dropped out of the 20s.
Check out the performance summary charts and particularly the overclocking charts to note how well the GTX 650 Ti scales. The specifications look good with solid improvement over the Fermi-based GTX 550 Ti. Let’s check out performance after checking out our test configuration on the next page.
This game bundle really sets off the 650 Ti. Anyone looking to buy Assassin’s Creed 3 should consider this card. Honestly, there are few who won’t be interested in this game. The 650 Ti could be had for only $89 if you take out the $60 for Assassin’s Creed 3. That’s a tremendous value. Even if you already have a great GPU, for $89 more you could have a dedicated PhysX card. That is a good deal.