nVidia GTX470 Bottleneck Investigation
Hardware
- Intel Core i5 750 (2.8 GHz, turbo on).
- 4 GB DDR3-1333 RAM (2×2 GB, dual-channel).
- Gigabyte GA-P55-UD3 (Intel P55 chipset, F6 BIOS).
- nVidia GeForce GTX470 (1.28 GB, reference clocks).
- Creative X-Fi XtremeMusic.
- 30” HP LP3065.
Software
- Windows 7 (64 bit).
- nVidia driver 258.96, high quality filtering, all optimizations off, LOD clamp enabled.
- DirectX June 2010.
- All games patched to their latest versions.
Settings
- 16xAF forced in the driver, vsync forced off in the driver.
- AA forced either through the driver or enabled in-game, whichever works better.
- Highest quality sound (stereo) used in all games.
- All results show an average framerate.
Thanks, BFG10K.. it’s nice to see which games are most affected by the bandwidth.
In AvP and Riddick:DA, the penalty for reducing the bandwidth is nearly identical to that of reducing the core/shader clock, which is rather revealing of where the bottleneck is. Only in Stalker:CoP is the bandwidth clearly sufficient, probably because shader performance is the absolute bottleneck there.
Reducing the core/shader clock by 20% while leaving the memory bandwidth at 100% stock should “loosen” the bandwidth bottleneck, which is most likely why the bandwidth cut shows a slightly smaller penalty than the core cut. Say, for example, you have a 1GHz core and 1GHz memory. If you drop the core to 800MHz, you have a surplus of memory bandwidth. If you leave the core at 1GHz but drop the memory to 800MHz, the performance hit is almost identical to cutting the actual “work” rate by 20%. Plus, if bandwidth were plentiful, dropping the core by 20% should theoretically have resulted in a full 20% reduction.
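To put a rough number on that reasoning, here is a minimal Python sketch. It divides the fraction of framerate lost by the fraction of clock removed: about 1.0 means the game is fully bound by that clock, and about 0.0 means that clock has headroom to spare. The function name and the framerates in the example are placeholders I made up for illustration, not figures from the article.

```python
def scaling_sensitivity(base_fps, reduced_fps, clock_ratio):
    """Return the share of a clock cut that shows up as lost framerate.

    base_fps    -- average framerate at stock clocks
    reduced_fps -- average framerate with one clock lowered
    clock_ratio -- lowered clock divided by stock clock (0.8 for a 20% cut)
    """
    if base_fps <= 0:
        raise ValueError("base_fps must be positive")
    if not 0 < clock_ratio < 1:
        raise ValueError("clock_ratio must be between 0 and 1 (exclusive)")
    fps_loss = 1.0 - reduced_fps / base_fps   # fraction of framerate lost
    clock_cut = 1.0 - clock_ratio             # fraction of clock removed
    return fps_loss / clock_cut


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT results from the article.
    stock_fps = 50.0
    core_cut_fps = 42.0   # hypothetical: core/shader at 80%, memory at stock
    mem_cut_fps = 43.0    # hypothetical: memory at 80%, core/shader at stock

    core = scaling_sensitivity(stock_fps, core_cut_fps, 0.8)
    mem = scaling_sensitivity(stock_fps, mem_cut_fps, 0.8)
    print(f"core/shader sensitivity: {core:.2f}")
    print(f"memory sensitivity:      {mem:.2f}")
```

If the two values come out close to each other, as the AvP and Riddick:DA results suggest, bandwidth is holding the card back about as much as the core is. A core value well below 1.0 points to a limit somewhere else.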
Thanks again for the data.. yummy!
Yet in Stalker:CoP there appears to be an optimization problem somewhere in the architecture, drivers, etc., since the game does not scale very well with increased GPU speed. Could it be poorly optimized in how it uses system memory or the CPU, or is it something else?
AvP and Riddick only get close at 4xAA, and that’s expected since 4xAA loads the memory more. In some ways using higher AA modes makes the test more “synthetic”.
Stalker could be running out of VRAM capacity. If I ever pull the trigger on a GTX480, I can see if the extra memory makes a difference there.
Thanks for the test! What about “Core 2” and “Core i5/i7” CPU scaling with the Fermi? Pleeeeeeease 😀
Matrixfan:
i5 750 tested with 2 vs 4 cores: http://alienbabeltech.com/main/?p=19601
I don’t have a Core 2 anymore, but is there something else you’d like to see from my i5 750 + GTX470?