It's interesting that the 7970 really seems to shine against the GTX 580 at 2560x1600 (probably thanks to its stellar memory bandwidth) and at Eyefinity resolutions with FXAA. It's a bit unexpected to see so many review sites using FXAA in their benchmarks this soon! But when 8x MSAA is used, performance plummets to levels closer to the 580, probably because there are only 32 ROPs.
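For a rough sanity check on the bandwidth vs. ROP theory, here's a back-of-envelope sketch of the theoretical peaks. The 7970's 384-bit bus and 32 ROPs are the specs discussed in this post; the core clocks, memory data rates, and the 580's bus width and ROP count are launch figures as I recall them, so treat those as assumptions rather than verified numbers.

```python
# Back-of-envelope theoretical peaks. Clocks and data rates are assumed
# launch specs, not measured numbers.

def mem_bandwidth_gbs(bus_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_bits / 8 * data_rate_gbps

def pixel_fillrate_gpix(rops, core_mhz):
    """Peak pixel fillrate in Gpixels/s, assuming one pixel per ROP per clock."""
    return rops * core_mhz / 1000

cards = {
    # name: (bus width in bits, effective Gbps per pin, ROPs, core MHz)
    "HD 7970": (384, 5.5, 32, 925),     # clock and data rate assumed
    "GTX 580": (384, 4.008, 48, 772),   # all four values assumed
}

for name, (bus, rate, rops, mhz) in cards.items():
    print(f"{name}: {mem_bandwidth_gbs(bus, rate):.1f} GB/s, "
          f"{pixel_fillrate_gpix(rops, mhz):.1f} Gpix/s")
```

On those assumed numbers the 7970 would have roughly 37% more bandwidth but roughly 20% less raw pixel fillrate than the 580, which at least fits the pattern of it pulling ahead at 2560x1600 and falling back under 8x MSAA.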
What I'm a bit disappointed in is that FP64 performance remains at only 1/4 of the FP32 performance, same as on AMD's VLIW4 cards. I was kinda expecting GCN to manage at least a 1:2 ratio, like the Fermi-based "unlocked" Tesla cards from NV.
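To put the FP64 ratio in numbers, here's a quick sketch. The 2048 shaders come from the spec sheet; the 925 MHz clock and the usual "2 FLOPs per shader per clock" convention for peak figures are my assumptions.

```python
def peak_gflops(shaders, core_mhz, fp64_ratio=1.0):
    """Peak GFLOPS assuming 2 FLOPs (one FMA) per shader per clock."""
    return shaders * 2 * core_mhz / 1000 * fp64_ratio

SHADERS = 2048    # from the 7970 spec sheet
CORE_MHZ = 925    # assumed launch clock

fp32 = peak_gflops(SHADERS, CORE_MHZ)
print(f"FP32 peak: {fp32:.0f} GFLOPS")
for label, ratio in [("1:4 (as shipped)", 1 / 4), ("1:2 (hoped for)", 1 / 2)]:
    print(f"FP64 at {label}: {peak_gflops(SHADERS, CORE_MHZ, ratio):.0f} GFLOPS")
```

So under these assumptions a 1:2 ratio would have meant close to 1.9 TFLOPS of double precision instead of a bit under 1 TFLOPS.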
Has anybody run the numbers on the 4.3 billion transistors, using all of the published stats, to logically deduce whether parts of the 28nm chip could have been disabled? A 384-bit bus shouldn't take that much die space at all, plus there are still only 32 ROPs. The shader count only went up by 33%, but I'd assume that the GCN arch takes about 15-20% more transistors per shader (with a resulting performance boost of around 10% per shader). Who knows how many more transistors are needed for the new DX11.1 features? The doubling of L2 cache does seem to take up some space, but heck, it's still ONLY a measly 768K (compare that against the 12+MB on CPUs that use far fewer transistors). 2048 shaders and 128 TMUs look like nice round numbers, so it's a bit hard for me to imagine what the unlocked shader count would be if there were more... perhaps it's the ROPs (really 48, advertised as only 32), since they take up a LARGE amount of die space/transistors? I'm afraid that AMD could be playing us once again like they did with the Bulldozer transistor count (first announcing 2B, then a few months after launch admitting the "error" and correcting it down to 1.2B). Especially as the HD 5830 and 6790 show some "impossible" stats, and the Barts series claims VLIW5 while behaving like VLIW4, I'm taking a healthy dose of skepticism this time around.
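Here's a crude way to put those guesses side by side. The 4.3B figure, the 33% shader increase, the 15-20% per-shader overhead and the 10% per-shader performance gain are the numbers from this post; the 1536 shaders and 2.64B transistors for Cayman are figures as I remember them, so they are assumptions. It only compares growth factors and says nothing about how much of either die is actually shader array.

```python
CAYMAN_SHADERS = 1536          # assumed HD 6970 shader count (2048 is 33% more)
TAHITI_SHADERS = 2048
CAYMAN_TRANSISTORS = 2.64e9    # assumed published figure for Cayman
TAHITI_TRANSISTORS = 4.3e9     # figure quoted in this post

shader_growth = TAHITI_SHADERS / CAYMAN_SHADERS
chip_growth = TAHITI_TRANSISTORS / CAYMAN_TRANSISTORS

def shader_array_growth(per_shader_overhead):
    """Relative transistor budget of the shader array vs. Cayman (a guess)."""
    return shader_growth * (1 + per_shader_overhead)

low, high = shader_array_growth(0.15), shader_array_growth(0.20)
perf = shader_growth * 1.10    # guessed ~10% more performance per shader

print(f"shader count:       x{shader_growth:.2f}")
print(f"shader array cost:  x{low:.2f} to x{high:.2f}")
print(f"shader performance: x{perf:.2f}")
print(f"whole chip:         x{chip_growth:.2f}")
```

If those guesses are anywhere near right, the whole chip grew only slightly more than the shader array alone would have. That leaves a little room for the wider bus, the bigger L2 and the DX11.1 bits, but by itself it neither proves nor rules out hidden units.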
Perhaps behardware.com will run some synthetic tests on bandwidth and pixel fillrate to see if the specs hold true, like they did for the 5830 and then the 6790: http://www.behardware.com/articles/827- ... -6790.html
As AMD has decoupled the ROPs from the memory controllers in its GPUs it can deactivate ROPs without affecting the memory controllers. The ROPs do however take up a lot of memory bandwidth and also represent an important pathway for this memory. Deactivating half of them therefore does have an impact on the GPU’s capacity to exploit fully the memory bandwidth made available by the 256 bit Barts bus.
Maybe we'll see an unlocked 48-ROP part soon that lets the 384-bit bus be used to its fullest? My gut says no. More likely we'd just see a better-binned refresh, like the RV790XT (4890) was for the RV770XT.
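The point about ROPs being a pathway for memory bandwidth can be roughed out with a toy model: each ROP writes one 32-bit pixel per clock, blending doubles the traffic (read plus write), and there is no compression, Z traffic or texture traffic competing for the bus. The 925 MHz clock and 5.5 Gbps data rate are assumed launch figures, and the helper name is just for illustration.

```python
def rop_write_gbs(rops, core_mhz, bytes_per_pixel=4):
    """Peak color write traffic in GB/s: one pixel per ROP per clock, no compression."""
    return rops * core_mhz * bytes_per_pixel / 1000

BANDWIDTH_GBS = 384 / 8 * 5.5    # 384-bit bus at an assumed 5.5 Gbps per pin
CORE_MHZ = 925                   # assumed launch clock

for rops in (32, 48):
    write = rop_write_gbs(rops, CORE_MHZ)
    blend = 2 * write            # blending reads the destination, then writes it
    print(f"{rops} ROPs: write {write:.1f} GB/s, blend {blend:.1f} GB/s "
          f"of {BANDWIDTH_GBS:.1f} GB/s available")
```

In this toy model 32 ROPs can't saturate the bus with plain 32-bit writes but get fairly close when blending, while 48 ROPs would run out of bandwidth when blending. Real hardware is messier, so take it as a rough illustration only.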