Introducing “the World’s Fastest Graphics card”, AMD’s flagship HD 6990
Today we are introducing the HD 6990, AMD’s flagship video card, along with their bold claim of “the World’s Fastest Graphics card”. It is a beast of a dual-GPU video card that, for the first time, breaks the 300W PCIe 2.0 specification by as much as 150 watts to give you a no-holds-barred competitor that is designed to crush Nvidia’s flagship GTX 580. It is no budget solution, as it retails for $699.
AMD Graphics and Nvidia are locked in a perpetual battle to one-up each other in what can only be described as a “graphics war”. Nvidia had issues introducing their Fermi DX11 architecture and video cards, and AMD beat them to market by over six months with the first DX11 video cards. In April of last year, Nvidia launched their GTX 470 and GTX 480, which were criticized for being hot-running, power-hungry and loud, although they offered somewhat higher performance than AMD’s HD 5870 and HD 5850. It appears that AMD actually believed that Fermi was unfixable.
However, a few months later, Nvidia’s midrange GTX 460 turned out to be a very successful reworking of GF100 Fermi into GF104 that scaled well, ran cool, overclocked well and no doubt ate into AMD’s then-90% share of the DX11 market. To combat the GTX 460, AMD released their HD 6000 series codenamed “Barts”, with the HD 6870 and HD 6850 debuting this past October. This is not AMD’s high end, codenamed “Cayman”, which arrived at the end of last year, but rather their upper-midrange, which was renamed from the HD 58×0 series and is designed to take on and surpass the GTX 460 and all of its variants.
Now we see that the new HD 6970 and HD 6950 are designed to combat Nvidia’s reworked GF110, which has debuted since the HD 68×0 launch as the thermally tamed and quiet-running GTX 580 and GTX 570 video cards that significantly surpass the HD 58×0 series’ performance. And of course Nvidia has upped the GTX 460 ante by releasing their GTX 560 Ti a couple of months ago.
Today we see ‘Antilles’, the HD 6990, AMD’s flagship dual-GPU card, replacing their recently EoL’d HD 5970. This card has been designed to completely crush the GTX 580’s performance. It is physically a long card, the same size as the card it is replacing. It comes in a metal case packed with three adapters valued at sixty dollars. Upon opening the lid you are greeted with, “Rise to Power with the AMD Radeon HD 6990, the fastest graphics card in the world.”
ABT was represented by this editor at AMD’s Press Day at the famous LA Exchange in October and saw AMD’s vision unfold further for us. Since we are going to focus on the HD 6990’s performance in 29 games, we will only give you the barest outline of their 5-hour presentation, which covered the “Barts”, “Cayman” and today’s “Antilles” graphics cards. We do see that their choice of downtown Los Angeles is symbolic of their increasing commitment to the movie industry, and they have partnered up with several Hollywood movie studios to increase productivity by using AMD hardware and know-how. They also used the presentation to introduce their support for 3D in PC gaming and 3D for video playback.
AMD’s Press Event was called “Believe Your Eyes” and they laid out their vision for the world’s press. AMD feels that Fusion is uniquely suited to conquer the world and they stress the “firsts” they have accomplished, including being first to bring DX11 GPUs to market very quickly and successfully. They are quite proud of their marketshare and do not intend to allow Nvidia to easily make inroads.
AMD points out the advantages of their Eyefinity which now allows more displays to be driven off of a single card – up to six displays now with a hub adapter – rather than with Nvidia’s competing Surround solutions which require two similar video cards running in SLI to power 3 displays. And with this new card, 5×1 Eyefinity becomes possible for super-widescreen gaming.
A rose by any other name …
Today, AMD Graphics is proud to introduce an improved version of their dual-graphics card, HD 5970, with better performance, but still on the same 40 nm process as the 5000 and 6000 series. With the release of the HD 6990, AMD still holds on to the title of fastest video card.
Here is our interview with Stanley Ossias of AMD. He gives an insight into their strategy but it does appear that they did not expect the GTX 580 to so successfully address its issues, nor did they appear to expect GTX 570 before their own Cayman launch. The HD 6990 is clearly designed to get the performance crown back for AMD at all costs.
However, AMD is not abandoning their strategy of “aiming for the sweet spot”. The HD 6000 series now supports a more flexible form of Eyefinity than the 5000 series, and it is also going to support 3D PC gaming and 3D video playback. What may be confusing to many is that HD 6870 and HD 6850 are slower than HD 5870 and HD 5850 respectively and yet are replacing them. AMD’s goal is to give more gamers the ability to experience HD 58×0-type performance in a less expensive, smaller and even less power-hungry video card.
On the other hand, the HD 6970 is AMD’s fastest single-GPU card and the true successor to the HD 5870. In many ways, a picture is worth 1,000 words and here is AMD’s continuation of their video card strategy with the addition of the new HD 6990 at the very top:
We still see changes only to the upper midrange and at the high end. AMD is continuing the HD 5700 series unchanged for now, as they evidently feel unchallenged by Nvidia’s GTS 450 which we reviewed here against HD 5750. In the above chart, we see the HD 5800 series diverge into 3 streams – the “Antilles” reviewed today, an “X2” video card at the highest end as a successor to the current dual-GPU HD 5970; the “Cayman” as HD 6970 and HD 6950, which uses AMD’s fastest single GPU to succeed and surpass the Cypress HD 5870.
Here is the HD 6990 in its most modest incarnation – stock-clocked from the factory.
With the recent HD 6970/6950 Cayman launch, AMD was originally directly targeting Nvidia’s just-discontinued GTX 480. Things move very quickly in the graphics card world. So now, the HD 6970 (SEP $369.00) has taken on the GTX 480 and the brand new GTX 570. In fact you can find the HD 6970 at NewEgg for about $320 each – a pair for CrossFire is about $50 cheaper than a single HD 6990. Below is AMD’s opinion of the way the graphics market stood just before they released the HD 6990 to replace the HD 5970.
What’s new in HD 6990?
Since “seeing is believing” is AMD’s theme for the 6000 series launch and it is all about the 3 “eyes”, we shall briefly cover them here:
Under Eyedefinition, we see a further subdivision with more efficient tessellation; there is mention of a tweaked Barts engine, offering up to 2x the tessellation performance of the HD 58×0 GPUs – an area where AMD was perceived as weak in comparison to Nvidia’s Fermi GPUs in heavily tessellated benchmarks and games. With Cayman we see the potential for a further increase in geometry performance and we also see mention of an enhanced architecture for efficiently using GPU compute and for improved performance in games. We also note improvements in Anisotropic Filtering (AF) and new Anti-Aliasing modes – morphological AA and EQAA.
Except for the HD 6990, the HD 69×0 series features 2x DVI ports and one HDMI port plus two mini-DisplayPorts which are DP version 1.2. This is important because of the new Eyefinity features that now allow for daisy chaining of displays and for using a new hub, much like using a USB hub, to output to six displays from a single card!
With just one DVI port, the three included adapters take on some importance.
Now we see the possibilities of using HD 6990 for the brand new super-widescreen 5×1 Eyefinity:
Below we see 5×1 Eyefinity in portrait mode. Thin bezels are a real advantage.
Eyespeed refers to GPU compute and to AMD’s “open initiative” approach to everything, and especially to OpenCL, in contrast to Nvidia’s use of their own proprietary GPU language, CUDA. We see AMD partnering with Cyberlink, Arcsoft, Viewdle, Adobe, Microsoft and more companies (some of which are also Nvidia’s partners) to bring you, the end consumer, quality video processing and playback; and of course, UVD 3 accelerated decoding for 3D Blu-ray playback.
In our HD 68×0 launch article, we evaluated AMD’s claim of 35% better performance per mm² over HD 58×0 and found that the HD 6870 is about equal in performance to HD 5850 overall. Not much has changed architecturally from Cypress except that Barts has up to 2x the performance of the tessellator in the HD 58×0 GPU. Here is the Barts GPU from AMD’s own presentation slide.
We were able to confirm that tessellation was superior in Barts over Cypress in tessellation-heavy games and engines; in Lost Planet 2 and in Unigine’s Heaven, we saw the weaker-performing HD 6870 beat the generally faster HD 5870. And now we see that Cayman has been further improved over Barts and Cypress.
We note that there are dual graphics engines and that the Radeon’s VLIW5 core architecture has been replaced by the more efficient VLIW4, using 24 SIMD engines and 96 texture units in the HD 6970. Overall efficiency will be improved over both Barts and Cypress.
There is a complete core redesign with VLIW4 thread processing that is more efficient than the previous VLIW5 design.
The Render Back-Ends have necessarily been upgraded and AMD has enhanced the GPU Compute as described below.
To see what is new, we note that the UVD engine has been updated; HDMI 1.4a is available for 3D Blu-ray and we see an improved Tessellator Engine. AMD now uses a second Ultra Threaded Dispatch Processor and improved engine logic. We have noted in previous reviews that Nvidia’s Fermi GPUs are faster in heavily tessellated scenes than competitive AMD Cypress GPUs. Well, now AMD claims a solid tessellation improvement over Cypress and the HD 58×0 series and calls their method “tessellating the right way”.
In the case of Barts, it was supposed to be twice as efficient as the Cypress HD 5870 and now Cayman’s HD 6970 is supposed to be three times as efficient. And now we have two Cayman HD 6970 GPUs inside one HD 6990. Here is AMD’s slide of their own internal testing.
Of course, we have to test this out to see what it means in a practical way for us gamers.
Morphological Adaptive AA
AMD’s new morphological anti-aliasing technique works as a post process effect. In other words, the GPU finishes rendering each frame as usual – but before presenting it to the display, it runs it through another shader pass to perform the filtering. This differs from traditional multi-sample and super-sample AA techniques where the filtering occurs during the rendering of each frame. In fact, this technique can eliminate aliasing for still images, though it’s intended to work better when in motion.
The filter works by first detecting high contrast edges with various pixel-sized patterns that are normally associated with aliasing, and assumes they should actually be straight lines that are not aligned to pixel edges. It then estimates the length and angle of the ideal line for each edge, and determines the proportional coverage by the lighter and darker color for each pixel along the edge. Finally it uses this coverage information to blend the colors for each pixel. All of this is actually being accomplished by the Catalyst drivers through a DirectCompute shader while the Local Data Share is used to keep adjacent pixels in memory for a low overall overhead. It will be interesting to see if AMD chooses to extend this morphological adaptive AA to the 5000 series as there is no reason it cannot be done, except perhaps to differentiate HD 6000 series from the current one.
AMD’s diagrams (below) should help to illustrate how this is accomplished.
Since the edge detection step requires frequent sampling and re-sampling of adjacent pixel colors, it offers a lot of opportunities for data re-use by using the LDS (Local Data Share) hardware to avoid redundant data fetches and to significantly improve performance. AMD sent us a driver very late in our testing and we are unable to evaluate it as yet. We simply cannot comment on what we have not yet evaluated.
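To make the filter stages described above more concrete, here is a deliberately simplified sketch in Python. It is not AMD’s DirectCompute shader, which we have not seen; it only mirrors the three stages (edge detection, coverage estimation, blending) on two grayscale rows. The function names, the 0.5 contrast threshold and the assumption that the ideal line rises exactly one pixel across each detected run toward a step at the run’s right end are all ours, chosen purely for illustration.

```python
def detect_edges(top_row, bottom_row, threshold=0.5):
    """Stage 1: flag columns where the two rows differ by more than `threshold`."""
    return [abs(t - b) > threshold for t, b in zip(top_row, bottom_row)]


def edge_runs(flags):
    """Return (start, length) for each run of consecutive flagged columns."""
    runs, start = [], None
    for x, flag in enumerate(flags + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = x
        elif not flag and start is not None:
            runs.append((start, x - start))
            start = None
    return runs


def coverage_ramp(length):
    """Stage 2: area of each pixel lying under a line that rises one pixel
    over `length` pixels (our simplifying assumption about the ideal edge)."""
    return [(i + 0.5) / length for i in range(length)]


def blend_run(top_row, bottom_row, start, length):
    """Stage 3: blend the top row toward the bottom row by the coverage."""
    out = top_row[:]
    for i, cover in enumerate(coverage_ramp(length)):
        x = start + i
        out[x] = (1.0 - cover) * top_row[x] + cover * bottom_row[x]
    return out


# A light row (1.0) that steps down to dark (0.0) above an all-dark row.
top = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
bottom = [0.0] * 6
for start, length in edge_runs(detect_edges(top, bottom)):
    top = blend_run(top, bottom, start, length)
print(top)
```

In this toy case the hard step in the top row is replaced by a gradient that fades toward the step. A real implementation would also have to classify the pattern at each end of a run to decide which way the line slopes, which this sketch skips.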
Enhanced Quality Anti-Aliasing (EQAA)
EQAA is a new anti-aliasing option available on the AMD Radeon HD 6900 series. It offers enhanced quality over standard Multi-Sample Anti-Aliasing (MSAA) modes by doubling the number of coverage samples per pixel, while keeping the same number of color/depth/stencil samples. This technique offers advanced smoothing of aliased edges without requiring additional video memory, and with a minimal performance cost.
The new Enhanced Quality Anti-Aliasing modes can be enabled by selecting the 2xEQ, 4xEQ, or 8xEQ modes that have been added to the anti-aliasing slider in AMD Catalyst Control Center. EQAA is fully compatible with all other supported anti-aliasing techniques, including Adaptive AA, Super-Sample AA, Custom Filter AA (Edge-Detect), and Morphological AA. Selecting the Enhance Application Settings option from the drop-down box will cause applications that natively support MSAA modes to use equivalent EQAA modes instead.
Selecting the Override Application Settings option will force applications to use EQAA modes if they are selected on the slider; this setting will often work even if an application does not natively support antialiasing.
Anisotropic Filtering (AF)
With the HD 5000 series, AMD brought genuine angle-independent filtering to gaming by putting an end to angle-dependent deficiencies. The AMD Radeon HD 6900 series continues to support fully angle invariant anisotropic filtering, and incorporates further improvements in LOD precision relative to the ATI Radeon HD 5000 Series. These image quality benefits come with no additional performance cost and remain enabled at all Texture Filtering Quality settings.
However, our own Senior Editor BFG10K pointed out the flaws in AMD’s Anisotropic filtering with the transitions, here, here and here. AMD listened to us and the enthusiast community and they have improved the transitions between filter levels. Examining these improvements is beyond the scope of this performance evaluation, but rest assured that BFG10K will again provide the definitive answers in a future review right here at ABT.
AMD PowerTune™ Technology
AMD PowerTune is a new technology that attempts to deliver maximum performance within the TDP. It allows the GPU to be designed with higher engine clock speeds which can be applied to the broad set of applications that have thermal headroom. AMD PowerTune technology helps enable higher performance that is optimized to the thermal limits of the GPU by dynamically adjusting the engine clock during runtime based on an internally calculated GPU power assessment. AMD PowerTune technology also helps to improve the mechanism for dealing with applications that would otherwise exceed the GPU’s TDP. In other words, like Nvidia’s power limiter, it does not allow a power virus such as FurMark to exceed a predetermined limit.
AMD PowerTune allows the GPU to run within its TDP budget at higher clock speeds than otherwise possible by managing the engine clock speeds based on calculations which determine how close the GPU is to its TDP limit. In other words, it will not throttle the GPU as abruptly when the TDP is exceeded.
AMD’s PowerTune technology can be directly adjusted by the user using the AMD Catalyst Control Center, AMD Overdrive tab. PowerTune can be tweaked to more aggressively limit power and heat or be used by enthusiasts to squeeze every last bit of performance out of the Cayman GPU in overclocking.
AMD PowerTune technology dynamically adjusts the performance profile in real time to fit within the TDP envelope.
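The behavior described above can be pictured as a small feedback loop. The Python sketch below is only a toy model under our own assumptions: AMD’s internal power calculation is not public, so `estimate_power`, its `watts_per_mhz` constant, the 10 MHz step and the 500 MHz floor are all hypothetical. Only the 880 MHz clock, the 250 W board power and the +20% slider setting come from this article.

```python
def estimate_power(clock_mhz, activity, watts_per_mhz=0.3):
    """Hypothetical power model: scales with clock and workload activity (0..1)."""
    return clock_mhz * activity * watts_per_mhz


def next_clock(clock_mhz, activity, tdp_watts, adjust_pct=0,
               max_clock_mhz=880, min_clock_mhz=500, step_mhz=10):
    """Move the engine clock one small step toward the highest clock that
    keeps the estimated power inside the (user-adjusted) limit."""
    limit = tdp_watts * (1 + adjust_pct / 100)
    if estimate_power(clock_mhz, activity) > limit:
        return max(min_clock_mhz, clock_mhz - step_mhz)   # ease off gradually
    raised = min(max_clock_mhz, clock_mhz + step_mhz)
    if estimate_power(raised, activity) <= limit:
        return raised                                      # headroom: climb back
    return clock_mhz                                       # already at the edge


def run(activities, tdp_watts, adjust_pct=0, start_mhz=880):
    """Return the clock chosen after each workload sample."""
    clock, trace = start_mhz, []
    for activity in activities:
        clock = next_clock(clock, activity, tdp_watts, adjust_pct)
        trace.append(clock)
    return trace


print(run([1.0] * 6, tdp_watts=250))                 # a "power virus" style load
print(run([0.8] * 3, tdp_watts=250))                 # a typical game in this toy model
print(run([1.0] * 3, tdp_watts=250, adjust_pct=20))  # slider at +20%
```

In this toy model the heavy load is stepped down in small increments until the estimate fits the budget, the lighter load stays at full clock, and raising the limit by 20% lets the heavy load keep its full clock as well.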
Architectural improvements and Engineering a 450W video card
AMD met quite a challenge in putting two HD 6970 Cayman GPUs into a single package, and they pulled out all the stops, breaking the 300W PCIe 2.0 specification for the very first time in any reference card.
Dissipating 450W is difficult inside the HD 6990 shroud’s limited space and we have to say the results are quite noisy when the card spins up – well above the level of the GTX 480 reference design and the HD 4870-X2. Unfortunately it commands your attention while you are gaming and it is very hard to ignore. We will note that on the lower factory-clocked setting these noise issues are far less intrusive.
Like Cypress, all Barts and Cayman GPUs are produced with the 40 nm process. AMD’s reference Cayman Radeon HD 6970 has 1536 Stream Processors with its core operating at 880 MHz with 2GB of GDDR5 at 5.5 GHz (880/1375 MHz) on a 256-bit bus. There are 32 ROPs and 96 Texture units.
Now double this for the HD 6990, which also carries 4GB of fast GDDR5 vRAM for gaming at the highest Eyefinity resolutions. The difference is that the HD 6970’s memory is clocked at 1375MHz while the HD 6990’s is clocked at 1250MHz; when we ran HD 6990 plus HD 6970 in TriFire-X3, we clocked our core and memory at 880/1375MHz for all GPUs, which, incidentally, must be set separately in the Catalyst Control Center.
The HD 6970’s maximum load board power is 250 watts (and 190 watts “typical gaming power” with PowerTune – see explanation above), its idle is 20 watts, and it uses 6-pin plus 8-pin PCIe cables. The HD 6990 idles at 37W, uses two 8-pin connectors and will easily reach 450W with PowerTune set to maximum.
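For readers who want to see what the memory clock difference means, here is a quick Python calculation of peak memory bandwidth. It relies on the standard GDDR5 relationship of four data transfers per memory clock, which is how 1375MHz becomes the 5.5 GHz effective figure quoted above. The function name is our own, and we assume each HD 6990 GPU keeps the same 256-bit bus as the HD 6970.

```python
def gddr5_bandwidth_gb_s(memory_clock_mhz, bus_width_bits):
    """Peak bandwidth in GB/s for one GPU's GDDR5 memory interface."""
    effective_mt_s = memory_clock_mhz * 4            # GDDR5: 4 transfers per clock
    bytes_per_second = effective_mt_s * 1e6 * bus_width_bits / 8
    return bytes_per_second / 1e9


hd6970 = gddr5_bandwidth_gb_s(1375, 256)        # 5.5 GHz effective
hd6990_per_gpu = gddr5_bandwidth_gb_s(1250, 256)  # 5.0 GHz effective
print(f"HD 6970: {hd6970:.0f} GB/s, HD 6990: {hd6990_per_gpu:.0f} GB/s per GPU")
```

This works out to 176 GB/s for the HD 6970 and 160 GB/s for each GPU on the HD 6990 at stock memory clocks, a theoretical deficit of about 9% per GPU.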
Here is the specification chart for the HD 6990 and remember that these are the milder factory settings:
As we will see shortly, there is a switch for the HD 6990’s end user to increase the voltage and core clock, pushing power draw well beyond the already over-spec 375W to 450W!
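The 300W and 375W figures follow from the standard PCIe power ratings: 75W from the slot, 75W per 6-pin connector and 150W per 8-pin connector. The short Python sketch below just adds them up for the connector layouts mentioned in this article; the names are our own and it is arithmetic, not a measurement.

```python
SLOT_WATTS = 75                                  # PCIe x16 slot rating
CONNECTOR_WATTS = {"6-pin": 75, "8-pin": 150}    # auxiliary connector ratings


def rated_power(connectors):
    """Total rated power for a card with the given auxiliary connectors."""
    return SLOT_WATTS + sum(CONNECTOR_WATTS[c] for c in connectors)


hd6970_budget = rated_power(["6-pin", "8-pin"])   # 300 W
hd6990_budget = rated_power(["8-pin", "8-pin"])   # 375 W
overdraw = 450 - hd6990_budget                    # draw beyond the rated budget
print(hd6970_budget, hd6990_budget, overdraw)
```

So at 450W the HD 6990 is asking for 75W more than its slot and two 8-pin connectors are rated to deliver, and 150W more than the 300W ceiling for a single card.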
We put all of our Radeon cards through their paces this week with the very latest WHQL drivers – Catalyst 11-2 except for HD 6990 and HD 6970. Of course, we used the release drivers for the HD 6990 which are Catalyst 11-4 beta, and to make it fair, we used the same drivers for the HD 6970. This driver brought some good performance increases over the 11-2 WHQL drivers.
AMD has not forgotten the enthusiast community. Flashing a BIOS incorrectly can lead to despair, so AMD has enabled a toggle switch for a Dual BIOS option with a factory default BIOS that a user can always fall back on by flipping a switch. It is a nice feature that is also on the HD 6970 and HD 6950.
UPDATE: Be aware that if you do flip the switch, you are voiding your warranty as far as AMD is concerned. AMD clarified this to ABT only today (3/9/2011) as we had quite a different impression of this situation from their presentation to the press last week. If you are considering buying one of these cards, make sure to get in writing what each of AMD’s partners is going to do regarding their own warranties and using the factory-overclocked settings. We will make sure to keep you informed on our ABT forum. All future testing will be done with the HD 6990 in the locked position.
AMD has been using and refining their vapor chamber cooling for several years and the HD 6990’s cooler is designed to cool its two very hot-running GPUs. Imitation is the sincerest form of flattery and Nvidia has just begun to use vapor chamber technology; their own diagram demonstrates how it works with their newest GTXes.
Is AMD’s HD 6990 worth $50 or so more than HD 6970 Crossfire?
We naturally want to know if the new AMD HD 6990 card is worth $700 as we compare it to the $350 GTX 570 and the $500 GTX 580; after all it is twice the price of the GTX 570 and you can buy a pair of HD 6970s for CrossFire and keep $50 for yourself. On top of this, the CrossFired HD 6970 pair will be faster than HD 6990 in most cases although they will use more power.
The overclock on our HD 6990 can best be described as decent – from 830/1250 MHz to 850/1250MHz with its switch flip; to 960/1390 MHz as our highest stable overclock. It was not stable at 970 MHz and we had some issues with the unseasonably warm temperatures in our testing lab that approached 80F. We only used CCC to set our Radeon overclocks and we did not increase the core voltage nor change the fan profile and we tested in a very warm environment that could be described as “Summer-like”. PowerTune was set to its maximum, +20%.
Because of severe time constraints on this article, HD 6000 series CrossFire will be examined in depth in a further article as well as 3-panel Eyefinity (which can be driven off of a single Radeon) versus Nvidia’s competing Surround which requires SLI to make it work. We used our Intel Core i7-920 at 3.8 GHz for this evaluation with turbo on (one core will hit 3.99GHz) so there was little chance of significant CPU bottlenecking. Read on to see our test bed and the games we used.