Big GPU-Shootout, Part III: PCIe 1.0 vs. PCIe 2.0
INTRODUCTION:
Here is the third installment of our “Big GPU-Shootout”. Part I covered Cat 8.8 vs. Geforce 177.41 on Intel’s P35 motherboard platform as we examined the performance of five video cards. The new cards we tested were the HD4870-512MB, HD4870X2-2GB and GTX280, while the 8800GTX and 2900XT represented the top and mid-range cards of the last generation. The resolutions we test at for these reviews – 1920×1200 and 1680×1050 – are fairly demanding for any PC. We used a combination of ten benchmarks, including PC games’ built-in performance benchmarks, custom timedemos and synthetic tests, and we came to some interesting conclusions. We realized that last generation’s video cards are not sufficient for today’s maxed-out Vista DX10 gaming at 1920×1200, or even at 1680×1050 with 4xAA/16xAF, and that a modern gamer wanting to play the latest DX10 PC games needs to upgrade to the current generation of video cards. For the first article of our series, we compared an e4300 overclocked to 3.33 GHz with an e8600 at its stock 3.33 GHz and found the older CPU lacking. We continued with the e8600 for the rest of the series, later overclocking it to nearly 4.0 GHz for the next three reviews, and we always use the DX10 pathway in Vista 32 whenever possible. We also started benchmarking crossfireX-3 in Part I, which ran on fairly immature drivers, and we will continue to chart its progress.
Part II – Cat 8.9 vs. Geforce 178.13 – was also tested on our P35 motherboard (PCIe 1.0; crossfire 16x+4x) and demonstrated the need to overclock our e8600 CPU from its stock 3.33 GHz to nearly 4.0 GHz to take full advantage of our new video cards. We also set new benchmark results with these drivers, which we continue to use here in Part III to compare the two motherboard platforms’ performance. For Part II we added a couple more games and refined our testing slightly. We also noted that the ranking of the new video cards remained the same – 4870X2, GTX280 and 4870 – while crossfireX-3 gained more mature drivers over the previous Catalyst 8.8 set.
For Part III – PCIe 1.0 vs. PCIe 2.0 – we are again using Cat 8.9 vs. Geforce 178.13, as in Part II. This time, however, we are specifically comparing single video card and crossfireX-3 performance on our new X48 motherboard’s double-bandwidth PCIe 2.0 slots against our old P35 motherboard’s PCI express (PCIe) 1.0 slots. We look at the potential performance increase in upgrading from the P35’s PCIe 1.0 bandwidth to the X48’s doubled PCIe 2.0 bandwidth with our top three video cards: HD4870, GTX280 and HD4870X2. We will also note the possible benefits of the X48 motherboard’s full x16+x16 crossfire slots over the P35’s x16+x4 crossfire slots.
This time, there are a lot of comparisons to be made between these two motherboard platforms as we look for reasons a PC gamer would – or would not – upgrade his motherboard. AMD’s crossfire or Nvidia’s SLi technology is a solid way to transform an ordinary gaming machine into a gamer’s powerhouse. However, with several Intel-based platforms supporting crossfire across different PCI express lane configurations, many gamers wonder whether their motherboard can provide enough bandwidth to realize the full potential of crossfire – or even of a single fast GPU like GTX280. So we are going to take the two extremes – the P35’s PCIe 1.0 and 16x+4x crossfire slots vs. the X48’s PCIe 2.0 and 16x+16x crossfire slots – and examine their performance differences.
Since Intel’s Core 2 chipset development has been halted in favor of Core i7, now is a good time to analyze how crossfire scales on these two motherboard chipsets, as a guide for those looking to upgrade for the best bang for the buck. We assume that owners of high-end P35-based systems have probably already purchased Intel’s fastest Core 2 Duo processor and four gigabytes of high-speed memory, so they still have a decent gaming PC. In this review, we will attempt to determine whether P35 motherboards are suitable for crossfire, SLi or single-slot 4870, 4870-X2 or GTX280 upgrades.
P35 Express
Intel released its P35 Express chipset in mid-2007, bringing support for FSB-1333 processors and DDR3 SDRAM while leaving DDR2 support intact. Its spec is PCIe 1.0/1.1, and we chose a Gigabyte P35-DS3P motherboard featuring PCIe 1.0, crossfire PCI express lanes of 16x+4x, and DDR2.
X48 Express
The X48 and X38 chipsets are basically the same; the only real difference is that Intel officially approved the X48 for 1,600 MHz FSB speeds, though both have built-in support for FSB 1600. The main features Intel introduced with its X38/X48 chipsets were aimed at high-end graphics. First is full support for AMD’s crossfire, adding sixteen more lanes for full x16 transfer mode to both cards instead of splitting them into 16x+4x as on P35 motherboards – or in half, as with the later P45 chipset’s 8x+8x crossfire lanes. In addition to the added pathways, PCI express 2.0 transfer rates double the peak slot bandwidth. We purchased a DDR2-supporting, X48-based ASUS P5E Deluxe for our tests to compare more directly with our older DDR2 Gigabyte P35-DS3P MB. With both platforms overclocked to nearly 4.0 GHz, our Core 2 Duo E8600 required good RAM to achieve its optimum performance. Unfortunately, our older P35 chipset couldn’t use the DDR2-1000 setting without instability, so we settled on identical RAM speeds and latencies at DDR2-800 for both motherboards to compare performance identically.
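The theoretical peak bandwidth behind these slot configurations is easy to derive, and a quick sketch puts the comparison in numbers. This is our own illustration (not a vendor tool); the 80% factor is the 8b/10b encoding overhead that both PCIe 1.x (2.5 GT/s per lane) and PCIe 2.0 (5.0 GT/s per lane) carry:

```python
# Rough theoretical peak bandwidth (per direction) for the slot
# configurations discussed above. PCIe 1.x signals at 2.5 GT/s per
# lane and PCIe 2.0 at 5.0 GT/s; both use 8b/10b encoding, so only
# 80% of the raw rate carries data.

def pcie_bandwidth_gbs(gen: int, lanes: int) -> float:
    """Peak one-way bandwidth in GB/s for a PCIe slot."""
    gt_per_lane = {1: 2.5, 2: 5.0}[gen]   # GT/s per lane
    return gt_per_lane * 0.8 / 8 * lanes  # 8b/10b overhead, bits -> bytes

for label, gen, lanes in [
    ("P35 primary slot (PCIe 1.0 x16)", 1, 16),
    ("P35 second slot  (PCIe 1.0 x4) ", 1, 4),
    ("X48 either slot  (PCIe 2.0 x16)", 2, 16),
]:
    print(f"{label}: {pcie_bandwidth_gbs(gen, lanes):.1f} GB/s")
```

The numbers make the gap vivid: the P35’s second slot offers only an eighth of the bandwidth of either X48 slot, which is why the 16x+4x arrangement is the configuration to watch in the crossfire results.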
Let’s begin with our Part III round of testing. We are still using Catalyst 8.9 and Geforce 178.13, as in our last article. Only final, certified drivers are used consistently throughout these reviews and our entire series. Identical 250 GB hard drives are set up with the latest version of Vista 32 SP1, each with identical programs, updates and patches – the only differences are the video cards and the motherboards. The testing hardware is detailed in the following chart:
Test Configuration
Test Configuration – Hardware
* Intel Core 2 Duo E8600 (reference 3.33 GHz, overclocked to 3.99 GHz).
* Gigabyte P35-DS3P (Intel P35 chipset, latest BIOS. PCIe 1.0 specification; crossfire 16x+4x).
* ASUS P5E Deluxe (Intel X48 chipset, latest BIOS. PCIe 2.0 specification; crossfire 16x+16x).
* 4 GB DDR2-PC8500 RAM (2×2 GB, dual-channel at PC6400 speeds).
* Nvidia GeForce GTX280 (1 GB, nVidia reference clocks) by BFGTech
* ATi Radeon 4870 (512 MB, reference clocks) by Sapphire
* ATi Radeon 4870 (1GB, reference clocks) by ASUS
* ATi Radeon 4870X2 (2 GB, reference clocks) by VisionTek
* Onboard RealTek audio
* 2 – Seagate Barracuda 7200.10 hard drives [systems set up identically, except for the graphics cards]
Test Configuration – Software
* ATi Catalyst 8.9, highest quality mip-mapping set in the driver; Catalyst AI set to “advanced”
* nVidia Geforce 178.13, high quality driver setting, all optimizations off, LOD clamp enabled.
* Windows Vista 32-bit SP1; very latest updates
* DirectX August 2008.
* All games patched to their latest versions.
Test Configuration – Settings
* Vsync set to off (“application decide”) in the driver and never enabled in-game.
* 4xAA enabled in-game only; all settings at maximum; 16xAF applied in-game [or in the CP, except as noted; no AA/AF for Crysis and no AA for UT3]
* All results show average, minimum and maximum frame rates
* Highest quality sound (stereo) used in all games.
* Under Vista 32, all DX10 titles were run with their DX10 render paths
3DMark06
3DMark06 still remains the number-one utility used as a system benchmark. The numbers it produces aren’t indicative of real-world gameplay – or any gameplay at all – and for that reason we really dislike using it to compare different systems. However, as long as the rest of the tech world uses it to evaluate gaming performance, we will too. We find it mostly useful for tracking changes in a single system, which is largely what we are doing now. It uses four “mini-games” to benchmark graphics, as well as two CPU tests. The scores are weighted and added together to give an overall number, and a further breakdown of these mini-games is possible, which we chart for you.
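To make the idea of a “weighted” composite score concrete, here is a minimal sketch of how sub-test results can be folded into one number. The weights, scale factor and harmonic-mean-style combination below are entirely our own illustration – they are NOT Futuremark’s published formula – but they show why one weak component (say, a slow CPU) drags the overall score down:

```python
# Illustrative composite score: NOT Futuremark's actual formula.
# The weights (w_gpu, w_cpu) and scale are made-up values chosen
# only to demonstrate the weighting idea described above.

def composite_score(graphics_tests, cpu_tests,
                    w_gpu=0.75, w_cpu=0.25, scale=1000.0):
    gpu = sum(graphics_tests) / len(graphics_tests)  # mean FPS of the mini-games
    cpu = sum(cpu_tests) / len(cpu_tests)            # mean of the CPU tests
    # weighted harmonic mean: a weak component pulls the total down
    return scale / (w_gpu / gpu + w_cpu / cpu)

# four graphics mini-game results (FPS) and two CPU test results
print(round(composite_score([30.0, 25.0, 40.0, 35.0], [2.0, 2.5])))
```

This also explains why we prefer the per-mini-game breakdown to the headline number: the weighting can hide a large swing in a single sub-test.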
Above is a scene from one of the four benchmark “mini games” used to benchmark GPU performance. It will give your PC a real workout even though the default resolution is only 12×10 (as pictured). Here are the results of our 3DMark06 benchmark comparison using the benchmark at its default settings:
3DMARK06
Unlike in Parts I and II of our series, we have a small surprise. The ranking is unchanged, and we note that both vendors’ driver sets have matured nicely. HD4870X2 scales well, although crossfireX-3 still barely scales with this combination of drivers and hardware in this synthetic benchmark. The thing to note, however, is the comparison of our motherboards’ performance. First, there is little difference for the 4870-512MB between the PCIe 1.0 board and the PCIe 2.0 board. However, we do see GTX280 making a significant jump from 15073 to 16045 simply by moving to PCIe 2.0! There is almost no difference in 4870X2 or crossfireX-3 performance on either MB, which could possibly be attributed to drivers. We also note that the 4870-1GB scores a bit higher than its 512MB sister card.
The 3DMark06 mini-games compare the video cards’ performance more specifically, and here the differences are more significant than the overall 3DMark score might suggest. We also note that crossfireX-3 outperforms the 4870-X2 in every test, and there is generally a small performance increase going from the limited bandwidth of PCIe 1.0 to 2.0. Again, there is little difference in this benchmark whether the 4870X2’s 2GB crossfired partner card is 512MB instead of 1GB, or is limited to x4 PCIe instead of x16.
Vantage
Now we move on to Vantage, Futuremark’s latest test. Of course, we feel the same way about Vantage as we do about 3DMark06 – it is really only useful for tracking changes in a single system. However, it has become the new de facto standard for measuring PC video gaming performance, so we will use it also. There are two mini-game tests, Jane Nash and Calico, as well as two CPU tests, but we are focusing on graphics performance.
Below are two scenes from Vantage, both shown at 1920×1200 resolution and with every setting fully maxed out.
For our benchmarking results below, we used the default tests.
Vantage results:
Again we see the 4870, 4870X2 and GTX280 ranking unchanged, but the 2.0 specification’s superiority over the 1.0 spec does not show in the overall score except with GTX280 and crossfireX-3. Strangely, using the 512MB card as the second card instead of our 1GB 4870 appears to make no difference. We also see that GTX280 has a rather high GPU score compared to the others, something we also noted in Cat 8.10 testing.
Fur Benchmark
Fur Benchmark v1.2 is a synthetic OpenGL benchmark. We picked 1920×1200 resolution and 4xMSAA.
Fur OpenGL benchmark
Here our GTX280 takes off, unfettered by the P35 motherboard’s limited PCIe bandwidth. The 4870X2 and crossfireX-3 show solid, if less dramatic, improvements, while the 4870-512MB shows no difference between motherboards. We also note the 512MB version is outperformed by the 4870-1GB in this test. Enough of the synthetics – on to PC games!
Call of Juarez
Call of Juarez provides the first-ever DX10 benchmark, from Techland, whose game was released in June 2007 as a fast-paced Wild West epic adventure shooter. Call of Juarez is loosely based on the Spaghetti Westerns that became popular in the early 1970s. It features Techland’s Chrome Engine using Shader Model 4 with DirectX 10, so Vista is mandatory. The benchmark isn’t built into Call of Juarez but is a stand-alone that runs a simple fly-through of a level built to showcase the game’s new DX10 effects. It offers great repeatability and is a good stress test for the DX10 features of today’s graphics cards, although it is not quite the same as actual gameplay since the game logic and AI are stripped out of the demo. Still, it is very useful for comparing video cards’ performance.
Running the Call of Juarez benchmark is easy: you are presented with a simple menu to choose resolution, anti-aliasing and two shadow quality options. We set shadow quality to “high” and the shadow map resolution to its maximum, 2048×2048. At the end, the demo presents the minimum, maximum and average frame rates, along with the option to quit or run the benchmark again. We always ran the benchmark at least a second time and recorded that generally higher score.
Call of Juarez DX10 benchmarks
Interesting. GTX280 shows mixed results on our motherboards, with more gain at the lower of the two resolutions. Again, the 4870-512MB shows little difference between motherboards, with the edge going to the X48. This time, the 512MB 4870 consistently beats the 1GB version! The 4870-X2 gains +7 FPS in the maximum frame rate at 1920×1200, while crossfireX-3 gains about +5 FPS on the PCIe 2.0 MB over the P35’s 1.0 spec. It is doubtful this makes a practical, playable difference in the minimum frame rates, as the 4870X2 only picks up a few tenths of a frame at 1920×1200, but crossfireX-3 gains more than +3 FPS – from 40 to 43.4. At 16×10, perhaps there is a bit more reason to upgrade your P35 MB, as we do see somewhat better performance from the X48 motherboard; but not for crossfireX-3 with 1GB VRAM over 512MB as the second card in Call of Juarez.
CRYSIS
Now we move on to Crysis, the most demanding game released to date for any PC. Crysis is a sci-fi first-person shooter developed by Crytek and published by Electronic Arts in November 2007. It is set in a fictional near-future where an ancient alien spacecraft is discovered buried on an island near the coast of Korea. In the single-player campaign you assume the role of a US Delta Force operator, ‘Nomad’, armed with various futuristic weapons and equipment, including a “Nano Suit” which enables the player to perform extraordinary feats. Crysis uses DirectX 10 for graphics rendering.
A standalone but related game, Crysis Warhead, was released in September 2008. It is notable for providing a graphical experience similar to Crysis but with lower demands on the PC at its highest ‘enthusiast’ settings. CryEngine2, the engine that powers Crysis and Warhead, is an extended version of the CryEngine that powered FarCry. As well as supporting Shader Model 2.0, 3.0 and DirectX 10’s 4.0, CryEngine2 is multi-threaded to take advantage of SMP-aware systems. Crysis also comes in 32-bit and 64-bit versions, and Crytek has developed its own proprietary physics system, called CryPhysics. There are three built-in demos that are very reliable for comparing video card performance, although actually playing the game is a bit slower than the demos imply.
GPU Demo, Island
All of our settings are set to ‘maximum’ but we do NOT apply any AA/AF in the game. Here are Crysis’ Island demo benchmarks, at 1920×1200 resolution, then 1680×1050:
Not much difference for GTX280 at 1920×1200, but a little improvement is noted at 1680×1050 with our X48 motherboard over the P35. The 4870-512MB shows very little difference between motherboards, and the 1GB version is faster at all settings. This time, the 4870-X2 picks up solid performance in the PCIe 2.0 slot – the 19 FPS minimum at 1920×1200 goes up to an almost-playable 25 FPS! CrossfireX-3 also sees some smaller gains on our newer motherboard; again, the amount of gain is probably still held back by driver immaturity.
Enemy Territory: Quake Wars
Enemy Territory: Quake Wars is an objective-driven, class-based first-person shooter set in the Quake universe. It was developed by id Software and Splash Damage for Windows and published by Activision. Quake Wars pits the combined human armies of the Global Defense Force (GDF) against the technologically superior Strogg, an alien race who have come to Earth to use humans for spare parts and food. It lets you play a part – probably best as an online multiplayer experience – in the desperate battles waged around the world in mankind’s war to survive. Quake Wars has controllable vehicles and aircraft as well as multiple AI deployables, asymmetric teams, big maps and the option of computer-controlled ‘bots. Online and offline play modes are available, and the PC version lets you play individual campaigns against bots as either side. The human weapons and vehicles are mostly based on modern combat equipment, updated for a setting 50 years in the future. The Strogg have alien weapons and vehicles.
Quake Wars is an OpenGL game based on id’s Doom3 engine with the addition of their MegaTexture technology. It also supports some of the very latest 3D effects seen in today’s DX10 games, including soft particles. id’s MegaTexture technology is designed to provide very large maps without reusing the same textures over and over again. For our benchmark we chose the Salvage demo fly-by, one of the most graphically demanding of all the fly-bys; it is very repeatable, reliable in its results and fairly close to what you will experience in-game. All of our settings are set to ‘maximum’ and we also apply 4xAA/16xAF in-game.
Salvage Demo fly-by:
Our GTX280 liked our X48 MB better than the P35. Except for a single maximum frame rate at 1920×1200, all the other tests showed gains on the newer motherboard. The 4870 got rather mixed results, with the 1GB version generally edging out the 512MB version. The 4870-X2 strangely preferred the P35 MB, but not crossfireX-3, which performed much better on the X48. This time, crossfireX-3 with a 512MB card in the second slot was beaten badly compared to “true” crossfireX-3 [4870X2-2GB + 4870-1GB] – a 56 FPS minimum at 1920×1200 vs. 83 FPS!
Consider this a warning to those who use “FrankenfireX-3” – 512MB can be a real limiting factor.
F.E.A.R.
F.E.A.R. – First Encounter Assault Recon – is a DX9c game by Monolith Productions, originally released in October 2005 by Vivendi Universal. Two expansions followed, with the latest, Perseus Mandate, released in 2007. Although the game engine is aging a bit, it still has some of the most spectacular effects of any game, showcasing a powerful particle system complete with showers of sparks, puffs of smoke and dust for collisions with objects and walls, bullet marks on walls, and other visual effects including “soft shadows”. This is well highlighted by the built-in performance test, although it was never updated: it will tell you how F.E.A.R. and its first expansion, Extraction Point, will run, but Perseus Mandate is more demanding on your PC graphics and will run slower than the demo suggests. We always run at least two sets of tests with all in-game features at ‘maximum’ – one featuring “soft shadows” and the other with 4xAA instead, as they do not run well together. F.E.A.R. uses the Jupiter Extended Technology engine from Touchdown Entertainment.
All settings are set to ‘maximum’ as we first test at 1920×1200 with no AA but with soft shadows enabled, and then with 4xAA and no soft shadows:
Finally we test at 1680×1050 with no AA but with soft shadows enabled, and then with 4xAA and no soft shadows:
We see GTX280 taking advantage of the PCIe 2.0 specification, though no one playing on a P35 motherboard would notice any practical difference. Ditto for both 4870s, though with far more mixed results. CrossfireX-3 generally gets more performance out of the X48, and in most of the F.E.A.R. benches it is an advantage to have a 1GB card in the second slot – especially with 4xAA applied.
Half-Life2: Lost Coast
Half-Life2 is still a popular game and the oldest we review in this series. Half-Life2: Lost Coast is an additional level for this 2004 game, released in October 2005 as a free download to all purchasers of Half-Life2. Lost Coast was developed as a playable tech demo intended to showcase the newly added high dynamic range (HDR) lighting features of the Source engine, and it features some very minor storyline details that were scrapped from Half-Life2. A flyby of this level is played during the HL2 video stress test, and it is very repeatable and quite accurate. All in-game settings are fully maxed out, including 4xAA/16xAF.
This old Source engine barely benefits from a newer platform, and each of our modern GPUs easily hits the frame rate cap. We generally do see some improvement in all of our video card configurations in favor of upgrading to the X48 – but Half-Life2 alone is no reason to do so.
[The top 1920×1200 resolution chart is mislabeled – the last two GPUs, 4870-X2 (1.0) and 4870-X2 (2.0) should instead be labeled CFX3-512M and CFX3-1G respectively.]
Lost Planet DX10 benchmark
Lost Planet: Extreme Condition is Capcom’s port of an Xbox 360 game and was the first DX10 game. It takes place on the icy planet of E.D.N. III, which is filled with monsters, pirates, big guns and huge bosses. This frigid world makes a great environment for highlighting the benefits of high dynamic range (HDR) lighting, as the snow-white environment reflects blinding sunlight while DX10 particle systems toss snow and ice all around. The game looks great in both DirectX 9 and 10; there isn’t really much of a difference between the two versions except perhaps the shadows. Unfortunately, the DX10 version doesn’t look that much better when you’re actually playing the game, and it still runs slower than the DX9 version.
There are two versions of this benchmark: one released as a stand-alone demo and one in-game. We chose the in-game demo from the retail copy of Lost Planet released on June 26, 2007, updated through Steam to the latest version. The run isn’t completely scripted, as the bugs spawn and act a little differently each time; the benchmark is essentially a scripted flyby of the level with “noclip” turned on. This means it won’t make an absolutely perfect comparison between different hardware setups, even with identical game settings – so we ran it many times. Lost Planet’s Snow and Cave demos run continuously in-game and blend into each other. All settings are fully maxed out with 4xAA and 16xAF applied.
Here are our benchmark results with Snow and Cave. All settings are fully maxed out in game including AA/AF – first at 1920×1200 resolution:
Lost Planet Benchmarks
And now at 16×10:
Lost Planet shows a huge performance increase for GTX280 when it is unfettered by PCI express 1.0 bandwidth constraints – in some cases going from barely playable to solidly playable on the X48 motherboard. The 4870-X2 and even the 4870 show a nice increase on the X48 over the P35. CrossfireX-3 also gets a nice boost from PCIe 2.0, but this time there isn’t a lot of performance difference between 16x+4x and 16x+16x PCIe lanes; future drivers will probably show it.
Unreal Tournament3
Unreal Tournament3 (UT3) is the fourth game in the Unreal Tournament series. UT3 is a first-person shooter and online multiplayer video game by Epic Games, released for Windows in November 2007. While many games share the same Unreal3 engine, developers decide how high the system requirements will be by adjusting the level of detail. Unreal Tournament3 provides a good balance between image quality and performance, rendering complex scenes even on lower-end PCs; of course, on high-end graphics cards you can really turn up the detail. UT3 is primarily an online multiplayer title offering several game modes, and it also includes an offline single-player campaign.
For our tests, we used the latest game patch for Unreal Tournament3. The game doesn’t have a built-in benchmarking tool, so we used FRAPS as well as HardwareOC’s benchmark tool, which does a fly-by of a chosen level. Note that the performance numbers reported are a bit higher than in-game. The map we use, “Containment”, is one of the most demanding of the fly-bys. Our tests were run at 1920×1200 and 1680×1050 with UT3’s in-game graphical options set to their maximum values. One drawback of the UT3 engine’s design is that it has no built-in support for anti-aliasing; while video card vendors have found ways to force AA in their drivers’ control panels, we did not force it. Again, we use the “timedemo” style of benchmark for UT3: a demo is recorded in the game and a set number of frames are saved to a file for playback. When playing back the demo, the game engine renders the frames as quickly as possible, which is why you will often see it playing back faster than you would actually play the game.
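The minimum/average/maximum figures we quote throughout can be reduced from a per-frame render-time log such as the one FRAPS records. A minimal sketch of that reduction (the one-time-per-entry list below is an assumption for illustration – FRAPS actually logs cumulative milliseconds, so adapt the parsing to your log format):

```python
# Reduce a list of per-frame render times (in milliseconds) to the
# min/avg/max FPS figures reported in our charts. Note the average is
# frames-per-elapsed-second, not the mean of the instantaneous FPS.

def fps_stats(frame_times_ms):
    """Return (min, avg, max) FPS from per-frame times in ms."""
    fps = [1000.0 / t for t in frame_times_ms if t > 0]  # instantaneous FPS
    avg = len(frame_times_ms) / (sum(frame_times_ms) / 1000.0)
    return min(fps), avg, max(fps)

times = [16.7, 20.0, 33.3, 12.5, 25.0]  # sample per-frame times (ms)
lo, avg, hi = fps_stats(times)
print(f"min {lo:.1f} / avg {avg:.1f} / max {hi:.1f} FPS")
```

Computing the average from total elapsed time rather than averaging instantaneous FPS is the design choice that matters: the latter overweights the fast frames and inflates the result.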
Containment Demo
It is starting to become somewhat predictable. GTX280 shows the superiority of the X48 motherboard over its P35 predecessor, while the 4870 results are more mixed. Even the 4870-1GB did not convincingly prove superior to the 512MB version here, so we need to test with AA ‘forced’ next time; ditto for the 4870-X2. CrossfireX-3 with the 1GB 4870 in the second slot is generally faster where it matters most – at the minimums – but we are not straining it with these tests.
S.T.A.L.K.E.R.
S.T.A.L.K.E.R.: Shadow of Chernobyl is a first-person shooter by GSC Game World, published in 2007. The game has a non-linear storyline and features role-playing gameplay elements such as trading and allying with NPC factions. In S.T.A.L.K.E.R., the player assumes the identity of “The Marked One” – an amnesiac illegal artifact scavenger in “The Zone”, which encompasses roughly 30 square kilometers. It is the setting of an alternate-reality story surrounding the Chernobyl Power Plant after another (fictitious) explosion. GSC Game World released a prequel expansion, S.T.A.L.K.E.R.: Clear Sky, on September 5, 2008, and it has just become a brand-new DX10 benchmark for us.
S.T.A.L.K.E.R. features “a living, breathing world” with highly developed NPC and creature AI. It uses the X-ray Engine – a DirectX 8.1/9 Shader Model 3.0 graphics engine featuring HDR, parallax and normal mapping, soft shadows, motion blur, weather effects and day-night cycles. As with other engines using deferred shading, the X-ray Engine does not support anti-aliasing with dynamic lighting enabled. However, a form of “anti-aliasing” can be enabled that blurs the image to give the impression of anti-aliasing. We set all the graphical options – including this “AA” – to their maximum values.
Our benchmarks for this DX9c game are timedemo runs called “short” and “building”. Their flaw is that the maximum frame rates are skewed far too high as the camera pans the sky, so the maximums should mostly be disregarded, although the minimums and averages are fairly representative of what you actually encounter in-game. Even the best video cards will suffer occasional stutters, although general gameplay is better than the minimums suggest.
S.T.A.L.K.E.R. Buildings and Short Benchmarks
Now at 1680×1050:
The results look almost too mixed to draw conclusions at first. However, at the averages and minimums, GTX280 is a bit faster on the X48 motherboard, as is the 4870. The 4870-1GB is more capable than the 512MB version at the higher of the two resolutions, and crossfireX-3 on the X48 beats the same configuration on the more bandwidth-starved P35. “True” crossfireX-3 pulls solidly better minimums than “frankenfire” with a 512MB 4870 paired with the 2GB 4870-X2.
PT Boats: Knights of the Sea DX10 benchmark
We added this new DX10 benchmark to the ones used in our Part I Shootout article. PT Boats: Knights of the Sea is a stand-alone DX10 benchmark utility released by Akella last year as a benchmark-friendly tech demo of their upcoming simulation-action game. This DX10 benchmark runs reliably and apparently provides accurate and repeatable results.
We set the only settings options available to us as follows:
DirectX Version: DirectX 10
Resolution: 1920×1200 and 1680×1050 at 60 Hz
Image Quality: High
Anti aliasing: 4x
PT Boats DX10 benchmark
We see really mixed results for the GTX280: it is beaten as badly at ‘maximum’ 1920×1200 as it wins at 1680×1050. We checked several times to make sure we hadn’t reversed something. The 4870-512MB is whipped on either motherboard by its 1GB relative, and on the P35 platform it is held back to a complete slideshow. CrossfireX-3, when forced to run with a 4870-512MB, also takes a dive in frame rates compared with “true” crossfireX-3. The 4870-X2 also benefits from residing on an X48 motherboard.
Let this be a potential warning to those considering upgrading: 512MB might not be a wise choice for newer games if you want absolutely maxed-out DX10 settings – even at 1680×1050 resolution! If you are considering a 512MB version of the 4870, a few extra dollars might be better spent on the 1GB version – especially if you are considering crossfire or keeping your video card for a couple of years.
Conclusion
PCIe 1.0 vs. 2.0 performance and 16x+16x crossfire vs. 16x+4x
Here we have to go back to our ranking to see which of our five video configurations benefits most from the X48’s doubling of bandwidth over the P35’s PCI express 1.0 specification.
- CF-x3
- 4870×2
- 280GTX
- HD4870-1GB
- HD4870-512MB
Upgrading a crossfire-ready motherboard with a second graphics card is an excellent way for gamers to extend the useful life of their systems, as it provides good performance increases at medium to high resolutions, with high details and especially with 4xAA/16xAF. Our individual benchmarks show that crossfireX-3 performance is rather inconsistent on P35 motherboards, and we occasionally saw how one could suffer losses due to the low bandwidth of the second graphics card slot. If you have a P35 motherboard, you might want to consider which games you will be playing before buying that second video card.
The easy solution for P35 Express systems is simply to use a single, more powerful card: where the secondary PCI express x4 slot’s limited bandwidth might hurt performance, it is best avoided altogether. Of course, GTX280 and crossfire “sandwich” cards like the 4870-X2 are also held back a bit by the P35’s PCIe 1.0 compared to the X48’s PCIe 2.0 bandwidth. But in judging the usefulness of adding a second card when a powerful card is already installed, we can see that P35 Express motherboards simply cannot be upgraded as well as the X48.
As for 4870-512MB vs. 4870-1GB, it is likely that the 1GB card will have an advantage going forward as new games arrive with ever-larger textures, so the 1GB version might be a good “investment” if you plan to keep your card a couple of years. Our 4870-512MB falls just short of the 1GB version for maximum detail and/or added AA/AF, yet both are excellent for playing the newest and most demanding PC games at 1680×1050. We also saw the immaturity of the crossfireX-3 drivers holding back the performance of “true crossfire” – 4870-X2 plus 4870-1GB – over the “frankenfire” 4870-X2 plus 512MB 4870. Our testing shows that 512MB VRAM limitations with the 4870-512MB paired with the 4870X2-2GB in crossfireX-3 do exist for specific games and/or settings, even at 1680×1050; in the upcoming DX10 PT Boats especially, 512MB VRAM would be a real disadvantage. CrossfireX-3 is a little faster than the 4870-X2, is becoming less of a curiosity than it was with previous drivers, and stands to benefit most from PCIe 2.0 over 1.0 as drivers improve. We expect the situation to keep improving as we continue our testing, now with Catalyst 8-12, for future articles in our series.
We definitely see that our new X48 motherboard's PCIe 2.0 specification makes some solid performance differences over our PCIe 1.0 P35 motherboard, but it clearly depends on which card is used and which game is played. PCIe 2.0 is not so critical for 4870-512MB-class video cards, but it becomes increasingly noticeable with the GTX280 and 4870-X2 – and especially with crossfireX-3 when it scales well.
We will be back very soon with Part IV of our testing – a short article using Catalyst 8-10 and Geforce 178.24 to get up to date quickly – and then Cat 8-11 compared with Cat 8-12 and Geforce 180.48. Over this series we are also watching Nvidia's drivers evolve over the same period as AMD's, to see if they have managed to get more performance out of their Tesla architecture. We also expect to explore Nvidia's proprietary PhysX with their Big Bang II drivers (180.48) versus Catalyst 8-12 and ATi's now-enabled "Stream" drivers, which are meant to take on Nvidia's CUDA.
Here is our planned “GPU-Shootout” series:
Part I – Cat 8.8 (8/20/08) vs Geforce 177.41 (06/26/08) (p35 MB platform)[done]
Part II – Cat 8.9 (9/17/08) vs Geforce 178.13 (09/25/08) (p35 MB platform)[done]
Part III – Cat 8.9 (9/17/08) vs Geforce 178.13 (09/25/08) (x48 MB platform from now on)[done]
Part IV – Cat 8.10 (10/11/08) vs Geforce 178.24 (10/15/08) [benches completed]
Part V – Cat 8.11 (11/13/08) and Cat 8.12 (12/12/08) vs. Geforce 180.48 (11/19/08)
(Stream vs BigBangII; benching in progress)
– in the meantime, please check out Leon’s excellent mini Cat 8-11 vs 8-12 performance review with 4870 here.
We also just got our hands on a brand-new DX10 benchmark from STALKER: Clear Sky! It is "official" and was released at the very end of last year. It is the subject of our very next article – due tomorrow! We will check it with the very latest drivers, add the 12×10 resolution, and also show last generation's 8800GTX and 2900XT benchmarks. In a couple of weeks, we also expect to have Intel's new 65w, low-power QuadCore Q9550 in for testing, and we will give you stock and overclocked comparisons of it against our Core2Duo e8600, which we can now push to well over 4.0Ghz.
After that, we expect to compare the maturing Intel Core i7 CPU platform with our current maxed-out Penryn system, and we also expect to explore Nvidia GTX280/290 SLi on an X58 motherboard.
Stay tuned. All four parts of our planned testing have been completed – with Part V testing in progress – and we think we will have some very interesting articles for you to read very shortly as you plan your own upcoming upgrades.
Mark Poppin
ABT Editor
Curious, why’d you set Catalyst A.I. to “Advanced”?
How about a few links to explanations of Catalyst A.I. and what "Advanced" really does? Here is an older article on it:
http://www.hardocp.com/article.html?art=NjY2LDI=
Here is the tweak guide which supports my own research:
http://www.tweakguides.com/ATICAT_7.html
“Catalyst A.I. allows users to determine the level of ‘optimizations’ the drivers enable in graphics applications. These optimizations are graphics ‘short cuts’ which the Catalyst A.I. calculates to attempt to improve the performance of 3D games without any noticeable reduction in image quality. In the past there has been a great deal of controversy about ‘hidden optimizations’, where both Nvidia and ATI were accused of cutting corners, reducing image quality in subtle ways by reducing image precision for example, simply to get higher scores in synthetic benchmarks like 3DMark. In response to this, both ATI and Nvidia have made the process transparent to a great extent. You can select whether you want to enable or disable Catalyst A.I. for a further potential performance boost in return for possibly a slight reduction in image quality in some cases. If Catalyst AI is enabled, you can also choose the aggressiveness of such optimizations, either Standard or Advanced on the slider. The Advanced setting ensures maximum performance, and usually results in no problems or any noticeable image quality reduction. If on the other hand you want to always ensure the highest possible image quality at all costs, disable Catalyst A.I. (tick the ‘Disable Catalyst A.I.’ box). I recommend leaving Catalyst A.I enabled unless you experience problems. ATI have made it clear that many application-specific optimizations for recent games such as Oblivion are dependent on Catalyst AI being enabled.
Note: As of the 6.7 Catalysts, Crossfire users should set Catalyst A.I. to Advanced to force Alternate Frame Rendering (AFR) mode in all Direct3D games for optimal performance. Once again, Catalyst A.I. should only be disabled for troubleshooting purposes, such as if you notice image corruption in particular games”
In other words, you can choose the aggressiveness of the optimizations: either "Standard" or "Advanced". The Advanced setting ensures maximum performance – appropriate for benchmarking games – with no noticeable image quality reduction. However, if you are doing IQ comparisons as BFG10K did and want to guarantee the very highest image quality, then disable Catalyst A.I. [but not for crossfire; set it to "Standard" instead]. I have always recommended leaving Catalyst A.I. enabled unless you experience glitches in games.
You have to realize that Catalyst A.I. is not necessarily supposed to give you a boost in every single game. It tries to apply optimizations where possible, but often these are either not possible with a particular game, or the settings you've chosen in the game may be too low for them to make any noticeable impact.
That is why I recommend leaving it on "Advanced": you get a possible performance boost, and if not, you lose nothing. Or you can set it to Standard or off if you feel your image quality is being degraded.
Hope that explains it.
Very interesting – I’ll definitely be checking your site on a regular basis now.