ATi 5770 Bottleneck Investigation

Introduction

This article is part two of my Radeon 5770 investigation. Part one covered performance, while a future installment will cover image quality in depth. This article also serves as another installment in my ongoing bottlenecking series, and I’ve already tested a GTX285, GTX260+ and an 8800 Ultra in a similar manner.

Today I’ll attempt to answer the burning question of whether the 5770 is primarily limited by its memory bandwidth or not. The question has been asked and hypothesized repeatedly, but I haven’t yet seen a comprehensive attempt to answer it.

While ATi’s hardware doesn’t allow separate shader and core clocks like nVidia’s does, we can still isolate the memory from the core by changing one clock while leaving the other at stock. Since there’s no real limit as to how far each can be underclocked, I’ve chosen a nice round figure of 20%. I’ll underclock the core and memory individually by 20% while leaving the other at stock, and see which has the greater impact on performance.

In the past I’ve found the selection of titles and settings can influence the overall result to some degree, so this time I’ve picked a larger selection of games at varying settings to get a more accurate picture. I’ve also run all tests with at least 2xAA since it’s reasonable to expect someone gaming on a 5770 will use at least this setting. Also, since AA hits memory bandwidth harder, these settings are more likely to expose bandwidth as the primary limiting factor if that’s indeed the case.

The stock results have been generated by my 5770 performance review, and the system and benchmark setup in this article is identical to those tests. Essentially this review takes certain results from that review and compares them to newly generated results from underclocking the 5770.

Since the stock clocks of a 5770 are 850 MHz core and 1200 MHz memory, a 20% underclock is 680 MHz core and 960 MHz memory. The results are color coded in such a manner.
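
As a quick check on those figures, the underclocked values follow directly from scaling each stock clock by 0.8. The sketch below is illustrative only; the function and variable names are made up for this example and aren’t part of any tool used in the tests.

```python
# Stock clocks for the Radeon 5770, as stated in the article (MHz).
STOCK_CORE_MHZ = 850
STOCK_MEMORY_MHZ = 1200
UNDERCLOCK = 0.20  # the 20% reduction chosen for this investigation


def underclocked(stock_mhz, fraction=UNDERCLOCK):
    """Return the clock in MHz after reducing it by the given fraction."""
    return round(stock_mhz * (1 - fraction))


# The three test configurations: only one clock changes at a time.
configs = {
    "stock": (STOCK_CORE_MHZ, STOCK_MEMORY_MHZ),
    "core -20%": (underclocked(STOCK_CORE_MHZ), STOCK_MEMORY_MHZ),
    "memory -20%": (STOCK_CORE_MHZ, underclocked(STOCK_MEMORY_MHZ)),
}

for name, (core, memory) in configs.items():
    print(f"{name}: {core} MHz core, {memory} MHz memory")
```

Changing only one clock per configuration is what lets any performance drop be attributed to the core or to the memory.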

  • bimmerboy79

    well done review, i am contemplating these 5000 series amd cards. would be nice to have a quieter system with enough power to game the way i want. also dx11 is a plus , i’m interested in eyefinity… need more monitor though.

    but again another nice review

  • BFG10K

    Thanks for taking the time to read the article and comment on it, bimmerboy79.

    I really appreciate your positive feedback!

  • ryrynz

    Wow, it’s nice to see a review like this. The IQ review you have for the 5000 series has also made me register and leave this comment. I very much look forward to reading more reviews of this nature in future. Well done.

  • BFG10K

    Thanks for your kind comments, ryrynz. Readers like yourself make articles like these all worthwhile. :)

    It is my continual goal to write articles that are different from the standard fare.

  • Bo_Fox

    Let’s look at a 4890, which has the same core specifications as a 5770 (850 MHz core, 800 shaders, 40 TMUs, 16 ROPs) minus DX11 functions and slight R800 architecture optimizations (let’s say around 3%).

    A 4890 has 62.5% greater memory bandwidth (3.9 GHz effective GDDR5 times a 256-bit bus, divided by 8 bits per byte, for 125 GB/s). By comparison, a 5770 has 4.8 GHz effective GDDR5, but with only a 128-bit bus, for 77 GB/s.
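
    The arithmetic behind those bandwidth figures can be sketched as follows. This is only an illustration using the clocks and bus widths quoted above; the function name is made up for the example.

```python
def bandwidth_gb_s(effective_ghz, bus_width_bits):
    """Peak memory bandwidth in GB/s: effective data rate times bus width in bytes."""
    return effective_ghz * bus_width_bits / 8


hd4890 = bandwidth_gb_s(3.9, 256)  # about 124.8, quoted as 125 GB/s
hd5770 = bandwidth_gb_s(4.8, 128)  # about 76.8, quoted as 77 GB/s

# How much more bandwidth the 4890 has, as a percentage of the 5770's.
advantage_pct = (hd4890 / hd5770 - 1) * 100

print(f"4890: {hd4890:.1f} GB/s, 5770: {hd5770:.1f} GB/s, "
      f"4890 advantage: {advantage_pct:.1f}%")
```

    The same function shows why a 256-bit bus at the 5770’s memory clock would double its bandwidth: the bus width is a straight multiplier.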

    A 4890 performs roughly 19% better than a 5770 overall (according to several benchmark review sites like TechPowerUp for example). If a 4890 also had those R800 architecture optimizations, it would be more like 22% better.

    When the memory on a 4890 is overclocked further (with the core left at stock clocks), there are still consistent gains found in several newly-released games according to Anandtech. With the memory clocked to 5870’s bandwidth (which is 100% greater than 5770’s bandwidth), and 4890’s core kept at stock, there is an average of 5% gain.

    This means that a 5770 could easily have 22% + 5% better performance with 100% greater bandwidth. That is, with a 256-bit bus instead of 128-bit bus for 27% greater performance.

    Implementing a 256-bit bus is relatively cheap (perhaps costing the manufacturer only $10 more or so). The increase in power consumption is a fraction of what it would take to boost the GPU clock or the number of shaders/TMUs enough to raise performance by 27%.

    Many of us would still rather have a 4890 than a 5770 even if it means losing DX11 features along with perfectly circular AF quality and SSAA features. A 4890 sells for more than a 5770, which says it all.

    BFG10K, this is a good article, but of course, the core is the main muscle that does the work. The penalty from reduced memory bandwidth was actually almost equal to the penalty from the core doing less work!

  • Bo_Fox

    Here’s a great article that explains the cost of doubling the memory bus width (for 2x the bandwidth). I would also presume that the latency remains roughly the same, unlike with trying to increase the clocks as high as possible, which would mean a latency of CAS 11 or even 13. http://www.beyond3d.com/content/interviews/39/5

    Eric Demers, architecture lead on R600, responded to the question: “Does a 512-bit bus require a die size that’s going to be in the neighbourhood (or bigger) of R600 going forward?

    No, through multiple layers of pads, or through distributed pads or even through stacked dies, large memory bit widths are certainly possible. Certainly a certain size and a minimum number of consumers is required to enable this technology, but it’s not required to have a large die.”

    Although the core of the R600 was severely limited by 320 shaders (with the AA work being moved onto the shaders), the article goes to show that doubling the bus width is not as costly as we thought.

  • BFG10K

    Excellent comments, Bo_Fox.

    Since the 5770 is a reasonably balanced part based on my findings, I’d concur that raising memory bandwidth could boost performance almost as much as raising the core could.

  • truth

    the 5770 is rebranded hype. the only thing new it offers is dx11, eyefinity, and 40nm. most people buying the 5770 won’t be using eyefinity. I mean come on if you can afford the luxury of 3 monitors for gaming, you can afford a better card.

    not to mention that eyefinity for 3 or even 2 screens would bring the 5770 to a crawl (even if the resolution was 1024×768! three 1024×768 screens side by side would be like running at 3072×768). so for eyefinity, the 5770 is really only good for 3 monitors used on old games or no games at all.

    you could say the 5770 is the 5870 cut in half, or the 4870 shrunk to 40nm with half the ram bus and a 100mhz slower clock. or the 4890 with half the ram bus. either way, ALL of these other cards perform better than the 5770, with the exception of the 4870 sometimes.

    dx11 is a bit hyped like dx10, with tessellation being one of the most hyped features. tessellation is more about helping developers create games easier than about creating more graphics quality. it only works well in a limited number of scenarios. so instead, more programming is required than modeling. it’s like telling your computer to construct a model with an algorithm, at the cost of half your gpu power…rather than loading an already created model from ram at less of a performance hit, it builds a lower polygon model into a higher one, when if you had a higher polygon model to begin with, you’d still have 100% of your gpu power rather than 50% going to tessellation.

    it’s just very inefficient in most scenarios. tessellation is going to be more of a background feature than anything…at least I hope so.

    it’s a fact, tessellation will halve your frame rate or more. i’ve seen it go as far as making 150fps into 50fps…so imagine if you’re already only getting 50fps in a new game with the 5770, without tessellation…your fps will drop down to 25 for sure, and maybe even to 20. I’d also like to note that tessellation is resolution independent, which is bad because no matter what resolution you’re running at, the tessellation is taking the same amount of your gpu power from you.

    so in short all tessellation does is make it easier for modelers by allowing them to do less work. or in contrast, it could put the demand for modelers very low; causing many modelers to have trouble finding work, or finding enough work.

    with tessellation on, the performance is more like you bought a radeon 3000 series…and that is simply not acceptable.

    I really wanted a 5770, until I learned all of these details, and more. I’ve literally read all of the reviews, all of the feedback, etc and came to my own conclusions about the card. right now, it’s like paying $160 for a 2-3 year old card that will be obsolete within a year from now. I could see games like mass effect 2 choking on it at even medium resolutions and settings (with tessellation OFF), and that’s only coming out in 2 months!

    with the 5770 you might be able to run 2010 games half decent, but in 2011 all bets are off, you’ll need to upgrade again unless you don’t mind playing with severely reduced quality settings.

  • Anti-truth

    It seems we have a pro-nVidia fanboi here, “truth” (ahhh the irony), who is espousing verbatim recent nVidia PR: DX11 isn’t important because nVidia GPUs can’t do DX11. Tessellation is a waste of time and slows systems down because nVidia GPUs can’t do tessellation. The HD 5770 is rebranded hype even though nVidia is still selling mildly tweaked versions of 2007’s 8800GT, firstly as the 9800GT, the GTS 250, etc.

    But wait, nVidia’s latest GPU releases can now do DX10.1! Sorry mate, but ATI had that over 12 months ago. With the launch of Windows 7 and the DX11 API behind us, buying anything of nVidia’s today means that you are already obsolescent. There are few DX11 games here today but you will only have to wait one more quarter for that to change.

    Oh, and let’s not forget the HD 5770 is a highly energy efficient $160 midrange card which almost matches the performance of the $400+ GTX 260 that nVidia was trying to sell 18 months ago.

    Bottom line: if you are in the market for an economical mid range card that you intend to keep for 2-3 years, you need something which is DX11 capable, matches the last generation performance models (HD 4870/GTX 260) and won’t break the bank i.e. the HD 5770 should definitely be on the short list.

  • Anti-truth

    In a post above “truth” claims that DX11 will cause a performance hit of 50% or over compared to earlier versions.

    Sadly it’s all nVidia FUD (Fear, Uncertainty, Doubt) being spread there. Check out the following independent link at Hard OCP

    http://www.hardocp.com/article/2009/12/07/dirt_2_demo_dx11_performance_iq_preview/3

    It’s pretty clear that the DX11 code path in Dirt 2 does cause a performance hit, but it’s in the 8-15% range, and the payoff is noticeably better image quality.

  • Azn

    Kind of absurd observation from BFG when he doesn’t include minimum frame rates.

    As much as 5770 is a decent card I still think this card needs a little more bandwidth to play with just like 4850 did.

    Here’s a classic review of an overclocked Gainward 4850 1024mb that gets average frame rates similar to a 4870 at the same core clock, yet it still lags behind in minimum frame rates. Notice the average frame rates and minimum frame rates as you raise resolution. Minimum frame rates plummet on the 4850 to unplayable levels while on the 4870 they do not.

    http://www.xbitlabs.com/articles/video/display/gainward-hd4850-1024mb-gs_8.html#sect1

  • Azn

    The results are a combination of SP and core clock overclocking/underclocking that results in better average frame rates. I just don’t see the 5770 as an equal counterpart to the 4870/4890. It can’t have half the bandwidth and be expected to deliver similar results. Average maybe, but not minimum frame rates, as bandwidth raises the minimum frame rate while the core raises the maximum frame rate.

  • BFG10K

    Azn,

    A minimum is useless without a benchmark plot putting it into context because it’s a single data point by definition. Without such a plot it could just be benchmarking noise which doesn’t translate into reality to any significant degree.

    That, and dips do affect the average unless they happen so rarely as to be trivial, which brings me back to the first point.

    As for the 5770 equaling the 4870/4890, that depends on whether bandwidth is its primary limitation, and most evidence is pointing to that not being true.

  • Northernfrog

    My question, as a video card newbie: I’m looking for a card that will do for now and, instead of buying a new card down the road, I’d buy another 5770 to run in CrossFireX. Will two 5770s together perform better than a 5870 alone, and would they still do well 3 years down the road? Would that be a good approach for now and the future? I want something good now that will kick ass when I can afford the other half. Is it overkill or underkill? Would the two together outperform their next-generation flagship next year?

    This was an excellent article answering a lot of my questions. THANK YOU.

    Second, what about the differences in quality and performance between the ATI board partners building the 5770? It would be nice if I could find that info in one place, for example a comparison of XFX, Gigabyte, Asus, MSI and others against each other for the same 5770 model. It’s scary reading some reviews that mention a fan falling off because it was only glued on, or points like the lifetime warranty on the XFX compared to only 2 or 3 years from others. Is this info kept away for a reason? They all differ in price by only $20 or so, but I can’t afford to buy junk.

    I’ll wait for advice and info before I make my purchase! THANKS to all ………Northernfrog

    P.S. Noisy fans drive me insane!