Ruminations on various benchmarks for the OMAP 3600s, Hummingbird, and Snapdragon
The following is by Sean the Electrofreak, ABT Guest Contributor. As with everything that we publish at AlienBabelTech, the opinions expressed are solely those of the individual writer and do not necessarily reflect the views and the opinions of the rest of the ABT staff.
I’ve been thinking about some of the performance benchmarks I’ve been seeing on AndroidAndMe.
CPU performance from the new TI OMAP 3640 (yes, they’re wrong again, it’s 3640 for the 1 GHz SoC; 3630 is the 720 MHz one) is surprisingly good on Quadrant, the benchmarking tool that Taylor is using. In fact, as you can see from the Shadow benchmarks in the first article, it is shown outperforming the Galaxy S, which initially led me to believe that it was running Android 2.2 (which, as you may know, can easily triple CPU performance). However, I’ve been assured that this is not the case, and the 3rd article seems to confirm it, given that those benchmarks were obtained using a Droid 2 running 2.1.
Now, the OMAP 3600 series is simply a 45 nm version of the 3400 series we see in the original Droid, upclocked accordingly due to the reduced heat and improved efficiency of the smaller feature size.
If you need convincing, see TI’s own documentation: http://focus.ti.com/pdfs/wtbu/omap3_pb_swpt024b.pdf
So essentially the OMAP 3640 is the same CPU as the one in the original Droid, but clocked up to 1 GHz. Why then is it benchmarking nearly twice as fast clock-for-clock (resulting in a nearly 4x improvement overall), even when still running 2.1? My guess is that the answer lies in memory bandwidth, and that the evidence for this can be found in some of the results from the graphics benchmarks.
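To put rough numbers on that "nearly 4x" figure, here is a back-of-the-envelope sketch in Python. The 550 MHz clock for the original Droid's OMAP 3430 is my own assumption (it isn't stated above), and the 2.0 per-clock factor is just the "nearly twice" from the benchmarks rounded up, not a measured value.

```python
# Back-of-the-envelope: how a per-clock gain compounds with a clock bump.

# ASSUMPTION: the original Droid's OMAP 3430 runs at 550 MHz
# (this figure is not given in the article).
DROID_CLOCK_MHZ = 550
DROID2_CLOCK_MHZ = 1000  # OMAP 3640 at 1 GHz, per the article

# ASSUMPTION: "nearly twice as fast clock-for-clock" rounded to 2.0.
PER_CLOCK_GAIN = 2.0


def overall_speedup(old_mhz, new_mhz, per_clock_gain):
    """Total speedup = clock ratio * per-clock improvement."""
    return (new_mhz / old_mhz) * per_clock_gain


clock_ratio = DROID2_CLOCK_MHZ / DROID_CLOCK_MHZ
total = overall_speedup(DROID_CLOCK_MHZ, DROID2_CLOCK_MHZ, PER_CLOCK_GAIN)
print(f"clock ratio: {clock_ratio:.2f}x, overall: {total:.2f}x")
```

Under those assumptions the clock bump alone accounts for a bit under 2x, and the rest of the gain has to come from somewhere other than raw frequency.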
We can see from the 3rd article that the Droid 2’s GPU performs almost twice as fast as the one in the original Droid. We know that the GPU in both devices is the same model, a PowerVR SGX 530, except that the Droid 2’s SGX 530 is, like the rest of the SoC, on the 45 nm feature size. This means that it can be clocked considerably faster. It would be easy to assume that this is the reason for the doubled performance, but that’s not necessarily the case. The original Droid’s SGX 530 runs at 110 MHz, substantially less than its standard clock speed of 200 MHz. This downclocking is likely due to the memory bandwidth limitations I discussed in my Hummingbird vs Snapdragon article, where the original Droid was running LPDDR1 memory at a fairly low bandwidth that didn’t allow the GPU to function at stock speed. If those limitations were removed by adding LPDDR2 memory, the GPU could then be upclocked again (likely to around 200 MHz) to draw even with the new memory bandwidth limit, which is probably just about twice what it was with LPDDR1.
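The GPU side of that argument can be sketched the same way. This is a toy model, not a measurement: it assumes a bandwidth-bound GPU whose useful clock scales linearly with memory bandwidth, that LPDDR2 gives about twice the bandwidth (my guess from above), and that the clock is capped at the 200 MHz stock speed. The function name is made up for illustration.

```python
# Toy model: a bandwidth-bound GPU whose useful clock scales with bandwidth.

# Figures from the article: the original Droid's SGX 530 runs at 110 MHz,
# against a standard clock of 200 MHz.
DROID_GPU_MHZ = 110
STOCK_GPU_MHZ = 200

# ASSUMPTION: LPDDR2 offers about twice the bandwidth of the Droid's LPDDR1.
BANDWIDTH_RATIO = 2.0


def max_useful_gpu_clock(old_clock_mhz, bandwidth_ratio, stock_clock_mhz):
    """Highest useful clock if the GPU scales linearly with bandwidth,
    capped at the stock clock (both are simplifying assumptions)."""
    return min(old_clock_mhz * bandwidth_ratio, stock_clock_mhz)


new_gpu_clock = max_useful_gpu_clock(DROID_GPU_MHZ, BANDWIDTH_RATIO, STOCK_GPU_MHZ)
gpu_speedup = new_gpu_clock / DROID_GPU_MHZ
print(f"GPU clock: {new_gpu_clock:.0f} MHz, speedup: {gpu_speedup:.2f}x")
```

With these inputs the model lands on 200 MHz, which works out to roughly the "almost twice as fast" result seen in the graphics benchmarks.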
So what does this have to do with CPU performance? Well, it’s possible that the CPU was also being limited by LPDDR1 memory, and that the 65 nm Snapdragons that are also tied down to LPDDR1 memory share the same problem. The faster LPDDR2 memory could allow for much faster performance.
Lastly, since we know from the second article at the top that the Galaxy S performs so well with its GPU, why is it lacking in CPU performance, only barely edging past the 1 GHz Snapdragon?
It could be that the answer lies in the secret that Samsung is using to achieve those ridiculously fast GPU speeds. Even with LPDDR2 memory, I can’t see any way that the GPU could achieve 90 Mtps; the required memory bandwidth is too high. One possibility is the addition of a dedicated high-speed GPU memory cache, allowing the GPU access to memory tailored to handle its high-bandwidth needs. With this solution to memory bandwidth issues, Samsung may have decided that higher speed memory was unnecessary, and stuck with a slower solution that remains limited in the same manner as the current-gen Snapdragon.
Let’s recap: TI probably dealt with the limitations to its GPU by dropping in higher speed system RAM, thus boosting overall system bandwidth enough to nearly double GPU and CPU performance together.
Samsung may have dealt with limitations to the GPU by adding dedicated video memory that boosted GPU performance several times, while leaving CPU performance unaffected.
This, I think, is the best explanation for what I’ve seen so far. It’s very possible that I’m entirely wrong and something else is at play here, but that’s what I’ve got.