NVIDIA’s DirectX 11 Architecture: GF100 (Fermi) In Detail
Article written by Mark Poppin and BFG10K, AlienBabelTech Senior Editors.
Introduction
At its GPU Technology Conference (GTC) last September 30th, NVIDIA announced its next-generation graphics architecture, codenamed Fermi. We reported on it in a three-part series. At the GTC, graphics performance was not the focus of Tesla Fermi. Rather, the conference emphasized NVIDIA’s new architecture as a revolutionary general-purpose processor that exploits the Fermi GPU’s fast parallel processing far better than the current architecture does. NVIDIA’s goal is to dominate the professional market with its Tesla GPUs. Now that Fermi GF100 GPUs for NVIDIA’s new video cards are finally in mass production, we will be looking at how NVIDIA intends to dominate gaming.

Fermi Production Mock-up

Fermi GPU
To summarize the new architecture, Fermi boasts a brand new shader core whose compute clusters each comprise a single streaming multiprocessor (SM). Each stream processor has a fully pipelined integer arithmetic logic unit (ALU) and floating point unit (FPU). Each SM can issue two independent instructions per clock to two different warps. Each instruction is run by a 16-way SIMD block that handles single-precision fused multiply-add (FMA) instructions. The Fermi memory hierarchy is also new, sporting a unified L2 cache that serves all of the SMs without partitions. In addition, a new unified memory space allows each SM to communicate not only with its own local registers and shared memory, but also with the L2 cache and beyond.
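To make the FMA point concrete, here is a minimal sketch of why a fused multiply-add differs from a separate multiply followed by an add: the fused form rounds once, at the end. This is an illustration in Python using double precision and exact rational arithmetic, not GF100 code; the helper names `fused_multiply_add` and `separate_multiply_add` are our own and hypothetical.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a*b + c with one rounding at the very end (the FMA behavior)."""
    # Fraction holds each float exactly, so the product and sum are exact.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    # Converting back to float performs the single, final rounding.
    return float(exact)


def separate_multiply_add(a: float, b: float, c: float) -> float:
    """Ordinary multiply then add: the product is rounded before the add."""
    return a * b + c


# Inputs chosen so that the exact product, 1 - 2**-60, cannot be stored
# in a double and gets rounded to 1.0 in the unfused version.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print("separate:", separate_multiply_add(a, b, c))
print("fused:   ", fused_multiply_add(a, b, c))
```

With these inputs the separate version loses the small term entirely, while the fused version keeps it. The same effect applies at single precision, only with larger error terms.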
The GF100 features a 768KB unified level-two (L2) cache as well as a rather complex cache hierarchy. In addition, many other areas of GPU-compute performance are improved over GT200, NVIDIA’s current Tesla architecture GPU. The GF100 hardware can sustain peak single-precision (SP) and double-precision (DP) FMA instruction throughput. Atomic instruction throughput is improved over the current generation, and Fermi supports error-correcting code (ECC) memory protection, which is absolutely necessary for GPU computing. This all comes together to support a new type of multi-threading technology which improves the efficiency of the 512 cores working together. The entire Fermi family is compatible with the DirectX 11, OpenGL 3.x and OpenCL 1.x application programming interfaces (APIs). The new chips are finally in mass production on TSMC’s 40nm process technology.
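The figures quoted above fit together with some simple arithmetic, sketched below. The sketch assumes that every core retires one FMA per clock and that an FMA counts as two floating-point operations; both are our assumptions for illustration. NVIDIA has not published clocks, so `shader_clock_ghz` is a hypothetical parameter and the 1.0 GHz value used at the end is an arbitrary placeholder, not a specification.

```python
SIMD_WIDTH = 16        # lanes per SIMD block (from the article)
ISSUES_PER_CLOCK = 2   # instructions issued per SM per clock (from the article)
TOTAL_CORES = 512      # cores on a full GF100 (from the article)
FLOPS_PER_FMA = 2      # assumption: one multiply plus one add


def lanes_per_sm() -> int:
    """Lanes kept busy in one SM when both issue slots are filled."""
    return SIMD_WIDTH * ISSUES_PER_CLOCK


def sm_count() -> int:
    """Number of SMs implied by the total core count."""
    return TOTAL_CORES // lanes_per_sm()


def peak_sp_gflops(shader_clock_ghz: float) -> float:
    """Theoretical single-precision peak for a given (hypothetical) clock."""
    return TOTAL_CORES * FLOPS_PER_FMA * shader_clock_ghz


print("lanes per SM:", lanes_per_sm())
print("SM count:", sm_count())
# Placeholder clock only; real clocks were not disclosed.
print("peak SP GFLOPS at 1.0 GHz:", peak_sp_gflops(1.0))
```

Under these assumptions, two 16-wide issues per clock give 32 lanes per SM, and 512 cores imply 16 SMs. The peak figure scales linearly with whatever shader clock the shipping cards end up using.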
Let’s go ahead and see what is new and improved with GF100.
ati status
[told] x
benchmarks? none?
Benchmarks in a review of brand new GPU architecture!?!
– when have you seen that before?
We expect to have benchmarks vs. GTX 285 and vs. Radeon when we get the actual cards.
I noticed that you mentioned “time check”. These comments must be approved manually; sorry for any delay.
I thought you guys were gonna post something substantial, not this rehash. NDA my ass…
This is what they gave us TTimmy. It is not like other sites got anything different and we got garbage.
However, I do understand that this is not what you all wanted to see. We also wanted to see some more definitive information such as benchmark numbers, clock speeds, release date and pricing but…ABT isn’t releasing a video card (yet), they are.
I wouldn’t call what we posted a “rehash”. This kind of progressive revelation always happens with a new architecture, no matter who releases it: AMD, NVIDIA or Intel. First you get the general information about the upcoming architecture, then more and more information is released until it goes into production.
Much of the information about Fermi’s GeForce in our article is brand new information about Fermi’s gaming capabilities. Much of what we wrote about was not disclosed anywhere previously. There is a lot more to add since we wrote about Fermi’s computing architecture last year.
As I understand it, only the devs would have engineering samples of GF100. That means NVIDIA’s partners would not have them nor would any tech review site. There are no fixed clocks, nothing about power consumption nor thermals – and certainly no solid performance benchmarks other than what NVIDIA did internally. Not yet.
I hope Nvidia release a decent mid range $300 Fermi GPU.