NVIDIA’s DirectX 11 Architecture: GF100 (Fermi) In Detail
Tessellation and Displacement Mapping
It takes DX11 to take full advantage of geometry. DX9 and DX10 are unable to create generalized geometry on the GPU. We will therefore see tessellation and displacement mapping used together to create more realism in games. The ability to control the geometric level of detail (LOD) is very important. Because tessellation happens on demand and the data is all kept on-chip, precious memory bandwidth is preserved. Also, because one model can produce many LODs, the same game assets can be used on a variety of platforms, which makes game developers very happy. How a character appears in the scene can also be adjusted easily: if it is small on screen it gets little geometry, and if it is close to the viewer it is rendered with greater detail.
As an additional benefit, developers may be able to use the same models across many generations of games and on future GPUs, where performance increases will allow even greater detail than was possible when the game was first released. Complexity can even be adjusted dynamically to target a given frame rate!
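To make the LOD idea concrete, here is a minimal sketch of how a game might pick a tessellation factor from camera distance and then scale overall detail toward a frame-time target. This is our own illustration, not NVIDIA's or any engine's actual method: the function names, the linear falloff, and the `near`, `far` and `step` values are all assumptions. The only hard number is the Direct3D 11 limit of 64 on tessellation factors.

```python
D3D11_MAX_TESS_FACTOR = 64.0  # upper limit on tessellation factors in Direct3D 11


def tess_factor_for_distance(distance, near=1.0, far=100.0,
                             max_factor=D3D11_MAX_TESS_FACTOR):
    """Map camera distance to a tessellation factor (hypothetical heuristic)."""
    # Clamp the distance into [near, far], then interpolate linearly:
    # the nearest objects get max_factor, the most distant get 1.0
    # (no extra subdivision).
    d = min(max(distance, near), far)
    t = (far - d) / (far - near)
    return 1.0 + t * (max_factor - 1.0)


def adjust_detail_scale(scale, frame_ms, target_ms, step=0.05,
                        lo=0.1, hi=1.0):
    """Nudge a global detail multiplier toward a target frame time."""
    # Too slow: drop detail a little. Faster than needed: add some back.
    if frame_ms > target_ms:
        scale -= step
    elif frame_ms < target_ms:
        scale += step
    # Keep the multiplier inside its allowed range.
    return min(max(scale, lo), hi)
```

In use, a renderer would multiply the per-object factor by the current detail scale each frame, so a distant character stays cheap while the whole scene backs off if the frame rate dips.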
Here in NVIDIA’s slide from the Unigine engine demo, we see tessellation compared, on and off. There is no comparison; tessellation adds to realism.
Take a look at the third image that we presented earlier in this article. The use of tessellation fundamentally changes the GPU’s graphics workload balance. With tessellation, the triangle density of a given frame can increase by multiple orders of magnitude, which strains serial resources such as the setup and rasterization units. To facilitate high triangle rates, NVIDIA designed a scalable geometry engine called the PolyMorph Engine. Each of GF100’s 16 PolyMorph Engines has its own dedicated vertex fetch unit and a tessellator, which expands geometry performance.
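To see why triangle counts climb so quickly, here is a simplified model of our own. It splits each triangle into four at its edge midpoints and then pushes the new vertices along a normal by a height value, standing in for a displacement map lookup. This is not how the DX11 tessellator actually partitions a patch, and the function names are hypothetical; it only shows the growth rate and how the two techniques combine.

```python
def subdivide(tri):
    """Split one triangle into four by connecting its edge midpoints."""
    a, b, c = tri

    def mid(p, q):
        return tuple((x + y) / 2.0 for x, y in zip(p, q))

    ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def tessellate(tris, levels):
    """Apply midpoint subdivision `levels` times; the count grows as 4**levels."""
    for _ in range(levels):
        tris = [small for t in tris for small in subdivide(t)]
    return tris


def displace(tris, height, normal=(0.0, 0.0, 1.0)):
    """Move every vertex along `normal` by height(x, y).

    `height` is a stand-in for sampling a displacement map.
    """
    def move(v):
        h = height(v[0], v[1])
        return tuple(c + h * n for c, n in zip(v, normal))

    return [tuple(move(v) for v in t) for t in tris]
```

Under this model, five levels turn one coarse triangle into 4**5 = 1024 small ones, which is the kind of amplification that puts pressure on setup and rasterization.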
In conjunction with the PolyMorph Engine, NVIDIA designed four parallel Raster Engines, which allow up to four triangles to be set up per clock. The PolyMorph Engine works in five stages, and the results calculated in each stage are passed to an SM. The SM executes the game’s shader and returns the results to the next stage in the PolyMorph Engine. After all stages are complete, the results are forwarded to one of the four Raster Engines.
The Rasterizer takes the edge equations for each primitive and computes pixel coverage. If antialiasing is enabled, coverage is performed for each multisample and coverage sample. Each Rasterizer outputs eight pixels per clock for a total of 32 rasterized pixels per clock across the chip. Pixels produced by the rasterizer are sent to the Z-cull unit. By having a dedicated tessellator for each SM, and a Raster Engine for each GPC, GF100 delivers up to 8 times the geometry performance of GT200. NVIDIA also compares the geometry performance of GF100 to HD 5870 and finds Fermi is significantly faster.
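The per-clock figures above follow from simple multiplication, sketched here with only the numbers quoted in this article. The one-triangle-per-engine value is our inference from "four triangles per clock" across four Raster Engines. No clock speeds have been disclosed, so nothing here is converted to per-second rates.

```python
# Figures quoted in the article for GF100.
RASTER_ENGINES = 4              # one per GPC
POLYMORPH_ENGINES = 16          # one per SM
PIXELS_PER_ENGINE_CLOCK = 8     # rasterized pixels per Raster Engine per clock
TRIANGLES_PER_ENGINE_CLOCK = 1  # inferred: four triangles per clock over four engines


def chip_totals():
    """Return chip-wide per-clock totals derived from the per-unit figures."""
    return {
        "pixels_per_clock": RASTER_ENGINES * PIXELS_PER_ENGINE_CLOCK,
        "triangles_per_clock": RASTER_ENGINES * TRIANGLES_PER_ENGINE_CLOCK,
        "sms_per_gpc": POLYMORPH_ENGINES // RASTER_ENGINES,
    }
```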
Here is a performance comparison between GF100 and HD 5870 using a 60 second run with the Unigine engine:
benchmarks? none?
Benchmarks in a review of brand new GPU architecture!?!
– when have you seen that before?
We expect to have benchmarks vs. GTX 285 and vs. Radeon when we get the actual cards.
I noticed that you mentioned “time check”. These comments must be approved manually; sorry for any delay.
I thought you guys were gonna post something substantial, not this rehash. NDA my ass…
This is what they gave us TTimmy. It is not like other sites got anything different and we got garbage.
However, I do understand that this is not what you all wanted to see. We also wanted to see some more definitive information such as benchmark numbers, clock speeds, release date and pricing but…ABT isn’t releasing a video card (yet), they are.
I wouldn’t call what we posted a “rehash”. What has happened with this progressive revelation always happens with a new architecture, no matter who releases it: AMD, NVIDIA or Intel. First you get the general information about the upcoming architecture, then more and more information is released until it goes into production.
Much of the information about Fermi’s GeForce in our article is brand new information about Fermi’s gaming capabilities. Much of what we wrote about was not disclosed anywhere previously. There is a lot more to add since we wrote about Fermi’s computing architecture last year.
As I understand it, only the devs would have engineering samples of GF100. That means NVIDIA’s partners would not have them nor would any tech review site. There are no fixed clocks, nothing about power consumption nor thermals – and certainly no solid performance benchmarks other than what NVIDIA did internally. Not yet.
I hope Nvidia release a decent mid range $300 Fermi GPU.