, . , ,
— .
Vertex pipeline — .
"
design", :
…
Whereas the GeForce4 had two parallel vertex shader units, the GeForce FX has a single vertex shader pipeline that has a massively parallel array of floating point processors
…
www.anandtech.com/show/1034/3
, , (
NV35) , , . . .,
NV35 , : « », , . , :)
NV35 —
3. , !
…
The GeForce FX Series runs vertex shaders in an array
…
en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units#GeForce_FX_.285xxx.29_Series
Pixel pipeline. Vertex pipeline NVIDIA, . NVIDIA ,
NV30 8 ,
TMU ( ). "8x1 design"
, . , : DX9- …
, ,
«» NV30/NV35 4x2 (4 2 TMU ). NVIDIA :
…
«Geforce FX 5800 and 5800 Ultra run at 8 Pixels per clock for all of the following:
a) z-rendering
b) Stencil operations
c) Texture operations
d) shader operations
For most advanced applications (such as Doom3) most of the time is spent in these modes because of the advanced shadowing techniques that use shadow buffers, stencil testing and next generation shaders that are longer and therefore make apps “shading bound” rather than “color fillrate bound”. Only Z+color rendering is calculated at 4 pixels per clock, all other modes (z, stencil, texture, shading) run at 8 pixels per clock. The more advanced the application the less percentage of the total rendering is color, because more time is spent texturing, shading and doing advanced shadowing/lighting»
…
www.beyond3d.com/content/reviews/10/5
, , :
…
Beyond3D:
We've seen the official response concerning the pipeline arrangement, and to some extent it would seem that you are attempting to redefine how 'fill-rate' is classified. For instance, you are saying that Z and Stencils operate at 8 per cycle, however both of these are not colour values rendered to the frame buffer (which is how we would normally calculate fill-rate), but are off screen samples that merely contribute to the generation of the final image — if we are to start calculating these as 'pixels' it potentially opens the floodgates to all kinds of samples that could be classed as pure 'fill-rate', such as FSAA samples, which will end up in a whole confusing mess of numbers. Even though we are moving into a more programmable age, don't we still need to stick to some basic fundamental specifications?
Tony Tamasi:
No, we need to make sure that the definitions/specifications that we do use to describe these architectures reflect the capabilities of the architecture as accurately as possible.
Using antiquated definitions to describe modern architectures results in inaccuracies and causes people to make bad conclusions. This issue is amplified for you as a journalist, because you will communicate your conclusion to your readership. This is an opportunity for you to educate your readers on the new metrics for evaluating the latest technologies.
Let's step through some math. At 1600x1200 resolution, there are 2 million pixels on the screen. If we have a 4ppc GPU running at 500MHz, our «fill rate» is 2.0Gp/sec. So, our GPU could draw the screen 1000 times per second if depth complexity is zero (2.0G divided by 2.0M). That is clearly absurd. Nobody wants a simple application that runs at 1000 frames per second (fps.) What they do want is fancier programs that run at 30-100 fps.
So, modern applications render the Z buffer first. Then they render the scene to various 'textures' such as depth maps, shadow maps, stencil buffers, and more. These various maps are heavily biased toward Z and stencil rendering. Then the application does the final rendering pass on the visible pixels only. In fact, these pixels are rendered at a rate that is well below the 'peak' fill rate of the GPU because lots of textures and shading programs are used. In many cases, the final rendering is performed at an average throughput of 1 pixel per clock or less because sophisticated shading algorithms are used. One great example is the paint shader for NVIDIA's Time Machine demo. That shader uses up to 14 textures per pixel.
And, I want to emphasize that what end users care most about is not pixels per clock, but actual game performance. The NV30 GPU is the world's fastest GPU. It delivers better game performance across the board than any other GPU. Tom's Hardware declared «NVIDIA takes the crown» and HardOCP observed that NV30 outpaces the competition across a variety of applications and display modes.
…
www.beyond3d.com/content/reviews/10/24
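The fill-rate arithmetic in the answer quoted above can be checked with a few lines of Python. This is only a sketch: the figures (1600x1200, 4 pixels per clock, 500 MHz) come from the quote, while the function names are made up for illustration.

```python
# Figures are taken from the quoted answer; the helper names are hypothetical.

def peak_fill_rate(pixels_per_clock: int, clock_hz: float) -> float:
    """Peak colour fill rate in pixels per second."""
    return pixels_per_clock * clock_hz


def full_screen_redraws_per_second(fill_rate: float, width: int, height: int) -> float:
    """How many times per second the whole screen could be filled once."""
    return fill_rate / (width * height)


screen_pixels = 1600 * 1200            # 1,920,000; the quote rounds this to 2 million
fill_rate = peak_fill_rate(4, 500e6)   # 4 ppc at 500 MHz
redraws = full_screen_redraws_per_second(fill_rate, 1600, 1200)

print(f"{screen_pixels} pixels, {fill_rate / 1e9:.1f} Gp/s, {redraws:.0f} redraws/s")
```

With the exact pixel count the result is a little above the round "1000 times per second" in the quote, which uses 2 million pixels as an approximation.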
?…
, NVIDIA DX9- , .
NV3x :
- NV3x DX9, … !
- NV3x 500!
- NV3x DDR2!
, DX9 500 , 125 ( ) . - : . 0.13 ( ), . — .
, , ? DDR2 , Samsung : ! DDR2, : 128 . ?
, ,
NV3x 8 , TMU (8x1). . , , , . , = 32 .
8 , 8- 32- (=256 bit)
. , 256 . 128- , , 8- 32-
. fps.
? :
4 , 2 TMU (4x2). , , - :
…
Beyond3D:
Our testing concludes that the pipeline arrangement of NV30, certainly for texturing operations, is similar to that of NV25, with two texture units per pipeline — this can even be shown when calculating odd numbers of textures in that they have the same performance drop as even numbers of textures. I also attended the 'Dawn-Till-Dusk' developer event in London and sat in on a number of the presentations in which developers were informed that the second texture comes for free (again, indicating 2 texture units) and that ddx, ddy works by just looking at the values in their neighbouring pixels' shaders as this is a 2x2 pipeline configuration, which it is unlikely to be if it was a true 8 pipe design (unless it operated as two 2x2 pipelines!!). In what circumstances, if any, can it operate beyond a 4 pipe x 2 textures configuration, bearing in mind that Z and stencils do not require texture sampling (in this instance it's 8x0!)?
Tony Tamasi:
Not all pixels are textured, so it is inaccurate to say that fill rate requires texturing.
For Z+stencil rendering, NV30 is 8 pixels per clock. This is in fact performed as two 2x2 areas as you mention above.
For texturing, NV30 can have 16 active textures and apply 8 textures per clock to the active pixels. If an object has 4 textures applied to it, then NV30 will render it at 4 pixels per 2 clocks because it takes 2 clock cycles to apply 4 textures to a single pixel.
…
www.beyond3d.com/content/reviews/10/24
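The difference between the two layouts discussed in this exchange can be illustrated with a toy throughput model. It is a simplification and not a description of the real hardware: it assumes every pipeline needs ceil(textures / TMUs per pipeline) clocks per pixel and ignores bandwidth, shading cost and everything else. The function name and the model itself are assumptions made for this sketch.

```python
from fractions import Fraction
from math import ceil


def pixels_per_clock(pipes: int, tmus_per_pipe: int, textures_per_pixel: int) -> Fraction:
    """Idealised textured fill rate for a pipes x TMUs layout (toy model)."""
    # A pipeline applies at most `tmus_per_pipe` textures per clock,
    # and an untextured pixel still takes one clock in this model.
    clocks_per_pixel = max(1, ceil(textures_per_pixel / tmus_per_pipe))
    return Fraction(pipes, clocks_per_pixel)


for textures in range(1, 5):
    as_4x2 = pixels_per_clock(4, 2, textures)
    as_8x1 = pixels_per_clock(8, 1, textures)
    print(f"{textures} textures: 4x2 -> {as_4x2} px/clk, 8x1 -> {as_8x1} px/clk")
```

In this model a 4x2 layout gives 2 pixels per clock for both 3 and 4 textures, which matches both the "4 pixels per 2 clocks" figure in the answer and the Beyond3D remark that odd texture counts cost as much as even ones. A hypothetical 8x1 layout would differ at 1 and 3 textures.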
NVIDIA , ! , . , .
( ) FX, "
— !" (Happy Gilmore), "… , !" :)