nvfx: rewrite draw code and buffer code
This is a full rewrite of the drawing and buffer management logic.
It offers a lot of improvements:
1. A copy of buffers is now always kept in system memory. This is
necessary to allow software processing of them, which is necessary
or improves performance in many cases.
2. Support for pushing vertices on the FIFO, with index lookup if necessary.
3. "Smart" draw code that tries to intelligently choose the cheapest
way to draw something: whether to use inline vertices or hardware
vertex buffer, and whether to use hardware index buffers
4. Support for all vertex formats supported by the hardware
5. Usage of translate to push vertices, supporting all formats that are
sensible to use as vertex formats
6. Support for base vertex
7. Usage of Ben Skeggs' primitive splitter originally for nv50, allowing
correct splitting of line loops, triangle fans, etc.
8. Support for instancing
9. Precomputation using the vertex elements CSO
Thanks to Ben Skeggs for his primitive splitter originally for nv50.
Thanks to Christoph Bumiller for his nv50 push code, that was the basis
of this work, even though I changed his code dramatically, in particular
to replace his ad-hoc vertex data emitter with translate.
The changes could also go into nv50 too, but there are substantial
differences due to the additional nv50 hardware features.
21 files changed: