nvfx: delay allocation of buffers in GART/VRAM to validation time

Currently we allocate buffers in GART or VRAM at creation time.
However, when using swtnl, this results in reads from uncached
memory, which drastically impairs performance.
So, for now, make nouveau_screen.c pass no placement flags to
buffer creation, so that the buffers are moved at validation time.
Previously libdrm itself did this, but it was changed to no longer do so.
This may introduce an extra copy in normal usage, but so far that
does not seem to cause significant performance degradation.
This will be revisited when pipebuffer is integrated.
Note that for AGP systems, properly solving this may be complex
since currently there is no fast way of reading from GART/VRAM.
We will probably need to try mapping AGP as writethrough and, in
addition, make buffer creation more aware of future buffer usage.
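
As a rough illustration of the idea only (not the actual nvfx or libdrm
code), deferring placement could look like the sketch below. All names
here (sketch_bo, PLACE_*, sketch_bo_create, sketch_bo_validate) are
hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical placement flags; these are NOT the real libdrm definitions. */
enum {
    PLACE_NONE = 0,        /* plain cached system memory */
    PLACE_VRAM = 1 << 0,
    PLACE_GART = 1 << 1
};

/* Hypothetical buffer object, for illustration only. */
struct sketch_bo {
    uint32_t placement;    /* domain the buffer currently lives in */
    unsigned moves;        /* number of migrations performed so far */
};

/* Creation passes no placement flags, so CPU reads (e.g. by swtnl)
 * hit cached memory instead of uncached GART/VRAM. */
static void sketch_bo_create(struct sketch_bo *bo)
{
    bo->placement = PLACE_NONE;
    bo->moves = 0;
}

/* Validation: migrate the buffer only if it is not already in one of
 * the domains the GPU is allowed to use it from. This is where the
 * possible extra copy happens. */
static void sketch_bo_validate(struct sketch_bo *bo, uint32_t allowed)
{
    if (bo->placement & allowed)
        return;            /* already usable, nothing to move */

    /* Assumed policy for this sketch: prefer VRAM, else fall back to GART. */
    bo->placement = (allowed & PLACE_VRAM) ? PLACE_VRAM : PLACE_GART;
    bo->moves++;
}
```

In this sketch the copy cost is paid once, at the first validation;
later validations against the same domain do not move the buffer again.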