nvfx: rework state_fb code to get rid of render temps
This commit rewrites a lot of the state_fb code to support
rendering to targets not aligned to 64 byte.
This allows us to drop the render temporaries as unaligned
targets are the only use-case where they are really needed. The
temporaries code was used for a lot of things more, but apparently
those also work without temps.
There is one regression in piglit fbo-clear-formats, but this will
be fixed with the use of real hardware clears and doesn't matter in
practice as no real application tries to scissor clear a 2x2 pixel
render target.
Signed-off-by: Lucas Stach <dev@lynxeye.de>