nvfx: use dynamically sized rotating BO pool for fragment programs
Currently we use a single buffer for each fragment program, leading to
rendering synchronization. This patch uses a doubly linked list of BOs,
which is dynamically resized if all the BOs are busy.
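
As a rough illustration of the rotation idea only, here is a minimal,
self-contained sketch. It is not the code in this patch: struct fake_bo,
its busy flag, and the pool_* helpers are hypothetical stand-ins, and the
real driver would query the BO's status through nouveau_bo instead. The
sketch also assumes that if the next BO in rotation (the oldest one handed
out) is still busy, then all of them are, and grows the ring in that case.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPU buffer object; NOT the nouveau_bo API. */
struct fake_bo {
    bool busy; /* a real driver would ask the kernel whether the GPU is done */
};

/* One entry of the circular doubly linked list. */
struct pool_node {
    struct pool_node *prev, *next;
    struct fake_bo bo;
};

struct bo_pool {
    struct pool_node *cur; /* node handed out most recently */
    unsigned count;        /* number of nodes in the ring */
};

/* Allocate a node that forms a ring of one (links point to itself). */
static struct pool_node *node_new(void)
{
    struct pool_node *n = calloc(1, sizeof(*n));
    if (n)
        n->prev = n->next = n;
    return n;
}

static bool pool_init(struct bo_pool *p)
{
    p->cur = node_new();
    p->count = p->cur ? 1 : 0;
    return p->cur != NULL;
}

/*
 * Rotate to the next BO. If it is still busy, insert a fresh node
 * right after the current one instead of waiting for the GPU.
 */
static struct fake_bo *pool_acquire(struct bo_pool *p)
{
    struct pool_node *cand = p->cur->next;

    if (cand->bo.busy) {
        struct pool_node *n = node_new();
        if (!n)
            return NULL;
        n->prev = p->cur;
        n->next = cand;
        p->cur->next = n;
        cand->prev = n;
        p->count++;
        cand = n;
    }

    p->cur = cand;
    cand->bo.busy = true; /* caller is about to submit work using this BO */
    return &cand->bo;
}

/* Free every node in the ring. */
static void pool_destroy(struct bo_pool *p)
{
    if (!p->cur)
        return;
    struct pool_node *n = p->cur->next;
    while (n != p->cur) {
        struct pool_node *next = n->next;
        free(n);
        n = next;
    }
    free(p->cur);
    p->cur = NULL;
    p->count = 0;
}
```

In this sketch, two back-to-back acquires grow the ring from one node to
two, and once the first BO is marked idle again a third acquire reuses it
rather than allocating.
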
Note that inline image transfers could be an alternative; this will be
explored later.
This removes one of the big performance limitations of the current
driver.
We also stop using pipe_resource internally in favor of using nouveau_bo
directly.