i965/ff_gs: Generate URB writes using a loop.
Previously we only ever did 1 URB write, since the maximum number of
varyings we support is small enough to fit in 1 URB write (when using
BRW_URB_SWIZZLE_NONE, which is what the pre-Gen7 GS always uses). But
we're about to increase the number of varying components we support
from 64 to 128.
With 128 varying components, we will need at most 2 URB writes, but
it's just as easy to write a general-purpose loop.
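As a rough illustration of the loop's shape (not the code in this patch),
the sketch below splits a register count into as many writes as needed and
flags only the last one as complete. The names struct urb_write and
plan_urb_writes, and the per-write limit passed by the caller, are
hypothetical and chosen for the example; they are not Mesa identifiers or
hardware constants.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one URB write; not a Mesa type. */
struct urb_write {
   int offset;     /* first register covered by this write */
   int length;     /* number of registers in this write */
   bool complete;  /* true only for the final write */
};

/* Split total_regs registers into writes of at most max_regs_per_write.
 * Stores up to max_writes entries in out and returns the number of
 * writes needed. The limit is a caller-supplied assumption. */
static int
plan_urb_writes(int total_regs, int max_regs_per_write,
                struct urb_write *out, int max_writes)
{
   assert(total_regs > 0 && max_regs_per_write > 0);

   int count = 0;
   int offset = 0;
   bool complete = false;

   do {
      int remaining = total_regs - offset;
      int length = remaining < max_regs_per_write ? remaining
                                                  : max_regs_per_write;

      /* The write that consumes everything left is the last one. */
      complete = (length == remaining);

      if (count < max_writes)
         out[count] = (struct urb_write){ offset, length, complete };

      count++;
      offset += length;
   } while (!complete);

   return count;
}
```

With an illustrative limit of 14 registers per write, 10 registers fit in a
single write, while 20 registers produce two writes (14, then 6), and only
the second is marked complete. The same loop handles any larger count
without special cases.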
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>