intel/fs: Allow limited copy propagation of a LOAD_PAYLOAD into another.
This is particularly useful in cases where register coalaesce is
unlikely to succeed because the LOAD_PAYLOAD isn't a plain copy --
E.g. when a LOAD_PAYLOAD is shuffling the contents of a barycentric
vector in order to transform it into the PLN layout.
This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series. On SKL:
total instructions in shared programs:
18596672 ->
18976097 (2.04%)
instructions in affected programs:
7937041 ->
8316466 (4.78%)
helped: 39
HURT: 67427
LOST: 466
GAINED: 220
On SNB:
total instructions in shared programs:
13993866 ->
14202963 (1.49%)
instructions in affected programs:
7611309 ->
7820406 (2.75%)
helped: 624
HURT: 52943
LOST: 6
GAINED: 18
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>