i965/vec4: Relax writemask condition in CSE
If the previously seen instruction writes more fields than the new
instruction, still allow CSE to happen. This doesn't do much on its own,
but it lets the next patch help a couple more shaders. It helped quite a
bit in another change series that I have (at least for now) abandoned.
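As a rough illustration of the relaxed condition, here is a standalone
sketch. It is not the code in this patch: the helper names and the 4-bit
x/y/z/w mask encoding are assumptions made for the example. The idea is
that the earlier instruction only has to write every channel the new
instruction writes, rather than exactly the same set.

```cpp
#include <cstdint>

// Hypothetical writemask encoding: one bit per channel.
constexpr std::uint8_t MASK_X = 1u << 0;
constexpr std::uint8_t MASK_Y = 1u << 1;
constexpr std::uint8_t MASK_Z = 1u << 2;
constexpr std::uint8_t MASK_W = 1u << 3;

// Strict form (assumed shape of the old check): both instructions
// must write exactly the same channels.
constexpr bool writemask_matches_strict(std::uint8_t prev, std::uint8_t next)
{
    return prev == next;
}

// Relaxed form: the previously seen instruction may write extra
// channels, as long as it covers every channel the new one writes.
constexpr bool writemask_matches_relaxed(std::uint8_t prev, std::uint8_t next)
{
    return (prev & next) == next;
}
```

With this sketch, a previous instruction writing .xyz can stand in for a
new one writing .xy, but not the other way around.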
v2: Add some extra commentary about the parameters to instructions_match.
Suggested by Ken.
No changes on Skylake, Broadwell, Iron Lake or GM45.
Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11780295 -> 11780294 (<.01%)
instructions in affected programs: 302 -> 301 (-0.33%)
helped: 1
HURT: 0
total cycles in shared programs: 257308315 -> 257308313 (<.01%)
cycles in affected programs: 2074 -> 2072 (-0.10%)
helped: 1
HURT: 0
Sandy Bridge
total instructions in shared programs: 10506687 -> 10506686 (<.01%)
instructions in affected programs: 335 -> 334 (-0.30%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>