i965/vec4/dce: Don't narrow the write mask if the flags are used
author Ian Romanick <ian.d.romanick@intel.com>
Thu, 21 Jun 2018 00:18:30 +0000 (17:18 -0700)
committer Ian Romanick <ian.d.romanick@intel.com>
Mon, 17 Dec 2018 21:47:06 +0000 (13:47 -0800)
commit 440c051340669e809511c05370d6d703c70f6d0e
tree a469e7857cc69b8aba42cdf420fb76b068b87f4a
parent 111bcc8d028b5d71aacdd080671578b665a9f4ed

In an instruction sequence like

            cmp(8).ge.f0.0 vgrf17:D, vgrf2.xxxx:D, vgrf9.xxxx:D
    (+f0.0) sel(8) vgrf1:UD, vgrf8.xyzw:UD, vgrf1.xyzw:UD

the other fields of vgrf17 may be unused, but the CMP still needs to
generate the other flag bits, because the predicated SEL reads them.
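
The rule can be illustrated with a toy model. This is only a sketch under
stated assumptions, not the code in brw_vec4_dead_code_eliminate.cpp:
ToyInst, narrowed_writemask, and check_commit_example are hypothetical names
invented for illustration, and liveness is reduced to a channel bitmask plus
a single boolean for the flag register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for a vec4 instruction.
// This is NOT Mesa's vec4_instruction.
struct ToyInst {
    uint8_t writemask;  // bit i set => destination channel i (x, y, z, w) is written
    bool writes_flag;   // true if the instruction has a conditional modifier
};

// Returns the write mask that a DCE pass could safely leave on the instruction.
//   live_channels: destination channels that some later instruction reads
//   flag_live:     true if the flag result is read before being overwritten
constexpr uint8_t narrowed_writemask(const ToyInst &inst,
                                     uint8_t live_channels,
                                     bool flag_live)
{
    // Each enabled channel also produces one flag bit. If the flag is still
    // needed, dropping a channel would drop its flag bit too, so the mask
    // must be left alone.
    if (inst.writes_flag && flag_live)
        return inst.writemask;

    // Otherwise only the channels that are actually read need to be written.
    return inst.writemask & live_channels;
}

// Mirrors the CMP/SEL sequence above: the CMP writes all four channels and
// the flag; assume only .x of vgrf17 is read afterwards.
inline void check_commit_example()
{
    const ToyInst cmp{0xF, true};

    // The predicated SEL reads the flag, so the mask stays xyzw.
    assert(narrowed_writemask(cmp, 0x1, true) == 0xF);

    // Without a flag reader, the mask may be narrowed to .x.
    assert(narrowed_writemask(cmp, 0x1, false) == 0x1);
}
```

In this model, narrowing happens only when the flag result is dead or the
instruction does not write the flag at all.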

To my surprise, nothing in shader-db or any test suite appears to hit
this.  However, I have a change to brw_vec4_cmod_propagation that
creates cases where this can happen.  This fix prevents a couple dozen
regressions in that patch.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 5df88c20 ("i965/vec4: Rewrite dead code elimination to use live in/out.")
src/intel/Makefile.compiler.am
src/intel/compiler/brw_vec4_dead_code_eliminate.cpp
src/intel/compiler/meson.build
src/intel/compiler/test_vec4_dead_code_eliminate.cpp [new file with mode: 0644]