intel/compiler: use correct swizzle for replacement
author Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Wed, 27 Feb 2019 15:53:21 +0000 (15:53 +0000)
committer Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Wed, 27 Feb 2019 20:06:42 +0000 (20:06 +0000)
The optimization in 4cd1a0be76883c introduced a replacement of:

cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D

by:

cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D

The rewritten cmp instruction stores its result in x, while the added
mov sources from y. We need to take into account where the replacement
of the scan_inst destination is going to store its result, so that the
replacement mov can source from the correct location.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 4cd1a0be76883c ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
src/intel/compiler/brw_vec4_cmod_propagation.cpp

index 760327d559d62fc2c8e7d4a39c9b666f8d60e430..a7a3bb8fb062ac239888bd1fb69adf9a07f0e317 100644
@@ -173,19 +173,19 @@ opt_cmod_propagation_local(bblock_t *block, vec4_visitor *v)
 
                   /* Given a sequence like:
                    *
-                   *    cmp.ge.f0(8)  g21<1>.xF      g20<4>.xF      g18<4>.xF
+                   *    cmp.ge.f0(8)  g21<1>.zF      g20<4>.xF      g18<4>.xF
                    *    ...
-                   *    cmp.nz.f0(8)  null<1>D       g21<4>.xD      0D
+                   *    cmp.nz.f0(8)  null<1>D       g21<4>.zD      0D
                    *
                    * Replace it with something like:
                    *
-                   *    cmp.ge.f0(8)  g22<1>F        g20<4>.xF      g18<4>.xF
-                   *    mov(8)        g21<1>.xF      g22<1>.xxxxF
+                   *    cmp.ge.f0(8)  g22<1>.zF      g20<4>.xF      g18<4>.xF
+                   *    mov(8)        g21<1>.xF      g22<1>.zzzzF
                    *
                    * The added MOV will most likely be removed later.  In the
                    * worst case, it should be cheaper to schedule.
                    */
-                  temp.swizzle = inst->src[0].swizzle;
+                  temp.swizzle = brw_swizzle_for_mask(inst->dst.writemask);
                   temp.type = scan_inst->src[0].type;
 
                   vec4_instruction *mov = v->MOV(scan_inst->dst, temp);