i965/fs: lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT
author Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Thu, 25 Aug 2016 14:05:24 +0000 (16:05 +0200)
committer Francisco Jerez <currojerez@riseup.net>
Fri, 14 Apr 2017 21:56:08 +0000 (14:56 -0700)
The hardware applies the same channel enable signals to both halves of
the compressed instruction, which gives wrong results under non-uniform
control flow. Fix this by splitting those instructions to SIMD4.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
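
For readers who want to see the effect of the clamp in isolation, here is a minimal standalone sketch. It mirrors only the condition added by the hunk below and is not the driver code itself: `DeviceInfoSketch` and `clamp_df_width_sketch` are hypothetical names, the struct stands in for the two `gen_device_info` fields the hunk reads, and `dst_type_size` stands in for `type_sz(inst->dst.type)`.

```cpp
#include <algorithm>

// Hypothetical stand-in for the two gen_device_info fields the hunk reads.
struct DeviceInfoSketch {
   int gen;
   bool is_haswell;
};

// Mirrors only the condition added by this hunk: on gen 7 parts other than
// Haswell (IVB/BYT), clamp the width to 4 channels whenever the execution
// type or the destination type is 8 bytes wide. Other widths pass through.
static unsigned
clamp_df_width_sketch(const DeviceInfoSketch &devinfo,
                      unsigned exec_type_size, unsigned dst_type_size,
                      unsigned max_width)
{
   if (devinfo.gen == 7 && !devinfo.is_haswell &&
       (exec_type_size == 8 || dst_type_size == 8))
      max_width = std::min(max_width, 4u);

   return max_width;
}
```

Under these assumptions, a 64-bit operation on IVB/BYT that would otherwise run at width 8 is limited to 4, while the same operation on Haswell or gen 8 keeps its width, and a width already below 4 is left unchanged.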
src/intel/compiler/brw_fs.cpp

index cae15542fa1495f131c85eee01c023ef43841e71..4dcdc1b46de77131f1c61f8de5b229b43adf108d 100644
@@ -4598,6 +4598,15 @@ get_fpu_lowered_simd_width(const struct gen_device_info *devinfo,
        */
       if (channels_per_grf != (exec_type_size == 8 ? 4 : 8))
          max_width = MIN2(max_width, channels_per_grf);
+
+      /* Lower all non-force_writemask_all DF instructions to SIMD4 on IVB/BYT
+       * because HW applies the same channel enable signals to both halves of
+       * the compressed instruction which will be just wrong under
+       * non-uniform control flow.
+       */
+      if (devinfo->gen == 7 && !devinfo->is_haswell &&
+          (exec_type_size == 8 || type_sz(inst->dst.type) == 8))
+         max_width = MIN2(max_width, 4);
    }
 
    /* Only power-of-two execution sizes are representable in the instruction