i965: Unroll SIMD16 DDY_FINE on Sandybridge.
author Kenneth Graunke <kenneth@whitecape.org>
Tue, 29 Mar 2016 08:32:52 +0000 (01:32 -0700)
committer Kenneth Graunke <kenneth@whitecape.org>
Mon, 25 Apr 2016 20:13:00 +0000 (13:13 -0700)
This fixes 10 dEQP-GLES3 subtests:
dEQP-GLES3.functional.shaders.derivate.dfdy.texture.float_nicest.*.

Matt noticed that our Piglit tests for this use even-numbered registers,
while the failing dEQP tests use odd-numbered registers.  We believe
that compressed align16 instructions work with even-numbered registers,
but not with odd-numbered ones.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
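
For readers without the driver source at hand, the platform check that this
change extends can be sketched as a standalone predicate. This is only an
illustration: `DeviceInfo` and `needs_simd8_unroll` are hypothetical names
invented here, and the struct models just the two fields the condition reads,
not the driver's real device-info type.

```cpp
// Hypothetical stand-in for the driver's device-info struct; only the two
// fields read by the check are modeled.
struct DeviceInfo {
    int gen;          // hardware generation (6 is Sandybridge)
    bool is_haswell;  // distinguishes Haswell from other Gen7 parts
};

// Mirrors the condition after this patch: a SIMD16 DDY_FINE is split into a
// pair of SIMD8 instructions on Gen4, Gen6, and Gen7 other than Haswell.
bool needs_simd8_unroll(const DeviceInfo &devinfo, unsigned dispatch_width)
{
    return dispatch_width == 16 &&
           (devinfo.gen == 4 || devinfo.gen == 6 ||
            (devinfo.gen == 7 && !devinfo.is_haswell));
}
```

Under these assumptions, a Gen6 device with a dispatch width of 16 now takes
the unrolled path, while SIMD8 shaders and Haswell are unaffected.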
src/mesa/drivers/dri/i965/brw_fs_generator.cpp

index fb9f65c6a3749f2baecdb7903b60b055765c13bc..812a75eceedf21ca2f86b780e93cb59471be2cde 100644
@@ -1138,12 +1138,16 @@ fs_generator::generate_ddy(enum opcode opcode,
        *
        * Similar text exists in the g45 PRM.
        *
+       * Empirically, compressed align16 instructions using odd register
+       * numbers don't appear to work on Sandybridge either.
+       *
        * On these platforms, if we're building a SIMD16 shader, we need to
        * manually unroll to a pair of SIMD8 instructions.
        */
       bool unroll_to_simd8 =
          (dispatch_width == 16 &&
-          (devinfo->gen == 4 || (devinfo->gen == 7 && !devinfo->is_haswell)));
+          (devinfo->gen == 4 || devinfo->gen == 6 ||
+           (devinfo->gen == 7 && !devinfo->is_haswell)));
 
       /* produce accurate derivatives */
       struct brw_reg src0 = brw_reg(src.file, src.nr, 0,