intel/fs: fix SHADER_OPCODE_CLUSTER_BROADCAST for SIMD32
author Paulo Zanoni <paulo.r.zanoni@intel.com>
Wed, 4 Sep 2019 22:07:20 +0000 (15:07 -0700)
committer Jason Ekstrand <jason@jlekstrand.net>
Thu, 19 Sep 2019 02:48:27 +0000 (02:48 +0000)
The current code can create MOVs with a width of 32, which is not
supported by our hardware. Add some code to simplify how we express
what we want and prevent such cases.

For some unknown reason, all the tests I could run seem to work even
with these unsupported MOVs.
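
To illustrate the region selection described above, here is a minimal standalone sketch. It is not the generator code itself: `Region`, `cluster_broadcast_region` and `source_element` are hypothetical names made up for this example, and the addressing formula assumes the usual <vstride;width,hstride> interpretation, with offsets relative to the selected component.

```cpp
#include <cassert>

// Hypothetical stand-in for the <vstride;width,hstride> part of a register
// region; the real code builds a struct brw_reg via stride().
struct Region {
   unsigned vstride;
   unsigned width;
   unsigned hstride;
};

// Mirrors the selection logic added by the patch: when a single cluster
// spans the whole instruction, use the scalar region <0;1,0> rather than
// <N;N,0>, so the width never reaches 32.
static Region cluster_broadcast_region(unsigned exec_size,
                                       unsigned cluster_size)
{
   assert(cluster_size != 0);

   Region r = { cluster_size, cluster_size, 0 };
   if (exec_size == r.width) {
      r.vstride = 0;
      r.width = 1;
   }
   return r;
}

// Element offset read by a given channel, assuming rows of `width` channels
// that advance by `vstride` per row and `hstride` within a row.
static unsigned source_element(const Region &r, unsigned channel)
{
   return (channel / r.width) * r.vstride + (channel % r.width) * r.hstride;
}
```

Under these assumptions, `cluster_broadcast_region(32, 32)` yields <0;1,0> and every channel reads offset 0, which is what <32;32,0> was meant to express. `cluster_broadcast_region(32, 8)` keeps <8;8,0>, so channel 19 reads offset 16, the start of its own cluster.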

Fixes: b0858c1cc6 "intel/fs: Add a couple of simple helper opcodes"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
src/intel/compiler/brw_fs_generator.cpp

index 65b9217ee7706e2147d11a05204ffaf4bfb1a11d..1fb50e0da7307954f817aa50f681d86532a4b621 100644
@@ -2141,8 +2141,17 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width,
          assert(src[2].type == BRW_REGISTER_TYPE_UD);
          const unsigned component = src[1].ud;
          const unsigned cluster_size = src[2].ud;
+         unsigned vstride = cluster_size;
+         unsigned width = cluster_size;
+
+         /* The maximum exec_size is 32, but the maximum width is only 16. */
+         if (inst->exec_size == width) {
+            vstride = 0;
+            width = 1;
+         }
+
          struct brw_reg strided = stride(suboffset(src[0], component),
-                                         cluster_size, cluster_size, 0);
+                                         vstride, width, 0);
          if (type_sz(src[0].type) > 4 &&
              (devinfo->is_cherryview || gen_device_info_is_9lp(devinfo))) {
             /* IVB has an issue (which we found empirically) where it reads