i965/vec4: set correct register regions for 32-bit and 64-bit
authorIago Toral Quiroga <itoral@igalia.com>
Wed, 18 Nov 2015 13:00:58 +0000 (14:00 +0100)
committerSamuel Iglesias Gonsálvez <siglesias@igalia.com>
Tue, 3 Jan 2017 10:26:50 +0000 (11:26 +0100)
For 32-bit instructions we want to use <4,4,1> regions for VGRF
sources so we should really set a width of 4 (we were setting 8).

For 64-bit instructions we want to use a width of 2 because the
hardware uses 32-bit swizzles, meaning that we can only address 2
consecutive 64-bit components in a row. Also, Curro suggested that
the hardware is probably fixing the width to 2 for 64-bit instructions
anyway, so just go with that and use <2,2,1>.

v2:
 - No need to explicitly set the vertical stride of 64-bit regions to 2,
   brw_vecn_grf with a width of 2 will do that for us.
 - No need to adjust the width of dst registers.

v3 (Ian):
 - Make type_size and width const.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
src/mesa/drivers/dri/i965/brw_vec4.cpp

index 3f3fd6bbcf373832d840550eda6a00332d22f188..4286a6c78aae5ee41ebf60e97c9a5bac14909400 100644 (file)
@@ -1873,20 +1873,24 @@ vec4_visitor::convert_to_hw_regs()
          struct src_reg &src = inst->src[i];
          struct brw_reg reg;
          switch (src.file) {
-         case VGRF:
-            reg = byte_offset(brw_vec8_grf(src.nr, 0), src.offset);
+         case VGRF: {
+            const unsigned type_size = type_sz(src.type);
+            const unsigned width = REG_SIZE / 2 / MAX2(4, type_size);
+            reg = byte_offset(brw_vecn_grf(width, src.nr, 0), src.offset);
             reg.type = src.type;
             reg.swizzle = src.swizzle;
             reg.abs = src.abs;
             reg.negate = src.negate;
             break;
+         }
 
-         case UNIFORM:
+         case UNIFORM: {
+            const unsigned width = REG_SIZE / 2 / MAX2(4, type_sz(src.type));
             reg = stride(byte_offset(brw_vec4_grf(
                                         prog_data->base.dispatch_grf_start_reg +
                                         src.nr / 2, src.nr % 2 * 4),
                                      src.offset),
-                         0, 4, 1);
+                         0, width, 1);
             reg.type = src.type;
             reg.swizzle = src.swizzle;
             reg.abs = src.abs;
@@ -1895,6 +1899,7 @@ vec4_visitor::convert_to_hw_regs()
             /* This should have been moved to pull constants. */
             assert(!src.reladdr);
             break;
+         }
 
          case ARF:
          case FIXED_GRF: