aco: sign-extend input/identity for 16-bit subgroup ops on GFX6-GFX7
author Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tue, 26 May 2020 14:21:44 +0000 (16:21 +0200)
committer Samuel Pitoiset <samuel.pitoiset@gmail.com>
Wed, 3 Jun 2020 17:48:43 +0000 (19:48 +0200)
16-bit subgroup ops are implemented with 32-bit instructions
on GFX6-GFX7, so the 16-bit input and identity have to be
sign-extended to 32 bits first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5227>

src/amd/compiler/aco_lower_to_hw_instr.cpp

index df7b571c529971df1486c14b086ece3f928e4845..40d466904ef3a72488b5d4874797f9a538597873 100644
@@ -590,6 +590,9 @@ void emit_reduction(lower_context *ctx, aco_opcode op, ReduceOp reduce_op, unsig
             sdwa->sel[0] = sdwa_uword;
          sdwa->dst_sel = sdwa_udword;
          bld.insert(std::move(sdwa));
+      } else if (ctx->program->chip_class == GFX6 || ctx->program->chip_class == GFX7) {
+         bld.vop3(aco_opcode::v_bfe_i32, Definition(PhysReg{tmp}, v1),
+                  Operand(PhysReg{tmp}, v1), Operand(0u), Operand(16u));
       }
    }
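
Illustration (not part of the patch): a minimal host-side sketch of why the
sign-extension matters when a 16-bit signed reduction is carried out with
32-bit instructions. The helper names bfe_i32 and smax16_via_32 are
hypothetical and exist only for this example. bfe_i32 is a simplified model
of a signed bitfield extract with the same operands the patch passes
(offset 0, width 16); it is an assumption about the general behaviour, not a
statement of the exact hardware semantics for every offset/width pair.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of a signed bitfield extract: take `width` bits of
// `src` starting at bit `offset` and sign-extend the field to 32 bits.
// Simplification: offset and width are masked to 5 bits, and a width of 0
// yields 0.
static int32_t bfe_i32(uint32_t src, uint32_t offset, uint32_t width)
{
   offset &= 31u;
   width &= 31u;
   if (width == 0)
      return 0;
   uint32_t field = (src >> offset) & ((1u << width) - 1u);
   uint32_t sign = 1u << (width - 1u);
   // XOR/subtract trick: propagates the field's top bit into the upper bits.
   return static_cast<int32_t>((field ^ sign) - sign);
}

// Hypothetical 16-bit signed max computed with a 32-bit signed compare,
// with or without sign-extending the inputs first.
static uint16_t smax16_via_32(uint32_t a, uint32_t b, bool sign_extend)
{
   int32_t x = sign_extend ? bfe_i32(a, 0, 16) : static_cast<int32_t>(a);
   int32_t y = sign_extend ? bfe_i32(b, 0, 16) : static_cast<int32_t>(b);
   int32_t m = std::max(x, y);
   return static_cast<uint16_t>(static_cast<uint32_t>(m) & 0xffffu);
}
```

With a = 0x0000ffff (the 16-bit value -1, zero-extended) and b = 0x00000001,
the call without sign-extension compares 65535 against 1 and returns 0xffff,
i.e. -1 as a 16-bit value. With sign-extension it compares -1 against 1 and
returns 1, which is the expected 16-bit signed maximum.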