aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
author Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tue, 14 Apr 2020 07:42:48 +0000 (09:42 +0200)
committer Samuel Pitoiset <samuel.pitoiset@gmail.com>
Wed, 15 Apr 2020 08:12:44 +0000 (10:12 +0200)
v_frexp_exp_i16_f16 returns negative exponents as 16-bit two's
complement values. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 for the exponent (0xFFFF, i.e. -1 as a signed
16-bit integer).
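
To illustrate the problem described above, here is a minimal host-side sketch (not ACO code). The helper names `zero_extend_16` and `sign_extend_16` are hypothetical and exist only for this example. It assumes the upper 16 bits of the result register are zero, which matches the 65535 value quoted above.

```cpp
#include <cstdint>

// Hypothetical helper: interpret the low 16 bits as unsigned.
// This models reading the 16-bit result as a 32-bit value without
// sign extension, assuming the upper 16 bits are zero.
inline int32_t zero_extend_16(uint32_t raw)
{
   return static_cast<int32_t>(raw & 0xFFFFu);
}

// Hypothetical helper: interpret the low 16 bits as a signed
// two's complement value and widen it to 32 bits.
inline int32_t sign_extend_16(uint32_t raw)
{
   int32_t v = static_cast<int32_t>(raw & 0xFFFFu);
   // Flip the sign bit, then subtract it again: values with bit 15
   // set end up negative, all others are unchanged.
   return (v ^ 0x8000) - 0x8000;
}
```

With the raw exponent 0xFFFF from the example, `zero_extend_16` yields 65535 while `sign_extend_16` yields -1, which is the exponent frexp is expected to report for 0.333252.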

RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in the long term, but for now RA doesn't support changing non-SDWA
instructions to SDWA when useful/needed.
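
For comparison, the v_bfe_i32 approach mentioned above can be sketched on the host as a signed bit-field extract. This is only an approximation for illustration: `bfe_i32` is a hypothetical helper, and hardware behavior for edge cases (for example when offset plus width exceeds 32) is not modeled here.

```cpp
#include <cstdint>

// Hypothetical model of a signed bit-field extract: take `width`
// bits of `src` starting at bit `offset` and sign-extend the field
// to 32 bits. Offset and width are masked to 5 bits each, which is
// an assumption of this sketch.
inline int32_t bfe_i32(uint32_t src, unsigned offset, unsigned width)
{
   offset &= 31u;
   width &= 31u;
   if (width == 0)
      return 0;

   uint32_t mask = (1u << width) - 1u;
   int64_t field = (src >> offset) & mask;
   int64_t sign = int64_t(1) << (width - 1);

   // Same flip-and-subtract trick as a plain sign extension, done in
   // 64 bits so that no intermediate value overflows.
   return static_cast<int32_t>((field ^ sign) - sign);
}
```

Extracting 16 bits at offset 0 from 0x0000FFFF gives -1, the same result the SDWA sign extension is meant to produce.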

Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>

src/amd/compiler/aco_instruction_selection.cpp

index bc973a05e5a326f113f47335cd10f5b751a0c624..61a2d994d84b4e9b50ea5c9b1aca3455c2b97b7c 100644
@@ -2114,7 +2114,12 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
       Temp src = get_alu_src(ctx, instr->src[0]);
       if (instr->src[0].src.ssa->bit_size == 16) {
          Temp tmp = bld.vop1(aco_opcode::v_frexp_exp_i16_f16, bld.def(v1), src);
-         bld.pseudo(aco_opcode::p_extract_vector, Definition(dst), tmp, Operand(0u));
+         aco_ptr<SDWA_instruction> sdwa{create_instruction<SDWA_instruction>(aco_opcode::v_mov_b32, asSDWA(Format::VOP1), 1, 1)};
+         sdwa->operands[0] = Operand(tmp);
+         sdwa->definitions[0] = Definition(dst);
+         sdwa->sel[0] = sdwa_sbyte;
+         sdwa->dst_sel = sdwa_sdword;
+         ctx->block->instructions.emplace_back(std::move(sdwa));
       } else if (instr->src[0].src.ssa->bit_size == 32) {
          bld.vop1(aco_opcode::v_frexp_exp_i32_f32, Definition(dst), src);
       } else if (instr->src[0].src.ssa->bit_size == 64) {