aco: always optimize v_mad to v_madak in presence of literals
authorSamuel Pitoiset <samuel.pitoiset@gmail.com>
Wed, 1 Apr 2020 16:09:43 +0000 (18:09 +0200)
committerMarge Bot <eric+marge@anholt.net>
Fri, 3 Apr 2020 07:30:49 +0000 (07:30 +0000)
v_mad and v_madak are both 64-bit instructions, so it doesn't
increase code size to always apply a 32-bit literal instead of
using v_mad and a sgpr which contains that literal.

Found with some Youngblood shaders but help some other games.

vkpipeline-db (VEGA10):
Totals from affected shaders:
SGPRS: 46168 -> 46016 (-0.33 %)
VGPRS: 45576 -> 45564 (-0.03 %)
Code Size: 5187208 -> 5179584 (-0.15 %) bytes
Max Waves: 3297 -> 3297 (0.00 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4410>

src/amd/compiler/aco_optimizer.cpp

index a18060f485b3e95e1073e29f7485782241c8ca27..a6b68dd5448903a1e6ddc4662d022495d9d9088e 100644 (file)
@@ -2633,7 +2633,15 @@ void select_instruction(opt_ctx &ctx, aco_ptr<Instruction>& instr)
                literal_idx = i;
             }
          }
-         if (literal_uses < threshold) {
+
+         /* Limit the number of literals to apply to not increase the code
+          * size too much, but always apply literals for v_mad->v_madak
+          * because both instructions are 64-bit and this doesn't increase
+          * code size.
+          * TODO: try to apply the literals earlier to lower the number of
+          * uses below threshold
+          */
+         if (literal_uses < threshold || literal_idx == 2) {
             ctx.uses[instr->operands[literal_idx].tempId()]--;
             mad_info->check_literal = true;
             mad_info->literal_idx = literal_idx;
@@ -2760,7 +2768,8 @@ void apply_literals(opt_ctx &ctx, aco_ptr<Instruction>& instr)
    /* apply literals on MAD */
    if (instr->opcode == aco_opcode::v_mad_f32 && ctx.info[instr->definitions[0].tempId()].is_mad()) {
       mad_info* info = &ctx.mad_infos[ctx.info[instr->definitions[0].tempId()].val];
-      if (info->check_literal && ctx.uses[instr->operands[info->literal_idx].tempId()] == 0) {
+      if (info->check_literal &&
+          (ctx.uses[instr->operands[info->literal_idx].tempId()] == 0 || info->literal_idx == 2)) {
          aco_ptr<Instruction> new_mad;
          if (info->literal_idx == 2) { /* add literal -> madak */
             new_mad.reset(create_instruction<VOP2_instruction>(aco_opcode::v_madak_f32, Format::VOP2, 3, 1));