vc4: Fix register pressure cost estimates when a src appears twice.
author Eric Anholt <eric@anholt.net>
Sat, 4 Mar 2017 01:03:44 +0000 (17:03 -0800)
committer Eric Anholt <eric@anholt.net>
Wed, 8 Mar 2017 21:44:17 +0000 (13:44 -0800)
Counting a repeated src twice confused the scheduler for things like fabs
(implemented as fmaxabs x, x) or squaring a number: it would try to avoid
scheduling them because they appeared more expensive than other
instructions.

Fixes failure to register allocate in
dEQP-GLES2.functional.uniform_api.random.3 with almost no shader-db
effects (+0.35% max temps).
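
To illustrate the double-counting problem outside the driver, here is a
minimal standalone sketch of the deduplicated count. The names `struct reg`,
`enum reg_file`, `MAX_TEMPS` and `count_new_live_srcs()` are hypothetical
stand-ins invented for this example, and a plain bool array replaces the
driver's `temp_live` bitset; it is not the vc4 code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's register types. */
enum reg_file { FILE_NULL, FILE_TEMP, FILE_UNIF };

struct reg {
        enum reg_file file;
        int index;
};

#define MAX_TEMPS 16

/*
 * Returns how many distinct temps among the sources are not yet marked
 * live.  A temp that appears in several source slots is counted once.
 */
static int
count_new_live_srcs(const struct reg *src, int nsrc, const bool *temp_live)
{
        int cost = 0;

        for (int i = 0; i < nsrc; i++) {
                /* Only temps can add register pressure in this model. */
                if (src[i].file != FILE_TEMP)
                        continue;

                assert(src[i].index >= 0 && src[i].index < MAX_TEMPS);

                /* A temp that is already live costs nothing extra. */
                if (temp_live[src[i].index])
                        continue;

                /* Skip this source if an earlier slot named the same temp. */
                bool already_counted = false;
                for (int j = 0; j < i; j++) {
                        if (src[j].file == src[i].file &&
                            src[j].index == src[i].index) {
                                already_counted = true;
                                break;
                        }
                }
                if (!already_counted)
                        cost++;
        }

        return cost;
}
```

With nothing live and both sources naming the same temp (the "fmaxabs x, x"
case), this returns 1, where a per-slot count would have returned 2. Two
different non-live temps still return 2.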

src/gallium/drivers/vc4/vc4_qir_schedule.c

index 89e6d1d0d60d59f96a1df3e491b87db28cbcc93c..5118caf317c02f0d9a2ecb51d75712b4aa14dcb8 100644
@@ -434,10 +434,20 @@ get_register_pressure_cost(struct schedule_state *state, struct qinst *inst)
                 cost--;
 
         for (int i = 0; i < qir_get_nsrc(inst); i++) {
-                if (inst->src[i].file == QFILE_TEMP &&
-                    !BITSET_TEST(state->temp_live, inst->src[i].index)) {
-                        cost++;
+                if (inst->src[i].file != QFILE_TEMP ||
+                    BITSET_TEST(state->temp_live, inst->src[i].index)) {
+                        continue;
+                }
+
+                bool already_counted = false;
+                for (int j = 0; j < i; j++) {
+                        if (inst->src[i].file == inst->src[j].file &&
+                            inst->src[i].index == inst->src[j].index) {
+                                already_counted = true;
+                        }
                 }
+                if (!already_counted)
+                        cost++;
         }
 
         return cost;