vc4: Fix inverted priority of instructions for QPU scheduling.
author Eric Anholt <eric@anholt.net>
Wed, 3 Dec 2014 00:31:29 +0000 (16:31 -0800)
committer Eric Anholt <eric@anholt.net>
Fri, 5 Dec 2014 18:43:14 +0000 (10:43 -0800)
The priority scores were inverted: we were scheduling TLB operations as
early as possible and texture setup as late as possible.  When I introduced
prioritization, I saw an independent operation move above texture results
collection and took that as a sign it was working, but it only moved
because texture setup was being pushed late.

total instructions in shared programs: 57651 -> 57486 (-0.29%)
instructions in affected programs:     18532 -> 18367 (-0.89%)
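
The corrected ordering can be sketched in isolation. This is a minimal
illustration, not the driver code: the enum, sketch_priority() and
sketch_pick_next() are hypothetical names, and the assumption that a higher
score is scheduled earlier is inferred from the description above (texture
setup previously scored lowest and was being pushed late).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instruction classes, standing in for the checks the
 * driver performs on the encoded 64-bit QPU instruction.
 */
enum inst_kind {
        KIND_TLB,        /* TLB operation */
        KIND_TMU_RESULT, /* texture read results collection */
        KIND_OTHER,      /* anything that isn't otherwise special */
        KIND_TMU_SETUP,  /* texture read setup */
};

/* Mirrors the corrected order of the checks.  Assumption: a higher
 * score means the instruction is scheduled earlier.
 */
static uint32_t
sketch_priority(enum inst_kind kind)
{
        switch (kind) {
        case KIND_TLB:        return 0; /* as late as possible */
        case KIND_TMU_RESULT: return 1; /* late, to hide latency */
        case KIND_OTHER:      return 2; /* baseline */
        case KIND_TMU_SETUP:  return 3; /* early, to hide latency */
        }
        return 2; /* unknown kinds fall back to the baseline */
}

/* Returns the index of the ready instruction with the highest score.
 * Ties go to the earliest entry in the list.
 */
static size_t
sketch_pick_next(const enum inst_kind *ready, size_t count)
{
        size_t best = 0;

        assert(count > 0);
        for (size_t i = 1; i < count; i++) {
                if (sketch_priority(ready[i]) > sketch_priority(ready[best]))
                        best = i;
        }
        return best;
}
```

Given a ready list of { TLB, other, texture setup, texture result },
sketch_pick_next() selects the texture setup entry first, and the TLB
operation is selected only once nothing else is ready.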

src/gallium/drivers/vc4/vc4_qpu_schedule.c

index 8aa83741ff53e63085ff1b767e69992237136035..2b0a6326b8cfa3e3c821c82f286be2223b48ba5c 100644
@@ -439,24 +439,24 @@ get_instruction_priority(uint64_t inst)
         uint32_t baseline_score;
         uint32_t next_score = 0;
 
-        /* Schedule texture read setup early to hide their latency better. */
-        if (is_tmu_write(waddr_add) || is_tmu_write(waddr_mul))
+        /* Schedule TLB operations as late as possible, to get more
+         * parallelism between shaders.
+         */
+        if (qpu_inst_is_tlb(inst))
                 return next_score;
         next_score++;
 
-        /* Default score for things that aren't otherwise special. */
-        baseline_score = next_score;
-        next_score++;
-
         /* Schedule texture read results collection late to hide latency. */
         if (sig == QPU_SIG_LOAD_TMU0 || sig == QPU_SIG_LOAD_TMU1)
                 return next_score;
         next_score++;
 
-        /* Schedule TLB operations as late as possible, to get more
-         * parallelism between shaders.
-         */
-        if (qpu_inst_is_tlb(inst))
+        /* Default score for things that aren't otherwise special. */
+        baseline_score = next_score;
+        next_score++;
+
+        /* Schedule texture read setup early to hide their latency better. */
+        if (is_tmu_write(waddr_add) || is_tmu_write(waddr_mul))
                 return next_score;
         next_score++;