i965/fs: Let sat-prop ignore live ranges if producer already has sat.
author Matt Turner <mattst88@gmail.com>
Sun, 29 Jun 2014 01:00:27 +0000 (18:00 -0700)
committer Matt Turner <mattst88@gmail.com>
Tue, 1 Jul 2014 05:31:05 +0000 (22:31 -0700)
This sequence (where both x and w are used afterwards) wasn't handled.

   mul.sat x, y, z
   ...
   mov.sat w, x

We assumed that if x was used after the mov.sat, we couldn't
propagate the saturate modifier, but in fact x was already saturated.

So ignore the live range check if the producing instruction already
saturates its result. Cuts one instruction from hundreds of TF2 shaders.
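
The decision described above can be sketched in isolation as follows. This is
a minimal illustration, not the Mesa code: ToyInst, try_propagate_sat and the
two boolean parameters are hypothetical stand-ins for fs_inst, the scan loop
in opt_saturate_propagation_local, and the live-interval and dst/src checks.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an fs_inst; not the real Mesa type.
struct ToyInst {
   bool saturate;          // instruction already carries the .sat modifier
   bool can_do_saturate;   // opcode is able to take the .sat modifier
};

// Returns true if the saturate modifier could be dropped from the mov.
//   src_live_after_mov: the mov's source is read again after the mov
//                       (models src_end_ip > ip).
//   mov_overwrites_src: the mov writes the same register it reads
//                       (models inst->dst.equals(inst->src[0])).
bool try_propagate_sat(ToyInst &producer, ToyInst &mov,
                       bool src_live_after_mov, bool mov_overwrites_src)
{
   if (producer.saturate) {
      // The source is already saturated, so later readers of it see the
      // same value either way and the live range does not matter.
      mov.saturate = false;
      return true;
   }

   // Otherwise, adding .sat to the producer changes the source value, which
   // is only safe if nothing else reads it after the mov.
   if ((!src_live_after_mov || mov_overwrites_src) &&
       producer.can_do_saturate) {
      producer.saturate = true;
      mov.saturate = false;
      return true;
   }

   return false;
}
```

With a producer that already saturates, the function succeeds even when the
source stays live; with a non-saturating producer and a live source, it
leaves both instructions untouched.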

total instructions in shared programs: 1995631 -> 1994951 (-0.03%)
instructions in affected programs:     155248 -> 154568 (-0.44%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
src/mesa/drivers/dri/i965/brw_fs_saturate_propagation.cpp

index 1b3d3b76d13655035dbbf5a6debfe4ee16f42c59..29c8b2ea31832476b58e42c42362e7a58c34500d 100644
@@ -49,8 +49,6 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t *block)
 
       int src_var = v->live_intervals->var_from_reg(&inst->src[0]);
       int src_end_ip = v->live_intervals->end[src_var];
-      if (src_end_ip > ip && !inst->dst.equals(inst->src[0]))
-         continue;
 
       int scan_ip = ip;
       bool interfered = false;
@@ -63,10 +61,15 @@ opt_saturate_propagation_local(fs_visitor *v, bblock_t *block)
              scan_inst->dst.reg == inst->src[0].reg &&
              scan_inst->dst.reg_offset == inst->src[0].reg_offset &&
              !scan_inst->is_partial_write()) {
-            if (scan_inst->can_do_saturate()) {
-               scan_inst->saturate = true;
+            if (scan_inst->saturate) {
                inst->saturate = false;
                progress = true;
+            } else if (src_end_ip <= ip || inst->dst.equals(inst->src[0])) {
+               if (scan_inst->can_do_saturate()) {
+                  scan_inst->saturate = true;
+                  inst->saturate = false;
+                  progress = true;
+               }
             }
             break;
          }