vc4: Add quick algebraic optimization for clamping of unpacked values.
author: Eric Anholt <eric@anholt.net>
Fri, 11 Dec 2015 05:54:41 +0000 (21:54 -0800)
committer: Eric Anholt <eric@anholt.net>
Fri, 11 Dec 2015 20:36:16 +0000 (12:36 -0800)
GL likes to saturate your incoming color, but if that color comes from
unpacking unorms, there's no point: the unpacked value is already in
[0.0, 1.0].  Ideally we'd have a range propagation pass that cleans these
up in NIR, but that doesn't look like it will land soon.  It seems like we
could do a one-off optimization in nir_opt_algebraic, except that it doesn't
want to operate on expressions involving unpack_unorm_4x8, since that
opcode is sized.
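
As a rough illustration of why the clamp is redundant, the sketch below
checks every 8-bit input.  It is not driver code: unpack_unorm8(),
model_fmin(), model_fmax() and count_clamp_changes() are hypothetical
helpers, and the byte / 255 decoding is an assumption about how the unorm
unpack behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed unorm8 decoding: byte / 255, giving a float in [0.0, 1.0]. */
static float unpack_unorm8(uint8_t byte)
{
        float x = (float)byte / 255.0f;
        assert(x >= 0.0f && x <= 1.0f);
        return x;
}

/* Simple stand-ins for the float min/max that QOP_FMIN/QOP_FMAX perform. */
static float model_fmin(float a, float b) { return a < b ? a : b; }
static float model_fmax(float a, float b) { return a > b ? a : b; }

/* Counts the cases where saturating an unpacked value changes it.
 * A result of 0 means both clamps are no-ops for every input byte. */
static int count_clamp_changes(void)
{
        int changed = 0;

        for (int i = 0; i < 256; i++) {
                float x = unpack_unorm8((uint8_t)i);

                if (model_fmin(x, 1.0f) != x)   /* fmin(x, 1.0) case */
                        changed++;
                if (model_fmax(x, 0.0f) != x)   /* fmax(x, 0.0) case */
                        changed++;
        }
        return changed;
}
```

Under that assumption count_clamp_changes() returns 0, which is the
property that lets an FMIN against 1.0 or an FMAX against 0.0 of an
8-bit-unpacked source be replaced with a plain MOV.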

total instructions in shared programs: 87879 -> 87761 (-0.13%)
instructions in affected programs:     6044 -> 5926 (-1.95%)
total estimated cycles in shared programs: 349457 -> 349252 (-0.06%)
estimated cycles in affected programs:     6172 -> 5967 (-3.32%)

No SSPD on openarena (which had the biggest gains, in its VS/CSes), n=15.

src/gallium/drivers/vc4/vc4_opt_algebraic.c

index 207686b4af7defc689280fec788e533ddbd5702c..aea2b9dbe876aed6854177146b039bd5132d3772 100644
@@ -182,6 +182,24 @@ qir_opt_algebraic(struct vc4_compile *c)
 
                         break;
 
+                case QOP_FMIN:
+                        if (is_1f(c, inst->src[1]) &&
+                            inst->src[0].pack >= QPU_UNPACK_8D_REP &&
+                            inst->src[0].pack <= QPU_UNPACK_8D) {
+                                replace_with_mov(c, inst, inst->src[0]);
+                                progress = true;
+                        }
+                        break;
+
+                case QOP_FMAX:
+                        if (is_zero(c, inst->src[1]) &&
+                            inst->src[0].pack >= QPU_UNPACK_8D_REP &&
+                            inst->src[0].pack <= QPU_UNPACK_8D) {
+                                replace_with_mov(c, inst, inst->src[0]);
+                                progress = true;
+                        }
+                        break;
+
                 case QOP_FSUB:
                 case QOP_SUB:
                         if (is_zero(c, inst->src[1])) {