I discovered this while looking at a shader that was hurt by some other
work I'm doing. When I examined the changes, I was confused that one
instance of a comparison that was used in a discard_if was (incorrectly)
eliminated, while another instance used by a bcsel was (correctly) not
eliminated. I had to use NIR_PRINT=true to see exactly where things
when wrong.
A bunch of shaders in Goat Simulator, Dungeon Defenders, Sanctum 2, and
Strike Suit Zero were impacted.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Fixes: 405de7ccb6c ("nir/range-analysis: Rudimentary value range analysis pass")
All Intel platforms had similar results. (Ice Lake shown)
total instructions in shared programs:
16280659 ->
16281075 (<.01%)
instructions in affected programs: 21042 -> 21458 (1.98%)
helped: 0
HURT: 136
HURT stats (abs) min: 1 max: 9 x̄: 3.06 x̃: 3
HURT stats (rel) min: 1.16% max: 6.12% x̄: 2.23% x̃: 2.03%
95% mean confidence interval for instructions value: 2.93 3.19
95% mean confidence interval for instructions %-change: 2.08% 2.37%
Instructions are HURT.
total cycles in shared programs:
367168270 ->
367170313 (<.01%)
cycles in affected programs: 172020 -> 174063 (1.19%)
helped: 14
HURT: 111
helped stats (abs) min: 2 max: 80 x̄: 21.21 x̃: 9
helped stats (rel) min: 0.10% max: 4.47% x̄: 1.35% x̃: 0.79%
HURT stats (abs) min: 2 max: 584 x̄: 21.08 x̃: 5
HURT stats (rel) min: 0.12% max: 17.28% x̄: 1.55% x̃: 0.40%
95% mean confidence interval for cycles value: 5.41 27.28
95% mean confidence interval for cycles %-change: 0.64% 1.81%
Cycles are HURT.
r = (struct ssa_result_range){analyze_expression(alu, 0, ht).range, false};
break;
- case nir_op_mov:
- r = analyze_expression(alu, 0, ht);
+ case nir_op_mov: {
+ const struct ssa_result_range left = analyze_expression(alu, 0, ht);
+
+ /* See commentary in nir_op_bcsel for the reasons this is necessary. */
+ if (nir_src_is_const(alu->src[0].src) && left.range != eq_zero)
+ return (struct ssa_result_range){unknown, false};
+
+ r = left;
break;
+ }
case nir_op_fneg:
r = analyze_expression(alu, 0, ht);