With this reassociation, this lowering path is still beneficial.
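As a sanity check on the algebra, the following is a minimal standalone C sketch, not code from this patch. It assumes the usual definition flrp(a, b, c) = a*(1 - c) + b*c and that the minus sign is chosen when a = +1. The names flrp_ref, flrp_old, flrp_new and check_lowerings are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <math.h>

/* Assumed reference definition: flrp(a, b, c) = a * (1 - c) + b * c. */
static float flrp_ref(float a, float b, float c)
{
   return a * (1.0f - c) + b * c;
}

/* Previous association: (b*c -/+ c) + a.  Only meaningful for a = +/-1. */
static float flrp_old(float a, float b, float c)
{
   const float b_times_c = b * c;
   const float inner_sum = (a == 1.0f) ? b_times_c - c : b_times_c + c;
   return inner_sum + a;
}

/* Reassociated form: b*c + (a -/+ c).  Only meaningful for a = +/-1. */
static float flrp_new(float a, float b, float c)
{
   const float b_times_c = b * c;
   const float inner_sum = (a == 1.0f) ? a - c : a + c;
   return inner_sum + b_times_c;
}

/* Compare both associations against the reference for a = +1 and a = -1. */
static void check_lowerings(void)
{
   const float as[] = { 1.0f, -1.0f };
   const float bs[] = { -2.0f, 0.25f, 3.0f };
   const float cs[] = { 0.0f, 0.3f, 1.0f };

   for (int i = 0; i < 2; i++) {
      for (int j = 0; j < 3; j++) {
         for (int k = 0; k < 3; k++) {
            const float ref = flrp_ref(as[i], bs[j], cs[k]);
            /* Tolerance absorbs rounding differences between associations. */
            assert(fabsf(flrp_old(as[i], bs[j], cs[k]) - ref) < 1e-5f);
            assert(fabsf(flrp_new(as[i], bs[j], cs[k]) - ref) < 1e-5f);
         }
      }
   }
}
```

Calling check_lowerings() from a test harness exercises both signs of a. The two associations can round differently, which is why the comparison uses a tolerance instead of exact equality.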
Ice Lake
total instructions in shared programs: 17220191 -> 17207181 (-0.08%)
instructions in affected programs: 999871 -> 986861 (-1.30%)
helped: 3703
HURT: 17
helped stats (abs) min: 1 max: 686 x̄: 3.52 x̃: 3
helped stats (rel) min: 0.09% max: 51.97% x̄: 2.21% x̃: 1.35%
HURT stats (abs) min: 1 max: 9 x̄: 1.47 x̃: 1
HURT stats (rel) min: 0.08% max: 4.55% x̄: 0.78% x̃: 0.55%
95% mean confidence interval for instructions value: -4.01 -2.99
95% mean confidence interval for instructions %-change: -2.29% -2.11%
Instructions are helped.
total cycles in shared programs: 360871298 -> 360755040 (-0.03%)
cycles in affected programs: 9931334 -> 9815076 (-1.17%)
helped: 2388
HURT: 1569
helped stats (abs) min: 1 max: 10228 x̄: 93.54 x̃: 18
helped stats (rel) min: <.01% max: 74.11% x̄: 3.36% x̃: 1.07%
HURT stats (abs) min: 1 max: 1917 x̄: 68.27 x̃: 22
HURT stats (rel) min: <.01% max: 44.90% x̄: 3.44% x̃: 1.72%
95% mean confidence interval for cycles value: -39.48 -19.28
95% mean confidence interval for cycles %-change: -0.86% -0.46%
Cycles are helped.
total spills in shared programs: 12355 -> 12159 (-1.59%)
spills in affected programs: 295 -> 99 (-66.44%)
helped: 2
HURT: 1
total fills in shared programs: 25398 -> 25207 (-0.75%)
fills in affected programs: 288 -> 97 (-66.32%)
helped: 2
HURT: 1
LOST: 3
GAINED: 44
Iron Lake
total instructions in shared programs: 8169225 -> 8159729 (-0.12%)
instructions in affected programs: 1025712 -> 1016216 (-0.93%)
helped: 3352
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 2.83 x̃: 3
helped stats (rel) min: 0.15% max: 12.00% x̄: 1.51% x̃: 1.05%
95% mean confidence interval for instructions value: -2.86 -2.80
95% mean confidence interval for instructions %-change: -1.56% -1.46%
Instructions are helped.
total cycles in shared programs: 188656796 -> 188612280 (-0.02%)
cycles in affected programs: 18633584 -> 18589068 (-0.24%)
helped: 3085
HURT: 14
helped stats (abs) min: 2 max: 72 x̄: 14.45 x̃: 12
helped stats (rel) min: 0.02% max: 5.73% x̄: 0.73% x̃: 0.31%
HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4
HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -14.55 -14.18
95% mean confidence interval for cycles %-change: -0.76% -0.69%
Cycles are helped.
GM45
total instructions in shared programs: 5026905 -> 5021856 (-0.10%)
instructions in affected programs: 584169 -> 579120 (-0.86%)
helped: 1776
HURT: 0
helped stats (abs) min: 1 max: 6 x̄: 2.84 x̃: 3
helped stats (rel) min: 0.15% max: 11.11% x̄: 1.43% x̃: 0.98%
95% mean confidence interval for instructions value: -2.88 -2.80
95% mean confidence interval for instructions %-change: -1.50% -1.37%
Instructions are helped.
total cycles in shared programs: 129047376 -> 129018918 (-0.02%)
cycles in affected programs: 12941924 -> 12913466 (-0.22%)
helped: 1722
HURT: 14
helped stats (abs) min: 4 max: 72 x̄: 16.56 x̃: 18
helped stats (rel) min: 0.02% max: 5.73% x̄: 0.72% x̃: 0.30%
HURT stats (abs) min: 2 max: 4 x̄: 3.71 x̃: 4
HURT stats (rel) min: <.01% max: <.01% x̄: <.01% x̃: <.01%
95% mean confidence interval for cycles value: -16.65 -16.13
95% mean confidence interval for cycles %-change: -0.76% -0.66%
Cycles are helped.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
}
/**
- * Replace flrp(a, b, c) with (b*c ± c) + a
+ * Replace flrp(a, b, c) with (b*c ± c) + a => b*c + (a ± c)
*
* \note: This only works if a = ±1.
*/
nir_ssa_def *const neg_c = nir_fneg(bld, c);
nir_instr_as_alu(neg_c->parent_instr)->exact = alu->exact;
- inner_sum = nir_fadd(bld, b_times_c, neg_c);
+ inner_sum = nir_fadd(bld, a, neg_c);
} else {
- inner_sum = nir_fadd(bld, b_times_c, c);
+ inner_sum = nir_fadd(bld, a, c);
}
nir_instr_as_alu(inner_sum->parent_instr)->exact = alu->exact;
- nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, a);
+ nir_ssa_def *const outer_sum = nir_fadd(bld, inner_sum, b_times_c);
nir_instr_as_alu(outer_sum->parent_instr)->exact = alu->exact;
nir_ssa_def_rewrite_uses(&alu->dest.dest.ssa, nir_src_for_ssa(outer_sum));