v2: Rebase on v2 changes in previous two commits.
v3: Rebase on
85c35885b38 ("nir: Rework nir_src_as_alu_instr to not take
a pointer").
shader-db results:
Skylake and Broadwell had similar results. (Skylake shown)
total instructions in shared programs:
15297100 ->
15282141 (-0.10%)
instructions in affected programs: 956685 -> 941726 (-1.56%)
helped: 4527
HURT: 0
helped stats (abs) min: 1 max: 221 x̄: 3.30 x̃: 2
helped stats (rel) min: 0.07% max: 10.53% x̄: 1.85% x̃: 1.37%
95% mean confidence interval for instructions value: -3.48 -3.12
95% mean confidence interval for instructions %-change: -1.88% -1.81%
Instructions are helped.
total cycles in shared programs:
372809551 ->
372597886 (-0.06%)
cycles in affected programs:
13645512 ->
13433847 (-1.55%)
helped: 4362
HURT: 125
helped stats (abs) min: 1 max: 2088 x̄: 50.73 x̃: 28
helped stats (rel) min: 0.01% max: 28.20% x̄: 2.77% x̃: 2.39%
HURT stats (abs) min: 1 max: 1836 x̄: 76.90 x̃: 28
HURT stats (rel) min: <.01% max: 34.36% x̄: 3.03% x̃: 1.42%
95% mean confidence interval for cycles value: -50.98 -43.37
95% mean confidence interval for cycles %-change: -2.67% -2.55%
Cycles are helped.
total spills in shared programs: 23465 -> 23463 (<.01%)
spills in affected programs: 42 -> 40 (-4.76%)
helped: 1
HURT: 0
total fills in shared programs: 31766 -> 31763 (<.01%)
fills in affected programs: 69 -> 66 (-4.35%)
helped: 1
HURT: 0
Haswell
total instructions in shared programs:
13839992 ->
13828311 (-0.08%)
instructions in affected programs: 712503 -> 700822 (-1.64%)
helped: 3477
HURT: 0
helped stats (abs) min: 1 max: 221 x̄: 3.36 x̃: 2
helped stats (rel) min: 0.07% max: 10.64% x̄: 1.96% x̃: 1.52%
95% mean confidence interval for instructions value: -3.58 -3.14
95% mean confidence interval for instructions %-change: -2.01% -1.92%
Instructions are helped.
total cycles in shared programs:
387026330 ->
386872483 (-0.04%)
cycles in affected programs:
11329966 ->
11176119 (-1.36%)
helped: 3307
HURT: 139
helped stats (abs) min: 2 max: 1776 x̄: 49.58 x̃: 18
helped stats (rel) min: 0.01% max: 20.38% x̄: 2.27% x̃: 1.79%
HURT stats (abs) min: 1 max: 2314 x̄: 72.68 x̃: 20
HURT stats (rel) min: <.01% max: 33.99% x̄: 2.28% x̃: 0.96%
95% mean confidence interval for cycles value: -49.31 -39.98
95% mean confidence interval for cycles %-change: -2.15% -2.01%
Cycles are helped.
LOST: 1
GAINED: 0
Ivy Bridge
total instructions in shared programs:
12045602 ->
12038463 (-0.06%)
instructions in affected programs: 623837 -> 616698 (-1.14%)
helped: 2498
HURT: 0
helped stats (abs) min: 1 max: 39 x̄: 2.86 x̃: 2
helped stats (rel) min: 0.05% max: 10.00% x̄: 1.30% x̃: 1.05%
95% mean confidence interval for instructions value: -2.96 -2.75
95% mean confidence interval for instructions %-change: -1.34% -1.26%
Instructions are helped.
total cycles in shared programs:
181025675 ->
180891323 (-0.07%)
cycles in affected programs:
11329329 ->
11194977 (-1.19%)
helped: 2439
HURT: 47
helped stats (abs) min: 1 max: 1565 x̄: 57.06 x̃: 26
helped stats (rel) min: 0.02% max: 24.56% x̄: 2.02% x̃: 1.64%
HURT stats (abs) min: 1 max: 1269 x̄: 102.51 x̃: 43
HURT stats (rel) min: 0.11% max: 52.94% x̄: 4.15% x̃: 1.34%
95% mean confidence interval for cycles value: -59.91 -48.17
95% mean confidence interval for cycles %-change: -1.99% -1.82%
Cycles are helped.
Sandy Bridge, Iron Lake, and GM45 had similar results. (Sandy Bridge shown)
total instructions in shared programs:
10896368 ->
10896339 (<.01%)
instructions in affected programs: 3767 -> 3738 (-0.77%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 1.71 x̃: 1
helped stats (rel) min: 0.13% max: 9.52% x̄: 3.58% x̃: 2.73%
95% mean confidence interval for instructions value: -2.27 -1.14
95% mean confidence interval for instructions %-change: -5.14% -2.03%
Instructions are helped.
total cycles in shared programs:
155091109 ->
155091021 (<.01%)
cycles in affected programs: 47241 -> 47153 (-0.19%)
helped: 15
HURT: 8
helped stats (abs) min: 2 max: 81 x̄: 15.73 x̃: 4
helped stats (rel) min: 0.03% max: 10.59% x̄: 1.55% x̃: 0.71%
HURT stats (abs) min: 14 max: 32 x̄: 18.50 x̃: 17
HURT stats (rel) min: 0.32% max: 2.79% x̄: 2.43% x̃: 2.71%
95% mean confidence interval for cycles value: -14.59 6.93
95% mean confidence interval for cycles %-change: -1.41% 1.08%
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Matt Turner <mattst88@gmail.com> [v2]
#include "compiler/glsl/ir.h"
#include "brw_fs.h"
#include "brw_nir.h"
+#include "nir_search_helpers.h"
#include "util/u_math.h"
#include "util/bitscan.h"
}
}
+/**
+ * Deteremine whether sources of a nir_op_fmul can be fused with a nir_op_fsign
+ *
+ * Checks the operands of a \c nir_op_fmul to determine whether or not
+ * \c emit_fsign could fuse the multiplication with the \c sign() calculation.
+ *
+ * \param instr The multiplication instruction
+ *
+ * \param fsign_src The source of \c instr that may or may not be a
+ * \c nir_op_fsign
+ */
+static bool
+can_fuse_fmul_fsign(nir_alu_instr *instr, unsigned fsign_src)
+{
+ assert(instr->op == nir_op_fmul);
+
+ nir_alu_instr *const fsign_instr =
+ nir_src_as_alu_instr(instr->src[fsign_src].src);
+
+ /* Rules:
+ *
+ * 1. instr->src[fsign_src] must be a nir_op_fsign.
+ * 2. The nir_op_fsign can only be used by this multiplication.
+ * 3. The source that is the nir_op_fsign does not have source modifiers.
+ * \c emit_fsign only examines the source modifiers of the source of the
+ * \c nir_op_fsign.
+ *
+ * The nir_op_fsign must also not have the saturate modifier, but steps
+ * have already been taken (in nir_opt_algebraic) to ensure that.
+ */
+ return fsign_instr != NULL && fsign_instr->op == nir_op_fsign &&
+ is_used_once(fsign_instr) &&
+ !instr->src[fsign_src].abs && !instr->src[fsign_src].negate;
+}
+
void
fs_visitor::nir_emit_alu(const fs_builder &bld, nir_alu_instr *instr)
{
break;
case nir_op_fmul:
+ for (unsigned i = 0; i < 2; i++) {
+ if (can_fuse_fmul_fsign(instr, i)) {
+ emit_fsign(bld, instr, result, op, i);
+ return;
+ }
+ }
+
inst = bld.MUL(result, op[0], op[1]);
inst->saturate = instr->dest.saturate;
break;