The (-abs(x) >= 0) => (x == 0) optimization is removed from the vec4 and
scalar parts. In the VS part, adding the new pattern was not
helpful. The pattern that is removed is really old, and it has been
handled by NIR for ages.
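
The two identities involved can be checked on the host with a minimal sketch. It models the comparisons with ordinary IEEE floats, not with the GPU's CMP instruction, and the helper names (old_pattern_holds, new_pattern_holds, verify_samples) are hypothetical and not part of Mesa. The first helper covers the removed pattern, (-abs(x) >= 0) == (x == 0). The second covers the idea behind the added pattern: for a .Z or .NZ comparison against zero, abs and negate source modifiers should not change the result.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helpers for illustration only; none of this is Mesa code.

// Removed pattern: testing -abs(x) >= 0 gives the same answer as x == 0.
bool old_pattern_holds(float x)
{
   return (-std::fabs(x) >= 0.0f) == (x == 0.0f);
}

// Added pattern: for == 0 (.Z) and != 0 (.NZ), applying abs and/or negate
// to the source does not change the outcome of the comparison.
bool new_pattern_holds(float x)
{
   const float variants[] = { std::fabs(x), -x, -std::fabs(x) };
   for (float v : variants) {
      if ((v == 0.0f) != (x == 0.0f))
         return false;
      if ((v != 0.0f) != (x != 0.0f))
         return false;
   }
   return true;
}

// Check both identities on a few interesting values, including signed
// zeros, a denormal, infinities, and NaN.
void verify_samples()
{
   const float samples[] = {
      0.0f, -0.0f, 1.0f, -1.0f,
      std::numeric_limits<float>::denorm_min(),
      std::numeric_limits<float>::infinity(),
      -std::numeric_limits<float>::infinity(),
      std::numeric_limits<float>::quiet_NaN(),
   };
   for (float x : samples) {
      assert(old_pattern_holds(x));
      assert(new_pattern_holds(x));
   }
}
```

Calling verify_samples() from a test harness exercises both identities. Any denormal flushing done by the hardware is outside the scope of this sketch.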
All Gen7+ platforms had similar results. (Broadwell shown)
total instructions in shared programs: 14715715 -> 14715709 (<.01%)
instructions in affected programs: 474 -> 468 (-1.27%)
helped: 6
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.12% max: 1.35% x̄: 1.28% x̃: 1.35%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -1.40% -1.15%
Instructions are helped.
total cycles in shared programs: 559569911 -> 559569809 (<.01%)
cycles in affected programs: 5963 -> 5861 (-1.71%)
helped: 6
HURT: 0
helped stats (abs) min: 16 max: 18 x̄: 17.00 x̃: 17
helped stats (rel) min: 1.45% max: 1.88% x̄: 1.73% x̃: 1.85%
95% mean confidence interval for cycles value: -18.15 -15.85
95% mean confidence interval for cycles %-change: -1.95% -1.51%
Cycles are helped.
Iron Lake and Sandy Bridge had similar results. (Iron Lake shown)
total instructions in shared programs: 7780915 -> 7780913 (<.01%)
instructions in affected programs: 246 -> 244 (-0.81%)
helped: 2
HURT: 0
total cycles in shared programs: 177876108 -> 177876106 (<.01%)
cycles in affected programs: 3636 -> 3634 (-0.06%)
helped: 1
HURT: 0
GM45
total instructions in shared programs: 4799152 -> 4799151 (<.01%)
instructions in affected programs: 126 -> 125 (-0.79%)
helped: 1
HURT: 0
total cycles in shared programs: 122052654 -> 122052652 (<.01%)
cycles in affected programs: 3640 -> 3638 (-0.05%)
helped: 1
HURT: 0
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
foreach_block_and_inst(block, fs_inst, inst, cfg) {
switch (inst->opcode) {
case BRW_OPCODE_MOV:
+ if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
+ inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
+ inst->dst.is_null() &&
+ (inst->src[0].abs || inst->src[0].negate)) {
+ inst->src[0].abs = false;
+ inst->src[0].negate = false;
+ progress = true;
+ break;
+ }
+
if (inst->src[0].file != IMM)
break;
}
break;
case BRW_OPCODE_CMP:
- if (inst->conditional_mod == BRW_CONDITIONAL_GE &&
- inst->src[0].abs &&
- inst->src[0].negate &&
- inst->src[1].is_zero()) {
+ if ((inst->conditional_mod == BRW_CONDITIONAL_Z ||
+ inst->conditional_mod == BRW_CONDITIONAL_NZ) &&
+ inst->src[1].is_zero() &&
+ (inst->src[0].abs || inst->src[0].negate)) {
inst->src[0].abs = false;
inst->src[0].negate = false;
- inst->conditional_mod = BRW_CONDITIONAL_Z;
progress = true;
break;
}
progress = true;
}
break;
- case BRW_OPCODE_CMP:
- if (inst->conditional_mod == BRW_CONDITIONAL_GE &&
- inst->src[0].abs &&
- inst->src[0].negate &&
- inst->src[1].is_zero()) {
- inst->src[0].abs = false;
- inst->src[0].negate = false;
- inst->conditional_mod = BRW_CONDITIONAL_Z;
- progress = true;
- break;
- }
- break;
case SHADER_OPCODE_BROADCAST:
if (is_uniform(inst->src[0]) ||
inst->src[1].is_zero()) {