On Haswell, POW takes 24 cycles, while EXP2 takes only 14. POW also
needs 2.0 loaded into a register, which EXP2 does not.

I believe EXP2 will be faster than POW on basically all GPUs, so it
makes sense to rewrite pow(2, x) as exp2(x).
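
As a rough illustration of the rewrite, here is a minimal standalone
sketch. The Expr type and the helpers constant(), variable(), binary(),
unary(), simplify() and eval() are hypothetical names made up for this
example; they are not Mesa's ir_expression or ir_constant API, and the
sketch only handles scalars.

```cpp
#include <cassert>
#include <cmath>
#include <memory>

// Hypothetical toy IR for illustration only (not Mesa's GLSL IR classes).
enum class Op { Constant, Variable, Pow, Exp2 };

struct Expr {
    Op op = Op::Constant;
    double value = 0.0;              // meaningful only for Op::Constant
    std::shared_ptr<Expr> lhs, rhs;  // Exp2 uses lhs only
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr constant(double v) {
    auto e = std::make_shared<Expr>();
    e->op = Op::Constant;
    e->value = v;
    return e;
}

ExprPtr variable() {
    auto e = std::make_shared<Expr>();
    e->op = Op::Variable;
    return e;
}

ExprPtr binary(Op op, ExprPtr a, ExprPtr b) {
    auto e = std::make_shared<Expr>();
    e->op = op;
    e->lhs = a;
    e->rhs = b;
    return e;
}

ExprPtr unary(Op op, ExprPtr a) { return binary(op, a, nullptr); }

// True only for a literal constant equal to 2.0.
bool is_constant_two(const ExprPtr &e) {
    return e && e->op == Op::Constant && e->value == 2.0;
}

// Rewrite pow(2, x) into exp2(x); return any other expression unchanged.
ExprPtr simplify(const ExprPtr &e) {
    if (e->op == Op::Pow && is_constant_two(e->lhs))
        return unary(Op::Exp2, e->rhs);
    return e;
}

// Evaluate the tree with the single variable bound to x.
double eval(const ExprPtr &e, double x) {
    switch (e->op) {
    case Op::Constant: return e->value;
    case Op::Variable: return x;
    case Op::Pow:      return std::pow(eval(e->lhs, x), eval(e->rhs, x));
    case Op::Exp2:     return std::exp2(eval(e->lhs, x));
    }
    assert(false && "unknown op");
    return 0.0;
}
```

Calling simplify() on binary(Op::Pow, constant(2.0), variable()) should
yield an Op::Exp2 node that evaluates to the same value, while a pow
with any other base is returned untouched.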

Looking at the savage2 subset of shader-db:

total instructions in shared programs: 113225 -> 113179 (-0.04%)
instructions in affected programs: 2139 -> 2093 (-2.15%)
instances of 'math pow': 795 -> 749 (-5.79%)
instances of 'math exp': 389 -> 435 (+11.83%)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
    return (ir == NULL) ? false : ir->is_one();
 }
+static inline bool
+is_vec_two(ir_constant *ir)
+{
+   return (ir == NULL) ? false : ir->is_value(2.0, 2);
+}
+
 static inline bool
 is_vec_negative_one(ir_constant *ir)
 {
       /* 1^x == 1 */
       if (is_vec_one(op_const[0]))
          return op_const[0];
+
+      /* pow(2,x) == exp2(x) */
+      if (is_vec_two(op_const[0]))
+         return expr(ir_unop_exp2, ir->operands[1]);
+
       break;
    case ir_unop_rcp: