i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator.
author Francisco Jerez <currojerez@riseup.net>
Thu, 9 Jul 2015 18:42:28 +0000 (21:42 +0300)
committer Francisco Jerez <currojerez@riseup.net>
Thu, 16 Jul 2015 15:29:32 +0000 (18:29 +0300)
commit b00cd6e4a0f9a84d514f428428be348900236e2e
tree 09054d5729657d29ecf52a07ad3497de6416b37b
parent 3ee2daf23dc91b8dfc017b5c89c10ab1376ba4df

This gets rid of two no16() fall-backs and should allow better
scheduling of the generated IR.  There are no uses of usubBorrow() or
uaddCarry() in shader-db, so no changes are expected.  However, the
"arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and
"arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit
tests go from 40 to 28 instructions.  The reason is that the plain ADD
instruction can easily be CSE'ed with the original addition, and the
b2i negation can easily be propagated into the source modifier of
another instruction, so effectively both operations are performed with
just one instruction.

v2: Rely on carry_to_arith() and borrow_to_arith() to lower these
    (Ilia Mirkin).
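
For illustration only, the arithmetic form of the two operations can be
sketched in plain C++ as below.  This is not the driver code: the helper
names are hypothetical, and the 0 / ~0 comparison result with b2i modelled
as a negation is an assumption taken from the "b2i negation" wording above.
The sum in uadd_carry() is the same plain addition a shader would already
compute, which is why it can be shared with it.

```cpp
#include <cstdint>

// Hypothetical helper: models a comparison that yields 0 or ~0
// (assumed boolean convention, not taken from the driver source).
constexpr uint32_t cmp_mask(bool cond)
{
    return cond ? ~0u : 0u;
}

// Carry-out of a + b.  An unsigned sum wraps around exactly when the
// result is smaller than either operand.  Negating the 0 / ~0 mask
// plays the role of b2i: ~0 becomes 1 and 0 stays 0.
constexpr uint32_t uadd_carry(uint32_t a, uint32_t b)
{
    return 0u - cmp_mask(static_cast<uint32_t>(a + b) < a);
}

// Borrow-out of a - b.  A borrow is needed exactly when a < b.
constexpr uint32_t usub_borrow(uint32_t a, uint32_t b)
{
    return 0u - cmp_mask(a < b);
}
```

Neither helper needs any state besides its two operands, which matches
the point of dropping the accumulator dependency.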

Reviewed-by: Matt Turner <mattst88@gmail.com>
src/mesa/drivers/dri/i965/brw_fs_nir.cpp
src/mesa/drivers/dri/i965/brw_shader.cpp
src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp